ARM7 - runs generic
Fifo commands:
Sat Jun 10 10:32:33 CEST 2017
On Thursday I managed to run DS-10 in DeSmuME without tearing in sound. The
problem was that some instruction (or maybe memory type) was assigned the wrong
number of cycles needed to access it. The result was that the
processor wasn't fast enough to compute one sound "frame" (64*2 samples) within
the assigned timeframe. When I set the timing for every instruction to 1, it works
flawlessly (I need to fix it in the JIT as well).
I was generating some callgraphs to understand where each synth/drum is
computed: ds10 call graph.
And you can kinda see it from there: one frame is 64 samples, so if node
has 256 ins and 252 outs one way and 4 another, it's probably a 64 sample for
loop done 4 times, etc.
One other nice thing: I trapped various instructions to send register contents
to the soundcard - it's like probing a circuit with an oscilloscope. It would be
nice if it could be done at run-time.
Plan for today:
- add support for subroutines to callgraph generator (or: buy IDA)
- print full register contents next to each instruction
- src/cli/main.cpp: DIS_AUDIO
- src/armcpu.cpp: x_armlog function
- make gdb working with desmume
OK, so here's another graph I made when fixing bug with "overmerging" of
nodes: graph.
70474bb824b56912cc419221aa8ae015 ds10plus.nds
Sat Jun 10 12:52:40 CEST 2017
Well, I probably shouldn't spend so much time on making tools to answer
questions I don't have. But it makes nice graphs.
I made another one and I swear this is the last one:
now with more colors!
Also script for analysis and the trace.
Misc notes:
02075648 parameter set procedure
argument R0=synth pointer
argument r1=knob id
argument r2=knob value
load filter pointer
0207E950 pitchtable
0207ED38 drumpitch table
02071fac set note (args: r0: pitch control, r1: note, r2: 0)
020722B8 drum handler method
02073150 mixer head?
02074720 synth handler method
02074F98 Head of synthesis loop
02075410 Store output of one synth
0207546C event dispatch function
- r0 = synth pointer
- load event count byte from r0+#8
- load events from r0+#80 (each event 8 bytes long)
if (event == 1) {
env = synth->envelope;
STRB #3, [env, #4]
STRB #1, [env, #5]
}
if (event == 2) {
note = keyjazz[4];
/* fill frequency table */
synth->pitch->note = note << 15;
STRB #0, [env,4]
STRB #0, [env,5]
}
02075650 knob dispatch table
020758ec drive knob handler
02075ad4 recompute LFO from BPM (guess)
020807c4 filter coefficient table (151 entries)
02087CEC synth method table (1st synth)
02087c7c mixer method table?
01FF846C load next "machine" dispatcher and call it
01FF8444 load next synth ptr data
01FF86A0 synth iteration table
01FF87E0 synths table [list of all synth pointer], null terminated
01FF8820 buf1 (syn1)
01FF88A0 buf2 (syn2)
01FF8920 buf3 (drum1)
01FF89A0 buf4
01FF8A20 buf5
01FF8AA0 buf6
021b04a0 precomputed sample 1
0223D280 stored synth preset data
0223d2e0 another preset data? (maybe?)
02248680 (output) soundbuf
022489A0 Synth 1 Data (struct SYN)
02248A20 keyjazz area (SYN1)
02248c40 env/SYN1
02248CC0 LFO/SYN1
02248d60 pitchctrl/SYN1
02248d20 notectrl/SYN1
02248de0 OSC/SYN1
02248ea0 FILTER/SYN1
02248f40 AMP/SYN1
02248fa0 ???/SYN1
02248FE0 Synth 2 Data
02249620 drum1
022498C0 drum1 volblock
02249900 drum2
02249BE0 drum3
02249EC0 drum4
0224A1A0 MIXER
0224A480 effects
0226A580 delay buffer
knobs: name id min max
octave 0 -2 2
egint 1 -63 63
porta 2 0 127
pitch.in 3 -63 63
pitch1.in 4 -63 63
pitch2.in 5 -63 63
vco1 6 0 3
vco2 7 0 3
vco2pitch 8 -63 63
balance 9 0 127
vcosync 10 0 1
cutoff 11 0 127
peak 12 0 127
vcftype 13 0 2 [0=lpf]
egint 14 -63 63
cutoff.in 15 -63 63
vcaeg 16 0 1
vca.in 17 -63 63
drive 18 0 127
level 19 0 127
attack 20 0 127
decay 21 0 127
sustain 22 0 127
release 23 0 127
mg.freq 24 0 127
mg.bpm 25 0 1 [0=off]
inputs
pitch.in.hole 26
pitch1.in.hole 27
pitch2.in.hole 28
cutoff.in.hole 29
vca.in.hole 30
outputs
none 0
sine 1
saw 2
square 3
s&h 4
eg 5
vco2 6
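The knob table above can be carried around as a C array. This is only a sketch: the names (`struct knob`, `KNOBS`, `knob_clamp`) are made up for illustration, the ranges are copied from the table, and clamping out-of-range values is an assumption about what a caller would want, not something observed in DS-10.

```c
#include <assert.h>
#include <stddef.h>

/* One row per knob: name, id, min, max (copied from the table above). */
struct knob { const char *name; int id; int min; int max; };

const struct knob KNOBS[] = {
    {"octave", 0, -2, 2},       {"egint", 1, -63, 63},
    {"porta", 2, 0, 127},       {"pitch.in", 3, -63, 63},
    {"pitch1.in", 4, -63, 63},  {"pitch2.in", 5, -63, 63},
    {"vco1", 6, 0, 3},          {"vco2", 7, 0, 3},
    {"vco2pitch", 8, -63, 63},  {"balance", 9, 0, 127},
    {"vcosync", 10, 0, 1},      {"cutoff", 11, 0, 127},
    {"peak", 12, 0, 127},       {"vcftype", 13, 0, 2},
    {"egint", 14, -63, 63},     {"cutoff.in", 15, -63, 63},
    {"vcaeg", 16, 0, 1},        {"vca.in", 17, -63, 63},
    {"drive", 18, 0, 127},      {"level", 19, 0, 127},
    {"attack", 20, 0, 127},     {"decay", 21, 0, 127},
    {"sustain", 22, 0, 127},    {"release", 23, 0, 127},
    {"mg.freq", 24, 0, 127},    {"mg.bpm", 25, 0, 1},
};
#define KNOB_COUNT (sizeof KNOBS / sizeof KNOBS[0])

/* Clamp a value into the knob's range; unknown ids yield 0. */
int knob_clamp(int id, int value)
{
    if (id < 0 || (size_t)id >= KNOB_COUNT)
        return 0;
    assert(KNOBS[id].id == id);          /* the table is indexed by id */
    if (value < KNOBS[id].min) return KNOBS[id].min;
    if (value > KNOBS[id].max) return KNOBS[id].max;
    return value;
}
```

For example, `knob_clamp(8, -100)` gives -63 (vco2pitch is bipolar) and `knob_clamp(13, 5)` gives 2 (vcftype has three settings).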
DRUM
+10 how many samples to render (u16)
+30 pointer to volblock
+34 play speed
+38 offset
+40 sample length (=0x8000)
+44 sample buffer address
DRUMvolblock (022498C0)
+14 volume
synth iteration table (01FF86A0)
+0 synth table pointer (01FF87E0)
+14 number of synths
+28 synth handler table
synth generic table
+0 synth method table
synth method table
+c synth render handler
+18 knob movement handler
+28 noteon event handling
SYN
+28 function for handling events? [=> 0207546C]
+2C ?? stored by synth event handler (event 4)
+30 pointer to envelope
+34 pointer to LFO
+3C pointer to pitch control
+40 pointer to struct OSC
+44 pointer to struct filter?
+48 pointer to AMP
+320 seems to be lfo data
+2A8 envelope value
+84 frequency!? could be pointer to memory, it's chromatic
+8 trigger? (number of events in event area)
+4 keyjazz pointer (pointer to list of events)
+0 function method table?
+52 used in noteon_to_env
+50 1 if first bar of pattern repeat
ENV
+0 envelope value
+40 attack
+e0 portamento
PITCHCTRL
+18 NOTECTRL
+38 period for osc1? (from code)
+3c period for osc2? (from code)
NOTECTRL
+4 oldpitch
+0 newpitch (note << 15)
+C inverse portamento?
+10 direction (0 down, 1 up)
OSC
+2C VCO1 type (0..3)
+30 VCO2 type (0..3)
+34 OSC balance (midi)
+38 VCO1 volume (15bit)
+3C VCO2 volume (15bit)
+20,28,64,6c vco2 pitch
44,
---
+24 [speculation]
+1c input to divisor
+c jumptable on this
+48 enum [0-3]
LFO
+28 BPM
FILTER (OSC+C0)
+1C VCF cutoff
+20,24 VCF peak (precomputed?)
+28 cutoff precomputed coefficient 1
+40 filter type (0=lowpass,1=bandpass,2=highpass)
---
+24 coef1
+34 highpass output
+38 lowpass output
AMP (OSC+160)
+4 ?? [values 0..3]
+8 which envelope segment are we using [env+[AMP+8]*4]
+C "extra distortion" switch
+10 pointer to env
+14 ?? [pointer to some table offset by [AMP+4]*4]
+18 VCA LEVEL, knob*0x20408
+24 VCA EG
+2C [speculation] current amp envelope value
+30 [speculation] envelope increment
+34 over-driver output from amp first stage
+38 drive coeff 1, 15bit, unsigned ((knob*0x102)^2>>15)
+3C drive coeff 2, 15bit, unsigned (0x7fff-knob*0xe8)
this value is actually wet/dry knob (0x7fff is dry)
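The three knob-to-AMP formulas noted above (+18, +38, +3C) are easy to check numerically. A minimal sketch, assuming the knob is the raw 0..127 value and that the game does the arithmetic in this order; the function names are mine, not from the binary.

```c
#include <assert.h>
#include <stdint.h>

/* AMP+18: VCA level = knob * 0x20408 */
uint32_t amp_vca_level(unsigned knob)
{
    assert(knob <= 127);
    return knob * 0x20408u;
}

/* AMP+38: drive coeff 1 = (knob * 0x102)^2 >> 15, 15 bit unsigned */
uint32_t amp_drive_coeff1(unsigned knob)
{
    assert(knob <= 127);
    uint32_t x = knob * 0x102u;          /* at most 32766, so x*x fits in 32 bits */
    return (x * x) >> 15;
}

/* AMP+3C: drive coeff 2 = 0x7fff - knob * 0xe8 (0x7fff means fully dry) */
uint32_t amp_drive_coeff2(unsigned knob)
{
    assert(knob <= 127);
    return 0x7fffu - knob * 0xe8u;
}
```

With the knob at 127 this gives coeff 1 = 32764 and coeff 2 = 3303, so the first coefficient nearly saturates 15 bits while the second never reaches zero.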
MIXER 0224A1A0
+8 number of events
+30 used by mixer, volume of sorts, (?)
+34 H? dry/wet knob, same as effect
+50+4*chan mixvol/R (L-R)
+68+4*chan mixvol/L
+80 pointer to effect data
+84 effect enabled on [0..syn1, 1..syn2, 2..syn1+syn2, 3..drums, 4..all, 5..off]
+88 used by effect, halfword (?)
+C0 event queue
EFFECT 0224a480
+C H effect type [0 delay, 1 chorus, 2 flanger]
+E H Delay sync [0 off]
+10 H Delay time
+12 HS Delay L/R ratio [ffc1..003f]
+14 H Flanger+Chorus LFO
+16 H Flanger+Chorus Depth
+18 HS Delay+Flanger Feedback [ffc1..003f]
+1A H dry/wet [0000..007f]
0224a480: 0228a5a0 0228a5e0 0228a620 00010000
0224a490: 000d0003 003e0017 00110010 00000001
SYNTH EVENTS
__ __ __ 01 __ __ __ __ 02075524 noteoff
__ __ __ 02 __ __ __ nn 02075534 noteon
__ __ __ 04 bb bb bb bb 02075588 set bpm
__ __ __ 05 __ __ __ __ 020755a0 suppress lfo reset? force lfo reset?
__ __ __ 18 __ __ __ __ 020755b8 reset envelope, lfo?
__ __ __ 1c __ __ __ __ 020755e4 reset bpm-synced lfo if first bar
__ __ __ 1d __ __ __ __ 020755d4 set "first bar repetition" flag
- legato is done by changing the note without stopping (02 02 01)
- bar start: 1c
- pattern change: 1d
- play: 04 18
- stop: 19 (abrupt stop: 05 19? only when song not playing)
- abrupt pattern change: 05 (usually 05 05)
- when shaping a drum: 1e (what is that?)
MIXER EVENTS
__ __ cc 06 __ __ vv 07 ; set volume (0..7f) on channel
__ __ cc 06 __ __ pp 0a ; set pan (0..7f) on channel
- mixer also gets all the song control events (5, 18, 19, 1c, 1d)
DRUM EVENTS
__ __ __ 01 __ __ __ __ 02075524 noteoff
__ __ __ 02 __ __ __ nn 02075534 noteon
- the sequencer sets a noteoff before each noteon and optionally a
noteoff after some time (if it's not a one-shot)
- also accepts all song control events (TODO: which ones?)
- pan/volume events are sent to mixer
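The event layouts listed above boil down to two 32-bit words with the event type in the low byte of the first one. A hedged sketch of encoders for them: `struct ds10_event` and the helper names are invented here, and zeroing the unused bytes assumes they are ignored (in the memory dumps they seem to hold leftover data).

```c
#include <assert.h>
#include <stdint.h>

struct ds10_event { uint32_t w0, w1; };

enum { EV_NOTEOFF = 0x01, EV_NOTEON = 0x02 };
enum { MIX_SELECT = 0x06, MIX_VOLUME = 0x07, MIX_PAN = 0x0a };

/* __ __ __ 02 | __ __ __ nn */
struct ds10_event ev_noteon(uint8_t note)
{
    struct ds10_event e = { EV_NOTEON, note };
    return e;
}

/* __ __ __ 01 | __ __ __ __ (0xff is what the dumps show for "no note") */
struct ds10_event ev_noteoff(void)
{
    struct ds10_event e = { EV_NOTEOFF, 0xff };
    return e;
}

/* __ __ cc 06 | __ __ vv 07 (volume) or __ __ vv 0a (pan) */
struct ds10_event ev_mixer(uint8_t chan, uint8_t ctl, uint8_t value)
{
    struct ds10_event e;
    assert(ctl == MIX_VOLUME || ctl == MIX_PAN);
    assert(value <= 0x7f);
    e.w0 = ((uint32_t)chan << 8) | MIX_SELECT;
    e.w1 = ((uint32_t)value << 8) | ctl;
    return e;
}
```

So full volume on channel 3 would be `ev_mixer(3, MIX_VOLUME, 0x7f)`, i.e. the words 0x00000306 and 0x00007f07.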
Code seems to be divided into sections (first OSC, then FILTER, AMP...).
Envelopes seem to be computed just once per frame (they are not per-sample).
MCR p15 0 Rd c7 c10 2 Clean D$ Line by Set/Way
MCR p15 0 Rd c7 c10 4 Data Synchronization Barrier
Synth pointer:
--- DUMP AT 022489a0 ---
0000: 02087cec 02248a20 00004000 00000201 00000040 00007fd8 00008000 01ff8b40
0020: 00000040 00007fd8 00000002 00000075 02248c40 02248cc0 02248d20 02248d60
0040: 02248de0 02248ea0 02248f40 02248fa0 00000000 00000000 00007fff 00007fff
0060: 02248c20 02248980 00000220 00007fff 00000001 00000000 00000000 00000000
Things that need to be found:
- where does R4/SYN pointer come from
- what's the encoding for "frequency"
- dump structures in sampler
=> routine which renders "samples" is short and effective (0x02072354)
=> it does "drop sample" interpolation
That's it for today.
Sat Jun 10 20:22:20 CEST 2017
I'm trying to figure out how to trigger sample from keyboard. Memory write to
envelope register isn't enough, some code probably needs to be executed as
well to initialize the counters.
02248c44 00000000 attack
00000001 sustain?
00000103 release
Mon Jun 12 09:00:38 CEST 2017
Silly me - I know what I did wrong. I was writing "sustain" instead of
"attack" into the "envelope state" register. Now I can trigger the
last note from keyboard. Although just 4/5 of key-presses go through...
I need to speed up debugging, writing breakpoints into emulator source just
doesn't cut it anymore.
MCR p15, 0, , c7, c0, 4 Wait for interrupt
Trying to figure out what triggers the envelope:
IPC7 time=1168279224 send FIFO 0xC02400C6 size 000 (l 0x8501, tail 10) (r
0x8501, tail 13)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000001 lastipc=c02400c6
IPC9 time=1168298016 send FIFO 0x8226CA07 size 000 (l 0x8501, tail 13) (r
0x8501, tail 11)
irq proc=1 num=18
ARM9: trigger addr=022489a8 val=00000002 lastipc=c02400c6
ARM9: trigger addr=022489a8 val=00000003 lastipc=c02400c6
irq proc=1 num=4
irq proc=1 num=4
IPC7 time=1168377409 send FIFO 0x00004047 size 000 (l 0x8501, tail 11) (r
0x8501, tail 14)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000004 lastipc=00004047
Wed Jun 14 14:44:28 CEST 2017
I wonder how hard it must have been to reverse engineer something just using
gdb or a similar command-line debugger.
Anyway, still working on task 2 - need to figure out how to trigger note-on
manually.
There are two suspicious places:
1. "keyjazz" area
02248a20: 02070001 02077eff // note off
02248a20: 00000002 02247f35 // note on
2. envelope mem
02248c40: 02248c80 00000001
Let's do the keyjazz area first
- the lowest byte goes from 29 (low F) to 40 (high E), FF when no note is
pressed [this is midi note]
- it seems to be a pointer to memory (which itself is an array of pointers)
- it is written periodically (when in keyjazz mode) or every song-tick (when
in song-play mode)
- the keyjazz is read from 01FF9EC8 and 01FF9ECC (seriously, what is that
memory? is it ipc? print memory map!)
- oh shit, that's stack! that explains a lot actually
ARM9:
<02 00 0000 ITCM, mod 0x8000
04 __ ____ IO
02 __ ____
- so I'm plotting more graphs and they are a thing of beauty...
(the picture illustrates what a dispatcher looks like)
- i need to go over 4007 sound generation again to find out how are these
values used
- the note (lowest byte) is read from 0230ADA9
0207025C procedure, R2 => pointer to <event,keypress> pair
- ok, so this is solved: 0230ADA4
[-- 01 00 -- | ff -- -- -- ] noteoff
[-- 02 00 -- | ff 35 -- -- ] noteon
- so for instance, this plays a note:
MMU_write32(0, 0x0230ADA4, 0x00000200);
MMU_write32(0, 0x0230ADA8, 0x000035ff);
- now let's go figure out how does it get interpreted by the synth code (aka
who reads 02248a20)
- oh, now i know: it's not "keyjazz" area, it's a list of events! and trigger
is "number of events pending"
- pitchctrl area:
02248d20: 001a0000 001a0000 00000000 00007fff 00000001 00000000 00000000 00000000
02248d40: 02248dc0 02248d00 00000080 00000000 00000000 00000000 00000000 00000000
02248d60: 02087bc0 00000000 00000000 00000000 02248c40 02248cc0 02248d20 00000000
02248d80: 00000040 00000040 00000040 00000040 00000000 00000059 014a07ca 01ee8052
02248da0: 00000004 00000004 00000004 00000002 00000002 00000002 00000000 00000000
02248dc0: 02248e80 02248d40 000000c0 00000000 00000000 00000000 00000000 00000000
- ok, this is it for today
Wed Jun 21 17:36:30 CEST 2017
Just a quick one for today: I'm looking for codepath that is responsible for
decoding touch operations into knob value.
Address of "PEAK" for SYN1: 02248ec0
Yeah, there's a parameter set procedure: 02075648 - seems like a big switch{}
Thu Jun 22 10:25:56 CEST 2017
I had a few ideas yesterday that I would like to implement today and tomorrow:
- watchpoints in emulator
- to see what knobs are set and when
=> ok, added knob table
- midi input to emulator: another thread that reads events from socket
- feed it from sdlkeys
- add support for knobs sdlkeys
- fonts
- drawing circles
- same gui algorithm "midiseq" had
- reading keys on console in desmume:
- something to enable/disable tracing into file
- viewing memory
- setting watchpoints etc.
- try
- newer desmume
- lua
- gdb
Ok, so basic keyjazz is working:
noteon:
MMU_write_mem(0x02248a20, 0x00000002);
MMU_write_mem(0x02248a24, 0x02247f39);
MMU_write_mem(0x022489a8, 1);
noteoff:
MMU_write_mem(0x02248a20, 0x02070001);
MMU_write_mem(0x02248a24, 0x02077eff);
MMU_write_mem(0x022489a8, 1);
and disable writes to the event registers.
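The write sequence above generalizes to "fill the event slot at +80, then bump the pending count at +8". Here is a sketch against a fake RAM array instead of the emulator: `fake_ram`, `mem_write8`, `mem_write32` and `post_event` are stand-ins I made up, not DeSmuME functions, and it writes the count as a single byte (the event dispatcher loads a byte from +8) and only ever fills the first slot.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_BASE 0x02000000u
#define RAM_SIZE 0x00400000u             /* 4 MB of main RAM */

uint8_t fake_ram[RAM_SIZE];              /* stand-in for emulator memory */

void mem_write8(uint32_t addr, uint8_t v)
{
    assert(addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE);
    fake_ram[addr - RAM_BASE] = v;
}

uint8_t mem_read8(uint32_t addr)
{
    assert(addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE);
    return fake_ram[addr - RAM_BASE];
}

/* The DS is little-endian: lowest byte goes to the lowest address. */
void mem_write32(uint32_t addr, uint32_t v)
{
    for (unsigned i = 0; i < 4; i++)
        mem_write8(addr + i, (uint8_t)(v >> (8 * i)));
}

uint32_t mem_read32(uint32_t addr)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < 4; i++)
        v |= (uint32_t)mem_read8(addr + i) << (8 * i);
    return v;
}

enum { SYN1 = 0x022489A0, SYN_EVCOUNT_OFF = 0x08, SYN_EVENTS_OFF = 0x80 };

/* Put one event into the first slot and mark it pending.
 * Returns the pending count that was written. */
uint8_t post_event(uint32_t synth, uint32_t w0, uint32_t w1)
{
    mem_write32(synth + SYN_EVENTS_OFF, w0);
    mem_write32(synth + SYN_EVENTS_OFF + 4, w1);
    mem_write8(synth + SYN_EVCOUNT_OFF, 1);
    return mem_read8(synth + SYN_EVCOUNT_OFF);
}
```

`post_event(SYN1, 0x00000002, 0x02247f39)` reproduces the noteon writes to 02248a20, 02248a24 and 022489a8.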
Now let's implement "midi" via sdlkeys:
- we need some thread to open & read from "socket" with events (from sdlkeys)
- maybe use midi instead?
=> ok, midi it is (via snd-virmidi and open(2))
- threads?
SDL_CreateThread()
SDL_WaitThread()
Fri Jun 23 13:58:10 CEST 2017
- midi input works, trying to figure out how to send velocity via ds10 event
- apparently, there's one additional block I haven't considered: the mixer
- it gets driven by the sequencer
- it has also inputs from "mixer panel"
- so far the components are:
- sequencer - don't care
- synth - memmap mostly done, mapped inputs and outputs
- drums - TODO (albeit the code is quite short)
- effects - TODO
- mixer - TODO
- let's disassemble the drums
- consult the diagram:
- one block is 64 samples (= 128 bytes = half of dma buffer (double buffering))
0x2073150 looks like mixer head (64 reps)
=> in fact mixer and effect code
0x2072354 looks like sampler head (4*64 reps)
=> it is
0x2018054 32*4 reps
- the sampler code is fairly easy, but there is some weird stuff going on there as well
- time to do audioprobe stuff!
- ok, the "weird stuff" is linear interpolation (but it doesn't sound very interpolated)
- question is, how to trigger drums
- well that was easy:
pitch = 0x8000 * pow(2, (note-60)/12.0);
MMU_write_mem(0x02249620+0x34, pitch);
MMU_write_mem(0x02249620+0x38, 0);
- now i've got an idea: take sampler output and use it as waveform in syn1
- well, that's it for today, off to the theater
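The drum play-speed formula above (0x8000 at note 60, doubling per octave) can also be done without floating point. Note this is a swapped-in technique, not what the snippet above does: a 12-entry table of round(0x8000 * 2^(k/12)) plus an octave shift, so results may differ by one unit from the truncated pow() value. `drum_play_speed` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* round(0x8000 * 2^(k/12)) for k = 0..11 */
const uint32_t SEMITONE_Q15[12] = {
    32768, 34716, 36781, 38968, 41285, 43740,
    46341, 49097, 52016, 55109, 58386, 61858,
};

/* Play speed for a MIDI note; 0x8000 means original sample rate. */
uint32_t drum_play_speed(int note)
{
    assert(note >= 0 && note <= 127);
    int d = note - 60;
    int octave = d / 12;
    int semi = d % 12;
    if (semi < 0) {                      /* C division truncates toward zero */
        semi += 12;
        octave -= 1;
    }
    uint32_t base = SEMITONE_Q15[semi];
    return octave >= 0 ? base << octave : base >> -octave;
}
```

Note 72 gives 0x10000 (double speed), note 48 gives 0x4000 (half speed).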
Tue Oct 3 23:17:25 CEST 2017
- bored, doing some static analysis
- TODO:
- dump 02248CC0 (used by AMP to calc something)
- figure out what is the range of "drive"
- figure out AMP+2c, AMP+30 (some kind of integrator?)
- need to find address of "knob_value -> amp structure values" routine
- 02075650, trace_knob_move
AMP:
- amplifier precomputation
- stored: 28, 2C, 30
- loaded: 4, 8, 10, 14, 18, 1c, 24, 28
Wed Oct 4 09:40:05 CEST 2017
- trying to get gdb running
./desmume-cli --arm9gdb 12345 ds10plus.nds
arm-none-eabi-gdb
- watch what are envelopes doing: 0x2248c40
- NOTES:
SMULL low,high,a,b
R13=SP
R14=LR
- how is amp balance computed?
- the formula gives something like
left=0.533*(1-bal^2)
right=0.533*(1-(1-bal)^2)
which is reasonably good, but i cannot reason the approximation in any way
- the correct way would probably be [cos(bal*pi/2), sin(bal*pi/2)]
cos phi ~= 1 - phi^2/2
... the constants are changed so it would work for [0..1] instead of radians and for bal=1 it would be 0
- still not sure why the 0x4444 scaling factor
- TODO: try to enable extra distortion 02248f4c=1
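To play with the balance approximation noted above, here is a tiny sketch of the two gain curves. It uses the 0.533 constant from the notes as-is (0x4444/0x8000 is about 0.5333, which is presumably where it comes from, but that is a guess), takes bal in [0..1], and the names are made up.

```c
#include <assert.h>

struct pan_gain { double left, right; };

/* left = 0.533*(1 - bal^2), right = 0.533*(1 - (1-bal)^2) */
struct pan_gain balance_gains(double bal)
{
    struct pan_gain g;
    double inv = 1.0 - bal;
    assert(bal >= 0.0 && bal <= 1.0);
    g.left = 0.533 * (1.0 - bal * bal);
    g.right = 0.533 * (1.0 - inv * inv);
    return g;
}
```

At bal=0 only the left side is audible, at bal=1 only the right, and at bal=0.5 both sit at 0.533*0.75, which is where this differs most visibly from an equal-power cos/sin law.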
Fri Jun 8 14:15:06 CEST 2018
- almost a year later, I've got a sudden urge to continue
- the plan is now: pick some minimalistic ARM emulator, dump main memory (from within
desmume) and try to run it (and generate some sound)
- the minimalistic emulator in question could be this one:
https://code.9front.org/hg/plan9front/file/b974afb1648d/sys/src/cmd/5e/arm.c
Sat Jun 9 08:50:46 CEST 2018
- the emulator from 9front sort-of works, although I needed to add support for
some instructions that were tested/used/emitted by plan9 ARM compiler (BLX?)
- I sort-of understood how the mainloop after IRQ handler invocation goes
through all synths (2 main, 4 drum synths) and calls their handlers, then
goes through mixer, effect, etc.
- plan for today is:
- compile desmume from git
- dump ITCM (01FFxxxx range), because it contains buffers and
interrupt handler
- fix loading of binary segments in ds10emu
- implement DIVCNT IO register in ds10emu (might require io-register
handler ... or that trick with handler pagefault)
[...]
- So I did most of that, manually triggering synth1 from ds10emu - and I've
got some audio output! The bad news is, the output is severely garbled,
oscillator not oscillating properly and filter filtering-out nothing.
Envelope works though.
- another try. this time I've got a fault on DIVCNT (hardware division
accelerator), which I forgot to implement.
- ok, turns out I had a bug in a playback routine that didn't advance the
address when reading the buffer (thus the envelope being ok). the bad news
is, that the output is even more garbled than before.
- added audio probe (just a trigger: R15==addr => play(R[2])) and oscillator
works fine
- filter doesn't, output is severely garbled. my guess is: broken SMULL
instruction
- doesn't seem to be the case... let's try to do a filter bypass... nope,
still shit
0207526C:
LDR R2, [R7, #5C]
LDR R8, [R7, #34]
LDR R11, [R7, #38]
- looks like coefficients that were saved previously (R8, R11)
- they get saved later in highpass/lowpass section
- R8, R11 are later mostly just clamped values
- these values look very clamped in output of ds10emu
- R7 is kept in register pointing to filter section throughout the code
- ok, trying to compare various states to be able to check results
- filter coefficient address:
02248EA0+34 = 02248ED4
02248EA0+38 = 02248ED8
- ds10emu:
0207526c R8:00babd4b R11:ff793abc
- 02000000.bin:
$ hexdump -C 02000000.bin | grep -i "^00248ed0"
00248ed0 00 00 00 00 4b bd ba 00 bc 3a 79 ff 00 00 00 00 |....K....:y.....|
... so matches ds10emu values ...
- desmume
- although the state is same, it doesn't match. got to debug
who changes it
- ok... the savestate is from middle of rendering stuff, I
have to get a better savestate
- again:
- desmume:
0207526C R8:00EC9573 R11:00E224D1
- bin:
00248ed0 00 00 00 00 73 95 ec 00 d1 24 e2 00 00 00 00 00 |....s....$......|
- ds10emu:
0207526c R8:00ec9573 R11:00e224d1
- woot! now I can just do a register diff between desmume and my emu
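Checking the dump against the registers by eye works, but the byte swap is easy to get wrong. A small sketch of the lookup, assuming 02000000.bin is a flat image of main RAM starting at 0x02000000; `dump_offset`, `read_le32` and the sample row (copied from the first hexdump above) are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row 00248ed0 of 02000000.bin, as printed by hexdump -C. */
const uint8_t DUMP_ROW_248ED0[16] = {
    0x00, 0x00, 0x00, 0x00, 0x4b, 0xbd, 0xba, 0x00,
    0xbc, 0x3a, 0x79, 0xff, 0x00, 0x00, 0x00, 0x00,
};

/* File offset of a main-RAM address inside the dump. */
size_t dump_offset(uint32_t addr)
{
    assert(addr >= 0x02000000u);
    return (size_t)(addr - 0x02000000u);
}

/* Assemble a little-endian 32-bit word from four bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

FILTER+34 lives at 02248ED4, i.e. file offset 0x248ED4, which is byte 4 of the row above and reads back as 00babd4b.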
- perfect, there's a difference in the second computed sample
- hah! fucking arithmetic shift right didn't do sign extension
E1A0584 MOV R7, R3, LSL #1
- instruction encoding:
cccc 00Io oooS nnnn dddd ssss ssss ssss
____ 0011 101S
- hmm the instruction encoding is less obvious than I thought
either it's immediate (1<<25)
or it isn't
0 0 0 0 0 0 0 0 Rm register
i i i i i 0 0 0 Rm SHL by immediate
r r r r 0 0 0 1 Rm SHL by register
i i i i i 0 1 0 Rm LSR by immediate
r r r r 0 0 1 1 Rm LSR by register
i i i i i 1 0 0 Rm ASR by immediate
r r r r 0 1 0 1 Rm ASR by register
i i i i i 1 1 0 Rm ROR by immediate
RRX if i=0
r r r r 0 1 1 1 Rm ROR by register
- WORKS! bug was in casting uint32_t to long before shifting (and long is
64bit on arm64, so it wouldn't get shifted right)
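The bug above is the classic one: `(long)x` on a `uint32_t` zero-extends to 64 bits, so the sign bit is gone before the shift happens. A sketch of a shifter-operand decoder for the immediate-shift rows of the table above, with an ASR that does not depend on the width of `long`. Function names are mine, and the carry-out is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right on a 32-bit value, done entirely unsigned. */
uint32_t asr32(uint32_t v, unsigned n)
{
    if (n == 0)
        return v;
    if (n >= 32)
        return (v & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    uint32_t r = v >> n;
    if (v & 0x80000000u)
        r |= ~(0xFFFFFFFFu >> n);        /* refill the top n bits with the sign */
    return r;
}

/* Operand 2 for "Rm, <shift> #imm" (bit 25 = 0, bit 4 = 0).
 * rm is the current value of register Rm. */
uint32_t shifter_operand_imm(uint32_t insn, uint32_t rm, unsigned carry_in)
{
    assert((insn & (1u << 25)) == 0);    /* not an immediate operand */
    assert((insn & (1u << 4)) == 0);     /* not shift-by-register */
    unsigned amount = (insn >> 7) & 0x1f;

    switch ((insn >> 5) & 3) {
    case 0:                              /* LSL, #0 means "no shift" */
        return rm << amount;
    case 1:                              /* LSR, #0 encodes LSR #32 */
        return amount ? rm >> amount : 0;
    case 2:                              /* ASR, #0 encodes ASR #32 */
        return asr32(rm, amount ? amount : 32);
    default:                             /* ROR, #0 encodes RRX */
        if (amount == 0)
            return ((uint32_t)(carry_in & 1) << 31) | (rm >> 1);
        return (rm >> amount) | (rm << (32 - amount));
    }
}
```

With R3 = ff793abc, the encoding E1A070C3 (MOV R7, R3, ASR #1) yields ffbc9d5e; the broken version produced 7fbc9d5e.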
[...]
- another mystery: I added a synth noteon trigger and now it jumps to address
"000000004", which it fetched from stack
ARM9 02075540 EBFFF299 BL 02071FAC
-> R14 should be 02075544
- ok, there's a bug that BX instruction overwrites R14 with PC
- however! if I fix that, the synth stops working
- lol, this was my fuckup, i did implement BLX
- ok, so now the synth triggers on noteon/noteoff
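For reference, the BL from the trace above can be decoded by hand. In ARM state BL stores the address of the next instruction in R14 and jumps to PC+8 plus the sign-extended 24-bit offset times 4; plain BX never touches R14. A sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Branch target of a B/BL instruction located at address pc. */
uint32_t bl_target(uint32_t pc, uint32_t insn)
{
    assert(((insn >> 25) & 7) == 5);     /* bits 27..25 = 101: B or BL */
    uint32_t off = insn & 0x00FFFFFFu;
    if (off & 0x00800000u)
        off |= 0xFF000000u;              /* sign-extend 24 -> 32 bits */
    return pc + 8 + (off << 2);          /* unsigned wraparound is intended */
}

/* Value BL puts into R14: the instruction after the call. */
uint32_t bl_link(uint32_t pc)
{
    return pc + 4;
}
```

For EBFFF299 at 02075540 this gives target 02071FAC and link 02075544, matching the trace.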
- it would be nice to have knobs so let's work on that now
- some functions from the synth dispatch table:
02087cec 0207448c
02087cf0 02074590
02087cf4 020746e0
02087cf8 02074720 data render handler
02087cfc 0207469c
02087d00 02075930
02087d04 02075648 knob movement handler
02087d08 02075928
02087d0c 02075900
02087d10 0206ceb4
02087d14 0207546c noteon event handling (called from data renderer)
02087d18 02075640
- mixer dispatch table:
02087c7c 02072ce0
02087c80 02072d20
02087c84 02072dac
02087c88 02072e8c data render handler
02087c8c 02072d68
02087c90 02073888
02087c94 020736c0
02087c98 02073880
02087c9c 02073858
02087ca0 0206ceb4
02087ca4 020735d0 noteon event handling (called from data renderer)
02087ca8 020736b8
02087cac 00000000
02087cb0 00000000
- drum dispatch table:
02087bcc 020721f4
02087bd0 02072228
02087bd4 02072278
02087bd8 020722b8 render
02087bdc 02072264
02087be0 02072514
02087be4 020724e0 ?
02087be8 0206ba5c
02087bec 02049744
02087bf0 02072268
02087bf4 020723dc noteon?
02087bf8 020724d8
02087bfc 00000000
02087c00 00000000
Sun Jun 10 11:09:04 CEST 2018
- let's get the hardware knobs working
[...]
- ok, now I can control DS-10 from either keyboard or volca synth, with notes
and knobs working as expected. it can finally be played like a normal synth!
- what does the TODOLIST/roadmap look like now?
- importing patches from DS-10 save files
- saving and loading patches in my synth-emulator
- polyphonic synth!
- figuring out how drum synth and mixer works (what are the
knob-handlers etc.)
- decompile synthesizer into C, try to map structures and recompile it
for a different architecture
- make a hw DS-10-like controller (something like 60 knobs)
- try to get the STM32 blue-pill working and outputting audio
- now: let's make the import working
- the instruments seem to start around 0x38000 in DS-10+ save file
- the instrument description is a string of the ~30 knob settings
- in the trace_knob_move.txt trace (which is trace of a synth-state
being loaded) there's a loop that loads each of the knob via the
knob-dispatch routine
- only few parameters are "bipolar", in JUNKTT I have "VCO2pitch"
negative (id=8), so let's guess:
00038700 68 e0 a9 5d d9 3c 0c b5 1a f1 96 dd 3c a3 ad 99 |h..].<......<...|
00038710 4a 55 4e 4b 54 54 00 00 00 00 00 00 00 00 00 00 |JUNKTT..........|
00038720 00 00 00 03 00 00 01 02 e0 20 00 1b 4f 00 13 09 |......... ..O...|
00038730 01 00 03 69 46 42 7f 00 2d 00 01 00 00 00 00 00 |...iFB..-.......|
00038740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
- ok, starting offset is 0x20
- patch loading working. this is utterly awesome! I can now play all my
instruments I made on DS-10! and in hi-fi! (because despite the information
that the D/A converter is 16bit, I still think it's just filtered PWM, 12bit
tops)
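A sketch of the patch reader implied by the dump above. Everything about the layout is a guess that happens to fit this one instrument: a record that starts at the first dumped row, a name at +0x10, and one signed byte per knob from +0x20, indexed by knob id. The names and the sample record are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_NAME_OFF 0x10
#define PATCH_NAME_MAX 16
#define PATCH_KNOB_OFF 0x20
#define PATCH_KNOB_MAX 0x20              /* only 0x20 knob bytes in the sample */

/* First four rows of the JUNKTT record, copied from the hexdump. */
const uint8_t JUNKTT_RECORD[0x40] = {
    0x68, 0xe0, 0xa9, 0x5d, 0xd9, 0x3c, 0x0c, 0xb5,
    0x1a, 0xf1, 0x96, 0xdd, 0x3c, 0xa3, 0xad, 0x99,
    0x4a, 0x55, 0x4e, 0x4b, 0x54, 0x54, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x01, 0x02,
    0xe0, 0x20, 0x00, 0x1b, 0x4f, 0x00, 0x13, 0x09,
    0x01, 0x00, 0x03, 0x69, 0x46, 0x42, 0x7f, 0x00,
    0x2d, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Knob value by id, treating the stored byte as two's complement. */
int patch_knob(const uint8_t *rec, int id)
{
    assert(id >= 0 && id < PATCH_KNOB_MAX);
    uint8_t raw = rec[PATCH_KNOB_OFF + id];
    return raw >= 0x80 ? (int)raw - 0x100 : (int)raw;
}

/* Length of the NUL-padded name, at most PATCH_NAME_MAX. */
size_t patch_name_len(const uint8_t *rec)
{
    size_t n = 0;
    while (n < PATCH_NAME_MAX && rec[PATCH_NAME_OFF + n] != 0)
        n++;
    return n;
}

/* Nonzero if the record's name equals the given string. */
int patch_name_is(const uint8_t *rec, const char *name)
{
    size_t n = patch_name_len(rec);
    return strlen(name) == n && memcmp(rec + PATCH_NAME_OFF, name, n) == 0;
}
```

On the JUNKTT record, knob 8 (vco2pitch) comes out as -32, which is the negative value the guess was based on.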
- let's dig a bit into the mixer
0224a260: 02 fe xx 06 00 00 yy 07 ; volume
- xx is channel (0..5)
- yy is volume (00..7f)
: 02 35 xx 06 02 35 yy 0a ; panning
- yy is pan (00..7f)
: 00 00 xx 06 00 00 yy 07 ; mute channel? oh! volume again...
- trap: who writes into mixer: ... mostly the "event handler" code that gets
called by the mixer synth routine
-
- note: current emulator eats about 35% on a single voice, 25% with -O2
- maybe a good idea would be to do a trace of event writes on all synths:
1. find out event area
syn1
2. trace writes to all event areas
022489A0 syn1
+8 number of pending events (byte)
+80 event area
02248FE0 syn2
02249620 drum1
02249900 drum2
02249BE0 drum3
02249EC0 drum4
0224A1A0 mixer
+8 number of pending events
+c0 event area
0224A480 effects
- no event area? probably merged with mixer
01ff87e0 022489a0 02248fe0 02249620 02249900
01ff87f0 02249be0 02249ec0 00000000 00000000
* DRUM1 trigger:
ARM9: cond=evqueue addr=0224a224 val32=00000004 lastipc=00004047
^ this sets effect and is not a trigger
ARM9: cond=evqueue addr=022496a0 val32=00000202 lastipc=00004047
ARM9: cond=evqueue addr=022496a4 val32=02247f3c lastipc=00004047
ARM9: cond=evcount addr=02249628 val8=00000001 lastipc=00004047
Mon Jun 11 13:06:28 CEST 2018
- today I'm continuing with the event tracing
- as a part of ongoing experiments, I implemented a polyphonic DS-10 synth by
just creating three separated arm-emulator contexts and distributing
incoming notes amongst them (knobs distributing to all of them). and it
works surprisingly well. however it's too slow to run more than three
voices, so I should probably work on the decompilation part more
- drum synth works (when mapped to keyboard)
- even loading custom samples to drum synth works now! woot!
Tue Jun 12 09:20:57 CEST 2018
- trying retdec decompiler: https://retdec.com/
- the ds memory image files should have probably gone through some
preprocessing that would leave only relevant code, because
retdec chokes on the 4MB image
- idea: control flow extraction, and registers annotated with types
with type induction
A tool that I could really use right now is something that would extract code
chunks from the main image by walking down functions (with hints as to where
BX goes and how big are jumptables). I could then feed the code to either some
kind of "reassembler" (that would relocate the located code) or another tool
(like the retdec decompiler).
Simplest way to implement the above would be to make a tool that understands
jump instructions and descends the code recursively. Then I could just
extract the memory ranges by hand and check correctness with dynamic
execution. The problem with reassembly is that it would need to modify the data,
because function pointers are stored in synth data structures (and in method
tables).
This tool would also allow me, instead of decompiling the code, to recompile
the code to C (direct instr-to-C translation with indirect jumps solved by
static address->function translation table).
[...]
- Attempt to make effects work didn't work very well. I need to read some
tracedumps to understand better how it works.
[...]
I have a plan how to do the decompiler, but not the strength to implement it
now, so I'll just sketch it here for later.
1. take instruction parser from libdisarm, use it to determine whether
instruction is jump, data move, return, etc.
2. construct a call-graph incrementally: put start node into queue, then
repeat:
- take node from queue
- decode instruction at that address
- if it's conditional jump, make two edges (to next instruction,
to destination)
- if it's jump, make edge to destination
- otherwise, make edge to next instruction
- if it's computed jump (BX), take hints from "hintfile" (generated by
hand or by live tracing)
- if it's instruction manipulating R15, do some kind of special-case
handling:
- return (LDMIA) - mark as return
- jumptable (ADDLS) - take hint from hintfile about how big
that jumptable is
- literal (LD x, [R15]) - just assume the literal is constant
and fetch it from memory
3. this will construct a graph (use array of addresses to store node
pointers). walk it to check it's sane graph: no jumps between different
functions, etc.
4. decompile each function into one C function with variables like uint32_t
R2, R3...;, expand each instruction, put labels in places where edges
arrive and just goto there. do memory fetches and stores on array.
This is of course more complicated than it sounds:
- parameter needs to be handled (this could be work for hintfile)
- in/out/gen/kill sets need to be computed to see what gets passed
where
- R14 needs to be managed to ensure we return at correct places
- lots of small stuff
5. profit
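Step 2 of the plan above is a plain worklist walk. Here is a sketch on a toy instruction set so the queue logic can be seen in isolation: `struct insn` with four kinds stands in for what libdisarm would decode, instruction indices stand in for addresses, and hintfile handling for BX and jumptables is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_PLAIN, K_JUMP, K_CONDJUMP, K_RETURN };
struct insn { enum kind kind; size_t target; };
struct cfg_stats { size_t reached, edges; };

#define MAX_INSNS 64

/* Toy program: a loop with a conditional exit and one dead instruction. */
const struct insn SAMPLE_CODE[6] = {
    { K_PLAIN, 0 },      /* 0 */
    { K_CONDJUMP, 4 },   /* 1: falls through to 2 or jumps to 4 */
    { K_PLAIN, 0 },      /* 2 */
    { K_JUMP, 1 },       /* 3: back edge */
    { K_RETURN, 0 },     /* 4 */
    { K_PLAIN, 0 },      /* 5: never reached */
};

/* Breadth-first walk from entry; counts reachable nodes and edges. */
struct cfg_stats walk_cfg(const struct insn *code, size_t n, size_t entry)
{
    unsigned char visited[MAX_INSNS] = { 0 };
    size_t queue[MAX_INSNS];
    size_t head = 0, tail = 0;
    struct cfg_stats st = { 0, 0 };

    assert(n <= MAX_INSNS && entry < n);
    visited[entry] = 1;
    queue[tail++] = entry;               /* each node is queued at most once */

    while (head < tail) {
        size_t pc = queue[head++];
        size_t succ[2];
        size_t nsucc = 0;

        st.reached++;
        switch (code[pc].kind) {
        case K_PLAIN:    succ[nsucc++] = pc + 1; break;
        case K_JUMP:     succ[nsucc++] = code[pc].target; break;
        case K_CONDJUMP: succ[nsucc++] = pc + 1;
                         succ[nsucc++] = code[pc].target; break;
        case K_RETURN:   break;          /* no successors */
        }
        for (size_t i = 0; i < nsucc; i++) {
            if (succ[i] >= n)
                continue;                /* runs off the image, ignore */
            st.edges++;
            if (!visited[succ[i]]) {
                visited[succ[i]] = 1;
                queue[tail++] = succ[i];
            }
        }
    }
    return st;
}
```

On `SAMPLE_CODE` the walk reaches 5 of the 6 instructions over 5 edges; instruction 5 is never queued, which is the property that lets the tool leave unrelated code out.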
Wed Jun 13 11:16:26 CEST 2018
- I couldn't resist and wrote some rudimentary disassembler/control flow
extractor according to plan outlined yesterday.
- My current idea is not to try to guess much information, but instead to
translate the instructions one-to-one into a structure like this:
regjump: switch(R15) {
/* labels generated twice: once for register jumps (the switch),
once for direct jumps (the label) */
case 0x0204720:
lab0204720:
/* one stack array, maybe through some macro to check for overflows */
STACK[--R13] = R3;
STACK[--R13] = R4;
/* operation on registers translated directly */
R5 = R2 + R1;
R6 = R7 * R9;
lab020472c:
/* flags generated on demand (set solver, see below) */
C = R2 > R1;
/* conditioned instruction with if prefix */
if (C) R5 = 4;
if (C) goto lab0204720;
/* "function call" */
R14 = 0x0204738;
goto lab020490c;
case 0x0204738:
/* return from call */
R14 = STACK[R13++];
/* the decoder knows its R15, so it's not required to track that. */
/* it would be nice to decode literals though */
R15 = R14;
goto regjump;
}
The "when should flags be generated" problem could be solved by in/out/used/gen/kill
sets, which could be done by a bitmap solver.
Although, maybe the first version could do away with in/out/used... sets and
just generate them everywhere. The same for stack - let's just make it an
ordinary memory accessed with "vaddr".
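To convince myself the switch-plus-goto shape is legal C and behaves like a call and return, here is a tiny compilable instance of it. The three "instructions" and their addresses (0x100, 0x108, 0x200) are invented; only the shape (a `regjump` switch on R15, labels for direct jumps, R14 set by hand before a call) follows the sketch above.

```c
#include <assert.h>
#include <stdint.h>

/* Runs the translated fragment from entry and returns R0. */
uint32_t run_translated(uint32_t entry, uint32_t r0, uint32_t r1)
{
    uint32_t R0 = r0, R1 = r1, R14 = 0, R15 = entry;

regjump:
    switch (R15) {
    case 0x100:
        R0 = R0 + R1;                    /* 0x100: ADD R0, R0, R1 */
        R14 = 0x108;                     /* 0x104: BL 0x200, link = next insn */
        goto lab200;
    case 0x108:
        R0 = R0 * 2;                     /* 0x108: ADD R0, R0, R0 */
        return R0;                       /* end of the translated region */
    case 0x200:
    lab200:
        R0 = R0 + 1;                     /* 0x200: ADD R0, R0, #1 */
        R15 = R14;                       /* 0x204: BX LR */
        goto regjump;
    default:
        assert(!"jump to an address that was never translated");
        return 0;
    }
}
```

`run_translated(0x100, 2, 3)` computes (2+3), calls the subroutine (+1), returns to 0x108 and doubles, giving 12. Direct jumps use the label, the return goes through the switch.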
Thu Jun 14 23:40:09 CEST 2018
- working on implementation of alu function generator for arm->c translator
Fri Jun 15 13:23:31 CEST 2018
- implementing load/store instructions
- more ideas:
- track "use" of cmp"xx" predicates, so we can emit them instead of flags
- merge if(..){ } blocks
[...]
- so the code generator is outputting compilable code, I linked it to rest of
the emulator AND IT RUNS! it doesn't produce any sound, but it correctly
returns to the place it was called from, so that's something
- smull was mis-implemented (no sign extension for the second operand)
- WOW! it plays something very similar to what is actually supposed to! it's
a bit garbled, but who cares. this is the code in question
[...]
- ok, output is garbled. register trace compare matches. what to do? maybe
store is buggy?
ARM9 02075410 E0C270B2 STRH R7, [R2], #2
... yeah, that one was mis-implemented, because I borked post-increment (and
implemented it as pre-increment).
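Both bugs from this entry are about semantics that are easy to state in C. A sketch, with helper names of my own: SMULL must sign-extend both operands to 64 bits before multiplying, and a post-indexed STRH stores at the old base address and only then adds the offset.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MEM_SIZE 8
uint8_t demo_mem[DEMO_MEM_SIZE];         /* scratch memory for the stores */

/* Sign-extend a 32-bit register value without relying on casts. */
int64_t sext32(uint32_t v)
{
    return (int64_t)(v ^ 0x80000000u) - (int64_t)0x80000000u;
}

/* SMULL: returns hi:lo as one 64-bit value. */
uint64_t smull(uint32_t a, uint32_t b)
{
    return (uint64_t)(sext32(a) * sext32(b));
}

/* STRH Rd, [Rn], #step: store at Rn, then Rn += step. Returns new Rn. */
uint32_t strh_post(uint8_t *mem, uint32_t base, uint16_t value, uint32_t step)
{
    assert(base + 1 < DEMO_MEM_SIZE);
    mem[base] = (uint8_t)(value & 0xff);     /* little-endian halfword */
    mem[base + 1] = (uint8_t)(value >> 8);
    return base + step;
}

/* STRH Rd, [Rn, #step]!: Rn += step first, then store. Returns new Rn. */
uint32_t strh_pre(uint8_t *mem, uint32_t base, uint16_t value, uint32_t step)
{
    uint32_t addr = base + step;
    assert(addr + 1 < DEMO_MEM_SIZE);
    mem[addr] = (uint8_t)(value & 0xff);
    mem[addr + 1] = (uint8_t)(value >> 8);
    return addr;
}
```

Mixing the two forms up shifts every output sample by one slot, which would fit the garbled output with matching register traces: the registers end up identical, only memory differs.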
Sound synthesis (on synth1) now works decompiled!
What's the plan now?
- implement multiple input points to control-flow graph
- try to decompile "knob", "drum" and "mixer"
=> I can't do this now, I don't have knobs analyzed by hand
- draw them with graphviz
- implement subroutine checking
- subroutine is something called with link
- each subroutine has one entrypoint
- maybe try parameter detection and breaking the code into subroutines
- gen/kill/used sets computation
- implement value type and structures inverting
Decompiling knob-setting-dispatch function. Looks like each synth component
has its own sub-dispatch function (e.g. filter).
I need to rewrite the code-ordering function.
Latest decompilation attempt
Sun Jun 17 12:58:26 CEST 2018
- trying to fix the bitmap solver
- fixed
- now flags are output depending on what's used later in the code
- flags are outputted only when required (thanks to bitmap solver)
- instructions with the same if() condition are merged together
- now trying to replace "cmp ...; if (flags)" with the condition itself
- works! at least in most cases
- this is probably the furthest I can go without rewriting the
control-flow graph handling and manipulation. i would like to remove
all branches (why have a "b xxx" node? why not branch with the graph
itself?) and do some major overhaul to the decoder, because it's
really a pain in the ass to use.
- although maybe I can implement a subroutine chopper before I do that.
- I tried to test mumu (which is a kind of "shell" that manages sound
rendering, midi io etc.) with the latest code generated and the third noteon
almost always faults the whole emulator, so I'll probably need to fix that
first. First thing after I get back from vacation.
[...]
- I fixed a bug in code generator ... it seems that the code generator works
only by chance. I desperately need to rewrite it. Not necessarily change
the whole concept, but just make sure the design makes sense.
Thu Jun 21 12:45:57 CEST 2018
- back
- trying to fix the c-code generator
[...]
- OK, I just rewrote the main decoding loop and I'm not sure it's an
improvement. The only thing that's better is the order of the output code
(it's sorted by function address and grouped by subroutine). The rest is...
well not nice. The "link next" disappeared, but it was replaced by some
spaghetti logic. The problem is that the goto/label/subroutine logic was
mixed with if()/cc/conditional logic, and that was not a good idea. I need
to find a way to separate those two.
- In other news: I replaced the "vaddr" with direct memory access and it's
twice as fast.
- hmm i forgot the vaddr also handles memory-mapped IO devices like DIV and
SQRT accelerators... so it's faster, but it doesn't sound right
Fri Jun 22 17:55:10 CEST 2018
- after yesterday's debacle I decided I need to stop dicking around and just
get the emulator to work well enough so I can sequence it from ableton
- I have two ways: either make it into a VST, or make it run on my small
laptop and sequence it via MIDI (would require a SDL gui for configuring the
synth)
- first I'm gonna try the VST path. I found a nice tutorial (Martin Finke's
"Making audio plugins"), downloaded Visual Studio and I'm now trying to put
it all together
[...]
- so I used "SpaceBass" example as the basis for my synth, polyphony is
working, one knob is working. what I need to do now is:
- the filter is stepping. why is the filter stepping? how is it done
on real ds-10? I don't remember filter stepping. todo: dump all
events from knob when either turning it manually or sequencing from
kaoss x/y
- design a gui with all the knobs
- design "enum" knobs for all modulation possibilities
- design background and knobs that are similar to ds-10
- implement all knob events
- implement polyphony switch
- design
- need to allocate space for keyboard
- how big it is?
- need to test knobs
- need to rip those knobs & switches
- switch: 38x26
- switch hole: 58
- offset: -36
- knobs and switches working, more tomorrow
Sat Jun 23 08:03:38 CEST 2018
- got up early, working on vst
- oh crap, all globals are shared between instances! @(#&@#*&
- need to rewrite this... :(
- TODO:
* implement event handling
* implement polyphony changing
* implement mono voice stealing (priority=last)
* get rid of all globals
* implement resampler from 32KHz to 44KHz
* implement volume (velocity) on noteon
* optional "resampler"
* implement presets
* figure out how to get my old patches here
* embed ds-10 image into dll
* rename it
* clean it, release it, release source
- well, it's done!
- goodbye
Sun Jun 24 21:29:39 CEST 2018
- I added resample feature
- fixed velocity curve
- posted to r/synthesizers just for kicks
TODO:
- decompiler
- implement separate hintfile
- how to write a bitmap solver (draw graph and equations for each node)
- mumu
- implement midi note priority logic
- fix keyjazz (doesn't do noteoff)
- fix "Data" data structure
- make a SDL gui with knobs and keyjazz piano (with keyup/down)
- make a VST out of it (I have this ANSI C code - how do I make a VST with
knobs)
- what is stored in patch saves other than the synth configuration?