ARM7 - runs generic
Fifo commands:
Sat Jun 10 10:32:33 CEST 2017

On Thursday I managed to run DS-10 in DeSmuME without tearing in sound. The problem was that some instruction (or maybe memory type) had the wrong number of access cycles assigned. As a result, the processor wasn't fast enough to compute one sound "frame" (64*2 samples) within the assigned timeframe. When I set the timing of every instruction to 1, it works flawlessly (I need to fix it in the JIT as well).

I was generating some callgraphs to understand where each synth/drum is computed: ds10 call graph. And you can kinda see it from there: one frame is 64 samples, so if a node has 256 ins and 252 outs one way and 4 another, it's probably a 64-sample for loop done 4 times, etc.

One other nice thing: I trapped various instructions to send register contents to the soundcard - it's like probing a circuit with an oscilloscope. It would be nice if it could be done at run-time.

Plan for today:
- add support for subroutines to callgraph generator (or: buy IDA)
- print full register contents next to each instruction
- src/cli/main.cpp: DIS_AUDIO
- src/armcpu.cpp: x_armlog function
- make gdb working with desmume

OK, so here's another graph I made when fixing a bug with "overmerging" of nodes: graph.

70474bb824b56912cc419221aa8ae015 ds10plus.nds

Sat Jun 10 12:52:40 CEST 2017

Well, I probably shouldn't spend so much time on making tools to answer questions I don't have. But it makes nice graphs. I made another one and I swear this is the last one: now with more colors! Also script for analysis and the trace.

Misc notes:

02075648 parameter set procedure
           argument R0=synth pointer
           argument r1=knob id
           argument r2=knob value
           load filter pointer
0207E950 pitchtable
0207ED38 drumpitch table
02071fac set note (args: r0: pitch control, r1: note, r2: 0)
020722B8 drum handler method
02073150 mixer head?
02074720 synth handler method
02074F98 Head of synthesis loop
02075410 Store output of one synth
0207546C event dispatch function
           - r0 = synth pointer
           - load event count byte from r0+#8
           - load events from r0+#80 (each event 8 bytes long)
           if (event == 1) {
               env = synth->envelope;
               STRB #3, [env, #4]
               STRB #1, [env, #5]
           }
           if (val == 2) {
               note = keyjazz[4];
               /* fill frequency table */
               synth->pitch->note = note << 15;
               STRB #0, [env,4]
               STRB #0, [env,5]
           }
02075650 knob dispatch table
020758ec drive knob handler
02075ad4 recompute LFO from BPM (guess)
020807c4 filter coefficient table (151 entries)
02087CEC synth method table (1st synth)
02087c7c mixer method table?
01FF846C load next "machine" dispatcher and call it
01FF8444 load next synth ptr data
01FF86A0 synth iteration table
01FF87E0 synths table [list of all synth pointers], null terminated
01FF8820 buf1 (syn1)
01FF88A0 buf2 (syn2)
01FF8920 buf3 (drum1)
01FF89A0 buf4
01FF8A20 buf5
01FF8AA0 buf6
021b04a0 precomputed sample 1
0223D280 stored synth preset data
0223d2e0 another preset data? (maybe?)
02248680 (output) soundbuf
022489A0 Synth 1 Data (struct SYN)
02248A20 keyjazz area (SYN1)
02248c40 env/SYN1
02248CC0 LFO/SYN1
02248d60 pitchctrl/SYN1
02248d20 notectrl/SYN1
02248de0 OSC/SYN1
02248ea0 FILTER/SYN1
02248f40 AMP/SYN1
02248fa0 ???/SYN1
02248FE0 Synth 2 Data
02249620 drum1
022498C0 drum1 volblock
02249900 drum2
02249BE0 drum3
02249EC0 drum4
0224A1A0 MIXER
0224A480 effects
0226A580 delay buffer

knobs:
name            id  min  max
octave           0   -2    2
egint            1  -63   63
porta            2    0  127
pitch.in         3  -63   63
pitch1.in        4  -63   63
pitch2.in        5  -63   63
vco1             6    0    3
vco2             7    0    3
vco2pitch        8  -63   63
balance          9    0  127
vcosync         10    0    1
cutoff          11    0  127
peak            12    0  127
vcftype         13    0    2  [0=lpf]
egint           14  -63   63
cutoff.in       15  -63   63
vcaeg           16    0    1
vca.in          17  -63   63
drive           18    0  127
level           19    0  127
attack          20    0  127
decay           21    0  127
sustain         22    0  127
release         23    0  127
mg.freq         24    0  127
mg.bpm          25    0    1  [0=off]

inputs
pitch.in.hole   26
pitch1.in.hole  27
pitch2.in.hole  28
cutoff.in.hole  29
vca.in.hole     30

outputs
none     0
sine     1
saw      2
square   3
s&h      4
eg       5
vco2     6

DRUM
+10   how many samples to render (u16)
+30   pointer to volblock
+34   play speed
+38   offset
+40   sample length (=0x8000)
+44   sample buffer address

DRUMvolblock (022498C0)
+14   volume

synth iteration table (01FF86A0)
+0    synth table pointer (01FF87E0)
+14   number of synths
+28   synth handler table

synth generic table
+0    synth method table

synth method table
+c    synth render handler
+18   knob movement handler
+28   noteon event handling

SYN
+28   function for handling events? [=> 0207546C]
+2C   ?? stored by synth event handler (event 4)
+30   pointer to envelope
+34   pointer to LFO
+3C   pointer to pitch control
+40   pointer to struct OSC
+44   pointer to struct filter?
+48   pointer to AMP
+320  seems to be lfo data
+2A8  envelope value
+84   frequency!? could be pointer to memory, it's chromatic
+8    trigger? (number of events in event area)
+4    keyjazz pointer (pointer to list of events)
+0    function method table?
+52   used in noteon_to_env
+50   1 if first bar of pattern repeat

ENV
+0    envelope value
+40   attack
+e0   portamento

PITCHCTRL
+18   NOTECTRL
+38   period for osc1? (from code)
+3c   period for osc2? (from code)

NOTECTRL
+4    oldpitch
+0    newpitch (note << 15)
+C    inverse portamento?
+10   direction (0 down, 1 up)

OSC
+2C   VCO1 type (0..3)
+30   VCO2 type (0..3)
+34   OSC balance (midi)
+38   VCO1 volume (15bit)
+3C   VCO2 volume (15bit)
+20,28,64,6c vco2 pitch 44,
---
+24   [speculation]
+1c   input to divisor
+c    jumptable on this
+48   enum [0-3]

LFO
+28   BPM

FILTER (OSC+C0)
+1C   VCF cutoff
+20,24 VCF peak (precomputed?)
+28   cutoff precomputed coefficient 1
+40   filter type (0=lowpass,1=bandpass,2=highpass)
---
+24   coef1
+34   highpass output
+38   lowpass output

AMP (OSC+160)
+4    ?? [values 0..3]
+8    which envelope segment are we using [env+[AMP+8]*4]
+C    "extra distortion" switch
+10   pointer to env
+14   ?? [pointer to some table offset by [AMP+4]*4]
+18   VCA LEVEL, knob*0x20408
+24   VCA EG
+2C   [speculation] current amp envelope value
+30   [speculation] envelope increment
+34   over-driver output from amp first stage
+38   drive coeff 1, 15bit, unsigned ((knob*0x102)^2>>15)
+3C   drive coeff 2, 15bit, unsigned (0x7fff-knob*0xe8)
      this value is actually wet/dry knob (0x7fff is dry)

MIXER 0224A1A0
+8    number of events
+30   used by mixer, volume of sorts, (?)
+34   H? dry/wet knob, same as effect
+50+4*chan mixvol/R (L-R)
+68+4*chan mixvol/L
+80   pointer to effect data
+84   effect enabled on [0..syn1, 1..syn2, 2..syn1+syn2, 3..drums, 4..all, 5..off]
+88   used by effect, halfword (?)
+C0   event queue

EFFECT 0224a480
+C  H  effect type [0 delay, 1 chorus, 2 flanger]
+E  H  Delay sync [0 off]
+10 H  Delay time
+12 HS Delay L/R ratio [ffc1..003f]
+14 H  Flanger+Chorus LFO
+16 H  Flanger+Chorus Depth
+18 HS Delay+Flanger Feedback [ffc1..003f]
+1A H  dry/wet [0000..007f]

0224a480: 0228a5a0 0228a5e0 0228a620 00010000
0224a490: 000d0003 003e0017 00110010 00000001

SYNTH EVENTS
__ __ __ 01 __ __ __ __ 02075524 noteoff
__ __ __ 02 __ __ __ nn 02075534 noteon
__ __ __ 04 bb bb bb bb 02075588 set bpm
__ __ __ 05 __ __ __ __ 020755a0 suppress lfo reset? force lfo reset?
__ __ __ 18 __ __ __ __ 020755b8 reset envelope, lfo?
__ __ __ 1c __ __ __ __ 020755e4 reset bpm-synced lfo if first bar
__ __ __ 1d __ __ __ __ 020755d4 set "first bar repetition" flag
- legato is done by changing the note without stopping (02 02 01)
- bar start: 1c
- pattern change: 1d
- play: 04 18
- stop: 19 (abrupt stop: 05 19? only when song not playing)
- abrupt pattern change: 05 (usually 05 05)
- when shaping a drum: 1e (what is that?)

MIXER EVENTS
__ __ cc 06 __ __ vv 07 ; set volume (0..7f) on channel
__ __ cc 06 __ __ pp 0a ; set pan (0..7f) on channel
- mixer also gets all the song control events (5, 18, 19, 1c, 1d)

DRUM EVENTS
__ __ __ 01 __ __ __ __ 02075524 noteoff
__ __ __ 02 __ __ __ nn 02075534 noteon
- the sequencer sets a noteoff before each noteon and optionally a noteoff after some time (if it's not a one-shot)
- also accepts all song control events (TODO: which ones?)
- pan/volume events are sent to the mixer

Code seems to be divided into sections (first OSC, then FILTER, AMP...). Envelopes seem to be computed just once per frame (they are not per-sample).
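The synth event layout above, read together with the word pairs that show up in the SYN1 event area elsewhere in this log (00000002 02247f35 for a noteon, 02070001 02077eff for a noteoff), suggests that the event type sits in the low byte of the first word and the note in the low byte of the second. Here is a minimal sketch of encoding and decoding under that assumption. The struct and function names are made up for illustration, the upper bytes are treated as don't-care, and 0xFF as the "no note" value is a guess based on those dumps.

```c
#include <stdint.h>

/* Event types from the SYNTH EVENTS table; the names are hypothetical. */
enum { EV_NOTEOFF = 0x01, EV_NOTEON = 0x02, EV_SET_BPM = 0x04 };

/* One 8-byte event seen as two 32-bit words (assumed layout). */
struct ds10_event {
    uint32_t w0; /* low byte: event type */
    uint32_t w1; /* low byte: MIDI note for noteon */
};

/* Event type = low byte of the first word. */
static uint8_t ev_type(struct ds10_event e)
{
    return (uint8_t)(e.w0 & 0xFFu);
}

/* Note number = low byte of the second word. */
static uint8_t ev_note(struct ds10_event e)
{
    return (uint8_t)(e.w1 & 0xFFu);
}

/* Build a noteon event with all don't-care bytes zeroed. */
static struct ds10_event ev_make_noteon(uint8_t midi_note)
{
    struct ds10_event e = { EV_NOTEON, midi_note };
    return e;
}

/* Build a noteoff event; 0xFF as the note byte is an assumption. */
static struct ds10_event ev_make_noteoff(void)
{
    struct ds10_event e = { EV_NOTEOFF, 0xFFu };
    return e;
}
```

Writing such a pair into the event area and then bumping the event count byte at SYN+8 is what the keyjazz experiments later in the log do by hand.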
MCR p15 0 Rd c7 c10 2 Clean D$ Line by Set/Way
MCR p15 0 Rd c7 c10 4 Data Synchronization Barrier

Synth pointer:
--- DUMP AT 022489a0 ---
0000: 02087cec 02248a20 00004000 00000201 00000040 00007fd8 00008000 01ff8b40
0020: 00000040 00007fd8 00000002 00000075 02248c40 02248cc0 02248d20 02248d60
0040: 02248de0 02248ea0 02248f40 02248fa0 00000000 00000000 00007fff 00007fff
0060: 02248c20 02248980 00000220 00007fff 00000001 00000000 00000000 00000000

Things that need to be found:
- where does R4/SYN pointer come from
- what's the encoding for "frequency"
- dump structures in sampler
  => routine which renders "samples" is short and effective (0x02072354)
  => it does "drop sample" interpolation

That's it for today.

Sat Jun 10 20:22:20 CEST 2017

I'm trying to figure out how to trigger a sample from the keyboard. A memory write to the envelope register isn't enough; some code probably needs to be executed as well to initialize the counters.

02248c44 00000000 attack
         00000001 sustain?
         00000103 release

Mon Jun 12 09:00:38 CEST 2017

Silly me - I know what I did wrong. I was writing "sustain" instead of "attack" into the "envelope state" register. Now I can trigger the last note from the keyboard, although just 4/5 of key-presses go through... I need to speed up debugging; writing breakpoints into emulator source just doesn't cut it anymore.
MCR p15, 0, Rd, c7, c0, 4 Wait for interrupt

Trying to figure out what triggers the envelope:

IPC7 time=1168279224 send FIFO 0xC02400C6 size 000 (l 0x8501, tail 10) (r 0x8501, tail 13)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000001 lastipc=c02400c6
IPC9 time=1168298016 send FIFO 0x8226CA07 size 000 (l 0x8501, tail 13) (r 0x8501, tail 11)
irq proc=1 num=18
ARM9: trigger addr=022489a8 val=00000002 lastipc=c02400c6
ARM9: trigger addr=022489a8 val=00000003 lastipc=c02400c6
irq proc=1 num=4
irq proc=1 num=4
IPC7 time=1168377409 send FIFO 0x00004047 size 000 (l 0x8501, tail 11) (r 0x8501, tail 14)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000004 lastipc=00004047

Wed Jun 14 14:44:28 CEST 2017

I wonder how hard it must have been to reverse engineer something using just gdb or a similar command-line debugger. Anyway, still working on task 2 - need to figure out how to trigger note-on manually. There are two suspicious places:

1. "keyjazz" area
   02248a20: 02070001 02077eff // note off
   02248a20: 00000002 02247f35 // note on
2. envelope mem
   02248c40: 02248c80 00000001

Let's do the keyjazz area first
- the lowest byte goes from 29 (low F) to 40 (high E), FF when no note is pressed [this is midi note]
- it seems to be a pointer to memory (which itself is an array of pointers)
- it is written periodically (when in keyjazz mode) or every song-tick (when in song-play mode)
- the keyjazz is read from 01FF9EC8 and 01FF9ECC (seriously, what is that memory? is it ipc? print memory map!)
- oh shit, that's stack! that explains a lot actually
  ARM9: <02 00 0000 ITCM, mod 0x8000
         04 __ ____ IO
         02 __ ____
- so I'm plotting more graphs and they are a thing of beauty...
  (the picture illustrates what a dispatcher looks like)
- i need to go over 4007 sound generation again to find out how these values are used
- the note (lowest byte) is read from 0230ADA9
  0207025C procedure, R2 => pointer to <event,keypress> pair
- ok, so this is solved:
  0230ADA4 [-- 01 00 -- | ff -- -- -- ] noteoff
           [-- 02 00 -- | ff 35 -- -- ] noteon
- so for instance, this plays a note:
  MMU_write32(0, 0x0230ADA4, 0x00000200);
  MMU_write32(0, 0x0230ADA8, 0x000035ff);
- now let's go figure out how it gets interpreted by the synth code (aka who reads 02248a20)
- oh, now I know: it's not a "keyjazz" area, it's a list of events! and trigger is "number of events pending"
- pitchctrl area:
  02248d20: 001a0000 001a0000 00000000 00007fff 00000001 00000000 00000000 00000000
  02248d40: 02248dc0 02248d00 00000080 00000000 00000000 00000000 00000000 00000000
  02248d60: 02087bc0 00000000 00000000 00000000 02248c40 02248cc0 02248d20 00000000
  02248d80: 00000040 00000040 00000040 00000040 00000000 00000059 014a07ca 01ee8052
  02248da0: 00000004 00000004 00000004 00000002 00000002 00000002 00000000 00000000
  02248dc0: 02248e80 02248d40 000000c0 00000000 00000000 00000000 00000000 00000000
- ok, this is it for today

Wed Jun 21 17:36:30 CEST 2017

Just a quick one for today: I'm looking for the codepath that is responsible for decoding touch operations into knob values.

Address of "PEAK" for SYN1: 02248ec0

Yeah, there's a parameter set procedure: 02075648 - seems like a big switch{}

Thu Jun 22 10:25:56 CEST 2017

I had a few ideas yesterday that I would like to implement today and tomorrow:
- watchpoints in emulator - to see what knobs are set and when
  => ok, added knob table
- midi input to emulator: another thread that reads events from socket
  - feed it from sdlkeys
  - add support for knobs
sdlkeys
- fonts
- drawing circles
- same gui algorithm "midiseq" had
- reading keys on console in desmume:
  - something to enable/disable tracing into file
  - viewing memory
  - setting watchpoints etc.
- try
  - newer desmume
  - lua
  - gdb

Ok, so basic keyjazz is working:

noteon:
  MMU_write_mem(0x02248a20, 0x00000002);
  MMU_write_mem(0x02248a24, 0x02247f39);
  MMU_write_mem(0x022489a8, 1);
noteoff:
  MMU_write_mem(0x02248a20, 0x02070001);
  MMU_write_mem(0x02248a24, 0x02077eff);
  MMU_write_mem(0x022489a8, 1);

and disable writes to the event registers.

Now let's implement "midi" via sdlkeys:
- we need some thread to open & read from "socket" with events (from sdlkeys)
- maybe use midi instead?
  => ok, midi it is (via snd-virmidi and open(2))
- threads? SDL_CreateThread() SDL_WaitThread()

Fri Jun 23 13:58:10 CEST 2017

- midi input works, trying to figure out how to send velocity via ds10 event
- apparently, there's one additional block I haven't considered: the mixer
  - it gets driven by the sequencer
  - it also has inputs from "mixer panel"
- so far the components are:
  - sequencer - don't care
  - synth - memmap mostly done, mapped inputs and outputs
  - drums - TODO (albeit the code is quite short)
  - effects - TODO
  - mixer - TODO
- let's disassemble the drums
- consult the diagram:
  - one block is 64 samples (= 128 bytes = half of dma buffer (double buffering))
    0x2073150 looks like mixer head (64 reps) => in fact mixer and effect code
    0x2072354 looks like sampler head (4*64 reps) => it is
    0x2018054 32*4 reps
- the sampler code is fairly easy, but there is some weird stuff going on there as well
- time to do audioprobe stuff!
- ok, the "weird stuff" is linear interpolation (but it doesn't sound very interpolated)
- question is, how to trigger drums
- well that was easy:
  pitch = 0x8000 * pow(2, (note-60)/12.0);
  MMU_write_mem(0x02249620+0x34, pitch);
  MMU_write_mem(0x02249620+0x38, 0);
- now i've got an idea: take sampler output and use it as waveform in syn1
- well, that's it for today, off to the theater

Tue Oct 3 23:17:25 CEST 2017

- bored, doing some static analysis
- TODO:
  - dump 02248CC0 (used by AMP to calc something)
  - figure out what is the range of "drive"
  - figure out AMP+2c, AMP+30 (some kind of integrator?)
  - need to find address of "knob_value -> amp structure values" routine
    - 02075650, trace_knob_move

AMP:
- amplifier precomputation
  - stored: 28, 2C, 30
  - loaded: 4, 8, 10, 14, 18, 1c, 24, 28

Wed Oct 4 09:40:05 CEST 2017

- trying to get gdb running
  ./desmume-cli --arm9gdb 12345 ds10plus.nds
  arm-none-eabi-gdb
- watch what the envelopes are doing: 0x2248c40
- NOTES:
  SMULL low,high,a,b
  R13=SP
  R14=LR
- how is amp balance computed?
  - the formula gives something like
    left=0.533*(1-bal^2)
    right=0.533*(1-(1-bal)^2)
    which is reasonably good, but i cannot reason the approximation in any way
  - the correct way would probably be [cos(bal*pi/2), sin(bal*pi/2)]
    cos phi ~= 1 - phi^2/2 ...
    the constants are changed so it would work for [0..1] instead of radians and for bal=1 it would be 0
  - still not sure why the 0x4444 scaling factor
- TODO: try to enable extra distortion 02248f4c=1

Fri Jun 8 14:15:06 CEST 2018

- almost a year later, I've got a sudden urge to continue
- the plan is now: pick some minimalistic ARM emulator, dump main memory (from within desmume) and try to run it (and generate some sound)
- the minimalistic emulator in question could be this one: https://code.9front.org/hg/plan9front/file/b974afb1648d/sys/src/cmd/5e/arm.c

Sat Jun 9 08:50:46 CEST 2018

- the emulator from 9front sort-of works, although I needed to add support for some instructions that were tested/used/emitted by plan9 ARM compiler (BLX?)
- I sort-of understood how the mainloop after IRQ handler invocation goes through all synths (2 main, 4 drum synths) and calls their handlers, then goes through mixer, effect, etc.
- plan for today is:
  - compile desmume from git
  - dump ITCM (01FFxxxx range), because it contains buffers and interrupt handler
  - fix loading of binary segments in ds10emu
  - implement DIVCNT IO register in ds10emu (might require io-register handler ... or that trick with handler pagefault)

[...]

- So I did most of that, manually triggering synth1 from ds10emu - and I've got some audio output! The bad news is, the output is severely garbled, the oscillator is not oscillating properly and the filter is filtering out nothing. Envelope works though.
- another try. this time I've got a fault on DIVCNT (hardware division accelerator), which I forgot to implement.
- ok, turns out I had a bug in a playback routine that didn't advance the address when reading the buffer (thus the envelope being ok). the bad news is that the output is even more garbled than before.
- added audio probe (just a trigger: R15==addr => play(R[2])) and oscillator works fine
- filter doesn't, output is severely garbled. my guess is: broken SMULL instruction
- doesn't seem to be the case...
  let's try to do a filter bypass... nope, still shit
  0207526C: LDR R2, [R7, #5C]
            LDR R8, [R7, #34]
            LDR R11, [R7, #38]
- looks like coefficients that were saved previously (R8, R11)
  - they get saved later in highpass/lowpass section
  - R8, R11 are later mostly just clamped values
  - these values look very clamped in output of ds10emu
  - R7 is kept in register pointing to filter section throughout the code
- ok, trying to compare various states to be able to check results
  - filter coefficient address:
    02248EA0+34 = 02248ED4
    02248EA0+38 = 02248ED8
  - ds10emu: 0207526c R8:00babd4b R11:ff793abc
  - 02000000.bin:
    $ hexdump -C 02000000.bin | grep -i "^00248ed0"
    00248ed0 00 00 00 00 4b bd ba 00 bc 3a 79 ff 00 00 00 00 |....K....:y.....|
    ... so matches ds10emu values ...
  - desmume - although the state is the same, it doesn't match. got to debug who changes it
- ok... the savestate is from the middle of rendering stuff, I have to get a better savestate
- again:
  - desmume: 0207526C R8:00EC9573 R11:00E224D1
  - bin: 00248ed0 00 00 00 00 73 95 ec 00 d1 24 e2 00 00 00 00 00 |....s....$......|
  - ds10emu: 0207526c R8:00ec9573 R11:00e224d1
- woot! now I can just do a register diff between desmume and my emu
- perfect, there's a difference in the second computed sample
- hah! fucking arithmetic shift right didn't do sign extension
  E1A0584 MOV R7, R3, LSL #1
- instruction encoding:
  cccc 00Io oooS nnnn dddd ssss ssss ssss
  ____ 0011 101S
- hmm the instruction encoding is less obvious than I thought
  either it's immediate (1<<25) or it isn't
  0 0 0 0 0 0 0 0 Rm   register
  i i i i i 0 0 0 Rm   LSL by immediate
  r r r r 0 0 0 1 Rm   LSL by register
  i i i i i 0 1 0 Rm   LSR by immediate
  r r r r 0 0 1 1 Rm   LSR by register
  i i i i i 1 0 0 Rm   ASR by immediate
  r r r r 0 1 0 1 Rm   ASR by register
  i i i i i 1 1 0 Rm   ROR by immediate, RRX if i=0
  r r r r 0 1 1 1 Rm   ROR by register
- WORKS! bug was in casting uint32_t to long before shifting (and long is 64bit on arm64, so it wouldn't get shifted right)

[...]
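The ASR bug above is easy to reproduce: converting a uint32_t to a 64-bit long zero-extends it, so the value is never negative and the shift pulls in zeros instead of copies of bit 31. A minimal sketch of a shift that does not depend on the width of long or on how the compiler treats signed shifts; the function name is made up, and it assumes the shift amount has already been decoded (the immediate form's "ASR #0 means ASR #32" rule is left to the decoder).

```c
#include <stdint.h>

/* Arithmetic shift right of a 32-bit register value.
 * Works purely on unsigned types, so the result is the same whether
 * long is 32 or 64 bits wide. */
static uint32_t asr32(uint32_t value, unsigned amount)
{
    /* All ones if bit 31 is set, otherwise all zeros. */
    uint32_t sign = (value & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    if (amount == 0)
        return value;
    if (amount >= 32)
        return sign; /* every result bit is a copy of the sign bit */

    /* Logical shift, then fill the vacated top bits with the sign. */
    return (value >> amount) | (sign << (32u - amount));
}
```

With the R11 value from the trace above, asr32(0xff793abc, 4) keeps the top nibble set, whereas the buggy `(long)value >> 4` on a 64-bit long would clear it.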
- another mystery: I added a synth noteon trigger and now it jumps to address "000000004", which it fetched from stack
  ARM9 02075540 EBFFF299 BL 02071FAC -> R14 should be 02075544
- ok, there's a bug that BX instruction overwrites R14 with PC
- however! if I fix that, the synth stops working
- lol, this was my fuckup, i did implement BLX
- ok, so now the synth triggers on noteon/noteoff
- it would be nice to have knobs so let's work on that now
- some functions from the synth dispatch table:
  02087cec 0207448c
  02087cf0 02074590
  02087cf4 020746e0
  02087cf8 02074720 data render handler
  02087cfc 0207469c
  02087d00 02075930
  02087d04 02075648 knob movement handler
  02087d08 02075928
  02087d0c 02075900
  02087d10 0206ceb4
  02087d14 0207546c noteon event handling (called from data renderer)
  02087d18 02075640
- mixer dispatch table:
  02087c7c 02072ce0
  02087c80 02072d20
  02087c84 02072dac
  02087c88 02072e8c data render handler
  02087c8c 02072d68
  02087c90 02073888
  02087c94 020736c0
  02087c98 02073880
  02087c9c 02073858
  02087ca0 0206ceb4
  02087ca4 020735d0 noteon event handling (called from data renderer)
  02087ca8 020736b8
  02087cac 00000000
  02087cb0 00000000
- drum dispatch table:
  02087bcc 020721f4
  02087bd0 02072228
  02087bd4 02072278
  02087bd8 020722b8 render
  02087bdc 02072264
  02087be0 02072514
  02087be4 020724e0 ?
  02087be8 0206ba5c
  02087bec 02049744
  02087bf0 02072268
  02087bf4 020723dc noteon?
  02087bf8 020724d8
  02087bfc 00000000
  02087c00 00000000

Sun Jun 10 11:09:04 CEST 2018

- let's get the hardware knobs working

[...]

- ok, now I can control DS-10 from either keyboard or volca synth, with notes and knobs working as expected. it can finally be played like a normal synth!
- what does the TODOLIST/roadmap look like now?
  - importing patches from DS-10 save files
  - saving and loading patches in my synth-emulator
  - polyphonic synth!
  - figuring out how drum synth and mixer work (what are the knob-handlers etc.)
  - decompile synthesizer into C, try to map structures and recompile it for a different architecture
  - make a hw DS-10-like controller (something like 60 knobs)
  - try to get the STM32 blue-pill working and outputting audio
- now: let's make the import working
  - the instruments seem to start around 0x38000 in DS-10+ save file
  - the instrument description is a string of the ~30 knob settings
  - in the trace_knob_move.txt trace (which is a trace of a synth-state being loaded) there's a loop that loads each of the knobs via the knob-dispatch routine
  - only a few parameters are "bipolar", in JUNKTT I have "VCO2pitch" negative (id=8), so let's guess:
    00038700 68 e0 a9 5d d9 3c 0c b5 1a f1 96 dd 3c a3 ad 99 |h..].<......<...|
    00038710 4a 55 4e 4b 54 54 00 00 00 00 00 00 00 00 00 00 |JUNKTT..........|
    00038720 00 00 00 03 00 00 01 02 e0 20 00 1b 4f 00 13 09 |......... ..O...|
    00038730 01 00 03 69 46 42 7f 00 2d 00 01 00 00 00 00 00 |...iFB..-.......|
    00038740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
  - ok, starting offset is 0x20
- patch loading working. this is utterly awesome! I can now play all my instruments I made on DS-10! and in hi-fi! (because despite the information that the D/A converter is 16bit, I still think it's just filtered PWM, 12bit tops)
- let's dig a bit into the mixer
  0224a260: 02 fe xx 06 00 00 yy 07 ; volume
    - xx is channel (0..5)
    - yy is volume (00..7f)
          : 02 35 xx 06 02 35 yy 0a ; panning
    - yy is pan (00..7f)
          : 00 00 xx 06 00 00 yy 07 ; mute channel? oh! volume again...
- trap: who writes into mixer: ... mostly the "event handler" code that gets called by the mixer synth routine
- note: current emulator eats about 35% on a single voice, 25% with -O2
- maybe a good idea would be to do a trace of event writes on all synths:
  1. find out event area syn1
  2.
     trace writes to all event areas

  022489A0 syn1
           +8  number of pending events (byte)
           +80 event area
  02248FE0 syn2
  02249620 drum1
  02249900 drum2
  02249BE0 drum3
  02249EC0 drum4
  0224A1A0 mixer
           +8  number of pending events
           +c0 event area
  0224A480 effects - no event area? probably merged with mixer

  01ff87e0 022489a0 02248fe0 02249620 02249900
  01ff87f0 02249be0 02249ec0 00000000 00000000

  * DRUM1 trigger:
    ARM9: cond=evqueue addr=0224a224 val32=00000004 lastipc=00004047
    ^ this sets effect and is not a trigger
    ARM9: cond=evqueue addr=022496a0 val32=00000202 lastipc=00004047
    ARM9: cond=evqueue addr=022496a4 val32=02247f3c lastipc=00004047
    ARM9: cond=evcount addr=02249628 val8=00000001 lastipc=00004047

Mon Jun 11 13:06:28 CEST 2018

- today I'm continuing with the event tracing
- as a part of ongoing experiments, I implemented a polyphonic DS-10 synth by just creating three separate arm-emulator contexts and distributing incoming notes amongst them (knobs distributing to all of them). and it works surprisingly well. however it's too slow to run more than three voices, so I should probably work on the decompilation part more
- drum synth works (when mapped to keyboard)
- even loading custom samples to drum synth works now! woot!

Tue Jun 12 09:20:57 CEST 2018

- trying retdec decompiler: https://retdec.com/
- the ds memory image files should probably have gone through some preprocessing that would leave only relevant code, because retdec chokes on the 4MB image
- idea: control flow extraction, and registers annotated with types with type induction

A tool that I could really use right now is something that would extract code chunks from the main image by walking down functions (with hints as to where BX goes and how big the jumptables are). I could then feed the code to either some kind of "reassembler" (that would relocate the located code) or another tool (like the retdec decompiler).
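A rough sketch of what "walking down functions" could look like for the easy cases, assuming plain ARM (not Thumb) code. It only understands B/BL and treats an unconditional BX as the end of a path; BLX, LDM-with-PC returns and ADDLS jumptables are not handled and would need the hints mentioned above. All function names are hypothetical. The target computation can be checked against the `02075540 EBFFF299 BL 02071FAC` line earlier in this log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* B/BL: bits 27..25 == 101. cond == 1111 (BLX immediate) is excluded. */
static bool arm_is_branch(uint32_t insn)
{
    return (insn & 0x0E000000u) == 0x0A000000u && (insn >> 28) != 0xFu;
}

/* Target = address + 8 (pipeline) + sign-extended 24-bit offset * 4. */
static uint32_t arm_branch_target(uint32_t addr, uint32_t insn)
{
    uint32_t off = insn & 0x00FFFFFFu;
    if (off & 0x00800000u)
        off |= 0xFF000000u; /* sign-extend 24 -> 32 bits */
    return addr + 8u + (off << 2);
}

/* BX Rm is a computed jump; a real tool would consult a hintfile here. */
static bool arm_is_bx(uint32_t insn)
{
    return (insn & 0x0FFFFFF0u) == 0x012FFF10u;
}

/* Queue addr for decoding if it is aligned, inside the image and new. */
static void visit(uint32_t addr, uint32_t base, size_t nwords,
                  uint8_t *seen, uint32_t *stack, size_t *sp)
{
    if (addr < base || (addr & 3u))
        return;
    size_t idx = (addr - base) / 4u;
    if (idx >= nwords || seen[idx])
        return;
    seen[idx] = 1;
    stack[(*sp)++] = addr;
}

/* Marks every instruction reachable from entry in seen[] and returns how
 * many were found. seen[] must be zeroed; stack[] needs nwords entries. */
size_t walk_code(const uint32_t *code, size_t nwords, uint32_t base,
                 uint32_t entry, uint8_t *seen, uint32_t *stack)
{
    size_t sp = 0, count = 0;

    visit(entry, base, nwords, seen, stack, &sp);
    while (sp > 0) {
        uint32_t addr = stack[--sp];
        uint32_t insn = code[(addr - base) / 4u];
        bool always = (insn >> 28) == 0xEu;

        count++;
        if (arm_is_branch(insn)) {
            bool link = (insn & 0x01000000u) != 0;
            visit(arm_branch_target(addr, insn), base, nwords, seen, stack, &sp);
            if (link || !always) /* BL returns; a conditional B may fall through */
                visit(addr + 4u, base, nwords, seen, stack, &sp);
        } else if (arm_is_bx(insn) && always) {
            /* computed jump: this path ends here without a hint */
        } else {
            visit(addr + 4u, base, nwords, seen, stack, &sp);
        }
    }
    return count;
}
```

The seen[] bitmap then gives the memory ranges to extract, which can be cross-checked against a dynamic trace.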
The simplest way to implement the above would be just to make a tool that understands jump instructions and descends the code recursively. Then I could just extract the memory ranges by hand and check correctness with dynamic execution.

The problem with reassembly is that it would need to modify the data, because function pointers are stored in synth data structures (and in method tables).

This tool would also allow me, instead of decompiling the code, to recompile the code to C (direct instr-to-C translation with indirect jumps solved by a static address->function translation table).

[...]

- Attempt to make effects work didn't work very well. I need to read some tracedumps to understand better how it works.

[...]

I have a plan how to do the decompiler, but not the strength to implement it now, so I'll just sketch it here for later.

1. take instruction parser from libdisarm, use it to determine whether instruction is jump, data move, return, etc.
2. construct a call-graph incrementally: put start node into queue, then repeat:
   - take node from queue
   - decode instruction at that address
   - if it's a conditional jump, make two edges (to next instruction, to destination)
   - if it's a jump, make an edge to destination
   - otherwise, make an edge to next instruction
   - if it's a computed jump (BX), take hints from "hintfile" (generated by hand or by live tracing)
   - if it's an instruction manipulating R15, do some kind of special-case handling:
     - return (LDMIA) - mark as return
     - jumptable (ADDLS) - take hint from hintfile about how big that jumptable is
     - literal (LD x, [R15]) - just assume the literal is constant and fetch it from memory
3. this will construct a graph (use array of addresses to store node pointers). walk it to check it's a sane graph: no jumps between different functions, etc.
4. decompile each function into one C function with variables like uint32_t R2, R3...;, expand each instruction, put labels in places where edges arrive and just goto there. do memory fetches and stores on array.
   This is of course more complicated than it sounds:
   - parameters need to be handled (this could be work for hintfile)
   - in/out/gen/kill sets need to be computed to see what gets passed where
   - R14 needs to be managed to ensure we return at correct places
   - lots of small stuff
5. profit

Wed Jun 13 11:16:26 CEST 2018

- I couldn't resist and wrote some rudimentary disassembler/control flow extractor according to the plan outlined yesterday.
- My current idea is not to try to guess much information, but instead to translate each instruction one-to-one into a structure like this:

  regjump:
  switch(R15) {
      /* labels generated twice: once for register jumps (the switch),
         once for direct jumps (the label) */
      case 0x0204720:
      lab0204720:
          /* one stack array, maybe through some macro to check for overflows */
          STACK[--R13] = R3;
          STACK[--R13] = R4;
          /* operation on registers translated directly */
          R5 = R2 + R1;
          R6 = R7 * R9;
      lab020472c:
          /* flags generated on demand (set solver, see below) */
          C = R2 > R1;
          /* conditioned instruction with if prefix */
          if (C) R5 = 4;
          if (C) goto lab0204720;
          /* "function call" */
          R14 = 0x0204738;
          goto lab020490c;
      case 0x0204738:
      lab0204738:
          /* return from call */
          R14 = STACK[R13++];
          /* the decoder knows its R15, so it's not required to track that. */
          /* it would be nice to decode literals though */
          R15 = R14;
          goto regjump;
  }

The "when should flags be generated" problem could be solved by in/out/used/gen/kill sets, which could be done by a bitmap solver. Although, maybe the first version could do away with in/out/used... sets and just generate them everywhere. The same for stack - let's just make it an ordinary memory accessed with "vaddr".

Thu Jun 14 23:40:09 CEST 2018

- working on implementation of alu function generator for arm->c translator

Fri Jun 15 13:23:31 CEST 2018

- implementing load/store instructions
- more ideas:
  - track "use" of cmp"xx" predicates, so we can emit them instead of flags
  - merge if(..){ } blocks

[...]
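One common way to build the "bitmap solver" mentioned above is classic backward liveness analysis, with one bit per register or flag. The sketch below is that textbook scheme, not necessarily what the actual generator does; the struct layout, the bit assignment (bits 0..15 for R0..R15, bit 16 for the carry flag) and the two-successor limit are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT_REG(n)  (1u << (n)) /* R0..R15 live in bits 0..15 */
#define BIT_FLAG_C  (1u << 16)  /* hypothetical bit for the carry flag */
#define MAX_SUCC    2

/* One instruction node of the control-flow graph. */
struct node {
    uint32_t gen;            /* bits read before being written here */
    uint32_t kill;           /* bits written here */
    int      succ[MAX_SUCC]; /* successor indices, -1 if unused */
    uint32_t in, out;        /* solver results, start at 0 */
};

/* Iterate in = gen | (out & ~kill), out = union of in(succ)
 * until nothing changes (a fixed point). */
static void solve_liveness(struct node *n, size_t count)
{
    int changed = 1;

    while (changed) {
        changed = 0;
        /* Walking backwards converges faster for mostly linear code. */
        for (size_t i = count; i-- > 0; ) {
            uint32_t out = 0;
            for (int s = 0; s < MAX_SUCC; s++)
                if (n[i].succ[s] >= 0)
                    out |= n[n[i].succ[s]].in;

            uint32_t in = n[i].gen | (out & ~n[i].kill);
            if (in != n[i].in || out != n[i].out) {
                n[i].in = in;
                n[i].out = out;
                changed = 1;
            }
        }
    }
}
```

After solving, a flag-setting instruction only needs to emit `C = ...` when its own flag bit is set in `out`; otherwise nobody downstream reads it and the assignment can be dropped.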
- so the code generator is outputting compilable code, I linked it to the rest of the emulator AND IT RUNS! it doesn't produce any sound, but it correctly returns to the place it was called from, so that's something
- smull was mis-implemented (no sign extension for the second operand)
- WOW! it plays something very similar to what it is actually supposed to! it's a bit garbled, but who cares. this is the code in question

[...]

- ok, output is garbled. register trace compare matches. what to do? maybe store is buggy?
  ARM9 02075410 E0C270B2 STRH R7, [R2], #2
  ... yeah, that one was mis-implemented, because I borked post-increment (and implemented it as pre-increment).

Sound synthesis (on synth1) now works decompiled!

What's the plan now?
- implement multiple input points to control-flow graph
- try to decompile "knob", "drum" and "mixer"
  => I can't do this now, I don't have knobs analyzed by hand
- draw them with graphviz
- implement subroutine checking
  - subroutine is something called with link
  - each subroutine has one entrypoint
- maybe try parameter detection and breaking the code into subroutines
- gen/kill/used sets computation
- implement value type and structures inverting

Decompiling knob-setting-dispatch function. Looks like each synth component has its own sub-dispatch function (i.e. filter). I need to rewrite the code-ordering function.

Latest decompilation attempt

Sun Jun 17 12:58:26 CEST 2018

- trying to fix the bitmap solver
- fixed - now flags are output depending on what's used later in the code
  - flags are output only when required (thanks to bitmap solver)
  - instructions with the same if() condition are merged together
- now trying to replace "cmp ...; if (flags)" with the condition itself
- works! at least in most cases
- this is probably the furthest I can go without rewriting the control-flow graph handling and manipulation. i would like to remove all branches (why have a "b xxx" node? why not branch with the graph itself?)
  and do some major overhaul to the decoder, because it's really a pain in the ass to use.
- although maybe I can implement a subroutine chopper before I do that.
- I tried to test mumu (which is a kind of "shell" that manages sound rendering, midi io etc.) with the latest code generated and the third noteon almost always faults the whole emulator, so I'll probably need to fix that first. First thing after I get back from vacation.

[...]

- I fixed a bug in code generator ... it seems that the code generator works only by chance. I desperately need to rewrite it. Not necessarily change the whole concept, but just make sure the design makes sense.

Thu Jun 21 12:45:57 CEST 2018

- back
- trying to fix the c-code generator

[...]

- OK, I just rewrote the main decoding loop and I'm not sure it's an improvement. The only thing that's better is the order of the output code (it's sorted by function address and grouped by subroutine). The rest is... well not nice. The "link next" disappeared, but it was replaced by some spaghetti logic. The problem is that the goto/label/subroutine logic was mixed with if()/cc/conditional logic, and that was not a good idea. I need to find a way to separate those two.
- In other news: I replaced the "vaddr" with direct memory access and it's twice as fast.
- hmm i forgot the vaddr also handles memory-mapped IO devices like DIV and SQRT accelerators... so it's faster, but it doesn't sound right

Fri Jun 22 17:55:10 CEST 2018

- after yesterday's debacle I decided I need to stop dicking around and just get the emulator to work well enough so I can sequence it from ableton
- I have two ways: either make it into a VST, or make it run on my small laptop and sequence it via MIDI (would require an SDL gui for configuring the synth)
- first I'm gonna try the VST path. I found a nice tutorial (Martin Finke's "Making audio plugins"), downloaded Visual Studio and I'm now trying to put it all together

[...]
- so I used "SpaceBass" example as the basis for my synth, polyphony is working, one knob is working. what I need to do now is:
  - the filter is stepping. why is the filter stepping? how is it done on real ds-10? I don't remember filter stepping. todo: dump all events from knob when either turning it manually or sequencing from kaoss x/y
  - design a gui with all the knobs
  - design "enum" knobs for all modulation possibilities
  - design background and knobs that are similar to ds-10
  - implement all knob events
  - implement polyphony switch
- design
  - need to allocate space for keyboard - how big is it?
  - need to test knobs
  - need to rip those knobs & switches
  - switch: 38x26
  - switch hole: 58
  - offset: -36
- knobs and switches working, more tomorrow

Sat Jun 23 08:03:38 CEST 2018

- got up early, working on vst
- oh crap, all globals are shared between instances! @(#&@#*&
- need to rewrite this... :(
- TODO:
  * implement event handling
  * implement polyphony changing
  * implement mono voice stealing (priority=last)
  * get rid of all globals
  * implement resampler from 32KHz to 44KHz
  * implement volume (velocity) on noteon
  * optional "resampler"
  * implement presets
  * figure out how to get my old patches here
  * embed ds-10 image into dll
  * rename it
  * clean it, release it, release source
- well, it's done!
- goodbye

Sun Jun 24 21:29:39 CEST 2018

- I added resample feature
- fixed velocity curve
- posted to r/synthesizers just for kicks

TODO:
- decompiler
  - implement separate hintfile
  - how to write a bitmap solver (draw graph and equations for each node)
- mumu
  - implement midi note priority logic
  - fix keyjazz (doesn't do noteoff)
  - fix "Data" data structure
  - make a SDL gui with knobs and keyjazz piano (with keyup/down)
  - make a VST out of it (I have this ANSI C code - how do I make a VST with knobs)
- what is stored in patch saves other than the synth configuration?
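For the "midi note priority logic" and "mono voice stealing (priority=last)" items, one possible shape is a small list of held keys where the most recently pressed one always sounds. This is only a sketch of that generic last-note-priority scheme; the struct, the function names and the 16-key limit are invented here and are not taken from mumu or the VST.

```c
#include <stdint.h>

#define MAX_HELD 16

/* Keys currently held down, oldest first. Zero-initialise before use. */
struct mono_voice {
    uint8_t held[MAX_HELD];
    int     count;
};

/* Remove note from the held list if it is there. */
static void mono_remove(struct mono_voice *v, uint8_t note)
{
    int w = 0;
    for (int r = 0; r < v->count; r++)
        if (v->held[r] != note)
            v->held[w++] = v->held[r];
    v->count = w;
}

/* Key pressed: returns the note that should sound now. */
static int mono_note_on(struct mono_voice *v, uint8_t note)
{
    mono_remove(v, note);             /* avoid duplicates */
    if (v->count == MAX_HELD)
        mono_remove(v, v->held[0]);   /* list full: forget the oldest key */
    v->held[v->count++] = note;
    return note;
}

/* Key released: returns the note to fall back to, or -1 for silence. */
static int mono_note_off(struct mono_voice *v, uint8_t note)
{
    mono_remove(v, note);
    return v->count ? v->held[v->count - 1] : -1;
}
```

A caller would translate the returned note into a DS-10 noteon event, and a -1 into a noteoff, which would also cover the "keyjazz doesn't do noteoff" item.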