DS-10 Reverse Engineering project

Project goals

  1. Get DS-10 working in DeSmuME (done)
  2. MIDI-in for DS-10 (done)
    1. Figure out note encoding (done)
    2. Figure out "knob" encoding (how to transform from linear to logarithmic etc) (done)
  3. Rip out synth engine and make standalone, polyphonic synth program
  4. Make real synth with stm32f103c8t6 or similar

Q & A

DS-10 architecture and notes

ARM7 - runs generic

Fifo commands:

Project Log

Sat Jun 10 10:32:33 CEST 2017
On Thursday I managed to run DS-10 in DeSmuME without tearing in the sound.
The problem was that some instruction (or maybe memory type) is assigned the
wrong number of access cycles. As a result, the emulated processor isn't fast
enough to compute one sound "frame" (64*2 samples) within the assigned
timeframe. When I set the timing of every instruction to 1 cycle, it works
flawlessly (I need to fix it in the JIT as well).

I was generating some callgraphs to understand where each synth/drum is
computed: ds10 call graph.
And you can kinda see it from there: one frame is 64 samples, so if a node
has 256 ins and 252 outs one way and 4 another, it's probably a 64-sample for
loop done 4 times, etc.

One other nice thing: I trapped various instructions to send register contents
to the soundcard - it's like probing a circuit with an oscilloscope. It would
be nice if it could be done at run-time.

Plan for today:
- add support for subroutines to callgraph generator (or: buy IDA)
- print full register contents next to each instruction
	- src/cli/main.cpp: DIS_AUDIO
	- src/armcpu.cpp: x_armlog function
- get gdb working with desmume

OK, so here's another graph I made when fixing bug with "overmerging" of
nodes: graph.

70474bb824b56912cc419221aa8ae015  ds10plus.nds

Sat Jun 10 12:52:40 CEST 2017
Well, I probably shouldn't spend so much time on making tools to answer
questions I don't have. But it makes nice graphs.

I made another one and I swear this is the last one:
now with more colors!
Also script for analysis and the trace.

Misc notes:

02075648	parameter set procedure
	argument R0=synth pointer
	argument r1=knob id
	argument r2=knob value
	load filter pointer
0207E950	pitchtable
0207ED38	drumpitch table
02071fac	set note (args: r0: pitch control, r1: note, r2: 0)
020722B8	drum handler method
02073150        mixer head?
02074720	synth handler method
02074F98	Head of synthesis loop
02075410	Store output of one synth
0207546C	event dispatch function 
	- r0 = synth pointer
	- load event count byte from r0+#8
	- load events from r0+#80 (each event 8 bytes long)
	if (event == 1) {
		env = synth->envelope;
		STRB #3, [env, #4]
		STRB #1, [env, #5]
	}
	if (event == 2) {
		note = keyjazz[4];
		/* fill frequency table */
		synth->pitch->note = note << 15;
		STRB #0, [env,4]
		STRB #0, [env,5]
	}
02075650	knob dispatch table
020758ec	drive knob handler
02075ad4	recompute LFO from BPM (guess)
020807c4	filter coefficient table (151 entries)
02087CEC	synth method table (1st synth)
02087c7c	mixer method table?
01FF846C	load next "machine" dispatcher and call it
01FF8444	load next synth ptr data
	
01FF86A0	synth iteration table
01FF87E0	synths table [list of all synth pointer], null terminated
01FF8820        buf1 (syn1)
01FF88A0        buf2 (syn2)
01FF8920        buf3 (drum1)
01FF89A0        buf4
01FF8A20        buf5
01FF8AA0        buf6
021b04a0	precomputed sample 1
0223D280	stored synth preset data
0223d2e0	another preset data? (maybe?)
02248680	(output) soundbuf
022489A0	Synth 1 Data (struct SYN)
02248A20	keyjazz area (SYN1)
02248c40	env/SYN1
02248CC0	LFO/SYN1
02248d60	pitchctrl/SYN1
02248d20	notectrl/SYN1
02248de0	OSC/SYN1
02248ea0	FILTER/SYN1
02248f40	AMP/SYN1
02248fa0	???/SYN1
02248FE0	Synth 2 Data	
02249620	drum1
022498C0	drum1 volblock
02249900	drum2
02249BE0	drum3
02249EC0	drum4
0224A1A0	MIXER
0224A480	effects
0226A580	delay buffer

knobs:	name		id	min	max
        octave          0       -2      2
        egint           1       -63     63
        porta           2       0       127
        pitch.in        3       -63     63
        pitch1.in       4       -63     63
        pitch2.in       5       -63     63
        vco1            6       0       3
        vco2            7       0       3
        vco2pitch       8       -63     63
        balance         9       0       127
        vcosync         10      0       1
        cutoff          11      0       127
        peak            12      0       127
        vcftype         13      0       2       [0=lpf]
        egint           14      -63     63
        cutoff.in       15      -63     63
        vcaeg           16      0       1
        vca.in          17      -63     63
        drive           18      0       127
        level           19      0       127
        attack          20      0       127
        decay           21      0       127
        sustain         22      0       127
        release         23      0       127
        mg.freq         24      0       127
        mg.bpm          25      0       1       [0=off]
inputs
        pitch.in.hole   26
        pitch1.in.hole  27
        pitch2.in.hole  28
        cutoff.in.hole  29
        vca.in.hole     30
outputs
        none            0
        sine            1
        saw             2
        square          3
        s&h             4
        eg              5
        vco2            6
	

DRUM
	+10	how many samples to render (u16)
	+30	pointer to volblock
	+34	play speed
	+38	offset
	+40	sample length (=0x8000)
	+44	sample buffer address
	
DRUMvolblock (022498C0)
	+14	volume

synth iteration table (01FF86A0)
	+0	synth table pointer (01FF87E0)
	+14	number of synths
	+28	synth handler table

synth generic table
	+0	synth method table

synth method table
	+c	synth render handler
	+18	knob movement handler
	+28	noteon event handling

SYN
	+28	function for handling events? [=> 0207546C]
	+2C	?? stored by synth event handler (event 4)
	+30	pointer to envelope
	+34	pointer to LFO
	+3C	pointer to pitch control
	+40	pointer to struct OSC
	+44	pointer to struct filter?
	+48	pointer to AMP
	+320	seems to be lfo data
	+2A8	envelope value
	+84	frequency!? could be pointer to memory, it's chromatic
	+8	trigger? (number of events in event area)
	+4	keyjazz pointer (pointer to list of events)
	+0	function method table?
	+52	used in noteon_to_env
	+50	1 if first bar of pattern repeat
ENV
	+0	envelope value
	+40	attack
	+e0	portamento
PITCHCTRL
	+18	NOTECTRL
	+38	period for osc1? (from code)
	+3c	period for osc2? (from code)
NOTECTRL
	+4	oldpitch
	+0	newpitch (note << 15)
	+C	inverse portamento?
	+10	direction (0 down, 1 up)
OSC
	+2C	VCO1 type (0..3)
	+30	VCO2 type (0..3)
	+34	OSC balance (midi)
	+38	VCO1 volume (15bit)
	+3C	VCO2 volume (15bit)
	+20,28,64,6c	vco2 pitch
	44,
	---
	+24	[speculation] 
	+1c	input to divisor
	+c	jumptable on this
	+48	enum [0-3]
LFO
	+28	BPM
FILTER (OSC+C0)
	+1C		VCF cutoff
	+20,24		VCF peak (precomputed?)
	+28		cutoff precomputed coefficient 1
	+40		filter type (0=lowpass,1=bandpass,2=highpass)
	---
	+24 		coef1
	+34		highpass output
	+38		lowpass output
AMP (OSC+160)
	+4		?? [values 0..3]
	+8		which envelope segment are we using [env+[AMP+8]*4]
	+C		"extra distortion" switch
	+10		pointer to env
	+14		?? [pointer to some table offseted by [AMP+4]*4]
	+18		VCA LEVEL, knob*0x20408
	+24		VCA EG
	+2C		[speculation] current amp envelope value
	+30		[speculation] envelope increment
	+34		over-driver output from amp first stage
	+38		drive coeff 1, 15bit, unsigned ((knob*0x102)^2>>15)
	+3C		drive coeff 2, 15bit, unsigned (0x7fff-knob*0xe8)
			this value is actually wet/dry knob (0x7fff is dry)
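
A minimal sketch of the knob-to-value formulas noted in the AMP block, handy for checking ranges. The function names are made up for illustration, `^2` in the notes is assumed to mean squaring (not XOR), and the knob is assumed to be the raw 0..127 value.

```c
#include <stdint.h>

/* AMP+18: VCA level = knob * 0x20408 (127 -> 0xFFFFF8, just under 2^24) */
static uint32_t amp_level(uint32_t knob) { return knob * 0x20408u; }

/* AMP+38: drive coeff 1 = (knob*0x102)^2 >> 15, "^2" assumed to be squaring */
static uint32_t amp_drive1(uint32_t knob)
{
    uint32_t x = knob * 0x102u;   /* 127 -> 0x7FFE */
    return (x * x) >> 15;         /* no overflow: 0x7FFE^2 < 2^30 */
}

/* AMP+3C: drive coeff 2 = 0x7fff - knob*0xe8 (0x7fff means fully dry) */
static uint32_t amp_drive2(uint32_t knob) { return 0x7FFFu - knob * 0xE8u; }
```

With the knob at 127 these give 0xFFFFF8, 0x7FFC and 0x0CE7, so both drive coefficients stay within 15 bits as the notes say.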

MIXER 	0224A1A0
	+8		number of events
	+30		used by mixer, volume of sorts, (?)
	+34	H?	dry/wet knob, same as effect
	+50+4*chan	mixvol/R (L-R)
	+68+4*chan	mixvol/L
	+80		pointer to effect data
	+84		effect enabled on [0..syn1, 1..syn2, 2..syn1+syn2, 3..drums, 4..all, 5..off]
	+88		used by effect, halfword (?)
	+C0		event queue

EFFECT	0224a480
	+C	H	effect type [0 delay, 1 chorus, 2 flanger]
	+E	H	Delay sync [0 off]
	+10	H	Delay time
	+12	HS	Delay L/R ratio [ffc1..003f]
	+14	H	Flanger+Chorus LFO
	+16	H	Flanger+Chorus Depth
	+18	HS	Delay+Flanger Feedback [ffc1..003f]
	+1A	H	dry/wet [0000..007f]

0224a480: 0228a5a0 0228a5e0 0228a620 00010000
0224a490: 000d0003 003e0017 00110010 00000001
	

SYNTH EVENTS
	__ __ __ 01  __ __ __ __  02075524	noteoff
	__ __ __ 02  __ __ __ nn  02075534	noteon
	__ __ __ 04  bb bb bb bb  02075588	set bpm
	__ __ __ 05  __ __ __ __  020755a0	suppress lfo reset? force lfo reset?
	__ __ __ 18  __ __ __ __  020755b8	reset envelope, lfo?
	__ __ __ 1c  __ __ __ __  020755e4	reset bpm-synced lfo if first bar
	__ __ __ 1d  __ __ __ __  020755d4	set "first bar repetition" flag
	
	- legato is done by changing the note without stopping (02 02 01)
	- bar start: 1c
	- pattern change: 1d
	- play: 04 18
	- stop: 19 (abrupt stop: 05 19? only when song not playing)
	- abrupt pattern change: 05 (usually 05 05)
	- when shaping a drum: 1e (what is that?)
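
The synth event table above can be turned into a small decoder for reading traces. This is only a sketch: the function names are hypothetical, the event type is assumed to be the low byte of the first word and the note the low byte of the second word (as the table layout suggests), and types without a known handler (19, 1e) are reported as unknown.

```c
#include <stdint.h>

/* Event type byte -> description, taken from the SYNTH EVENTS table. */
static const char *synth_event_name(uint32_t word0)
{
    switch (word0 & 0xFFu) {          /* type assumed to be the low byte */
    case 0x01: return "noteoff";
    case 0x02: return "noteon";
    case 0x04: return "set bpm";
    case 0x05: return "lfo reset (suppress or force?)";
    case 0x18: return "reset envelope/lfo";
    case 0x1c: return "bar start";
    case 0x1d: return "pattern change";
    default:   return "?";
    }
}

/* For a noteon, the note is assumed to be the low byte of the second word. */
static uint8_t synth_event_note(uint32_t word1)
{
    return (uint8_t)(word1 & 0xFFu);
}
```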

MIXER EVENTS
	__ __ cc 06  __ __ vv 07	; set volume (0..7f) on channel
	__ __ cc 06  __ __ pp 0a	; set pan (0..7f) on channel
	
	- mixer also gets all the song control events (5, 18, 19, 1c, 1d)

DRUM EVENTS
	__ __ __ 01  __ __ __ __  02075524	noteoff
	__ __ __ 02  __ __ __ nn  02075534	noteon

	- the sequencer sets a noteoff before each noteon and optionally a
	  noteoff after some time (if it's not a one-shot)
	- also accepts all song control events (TODO: which ones?)
	- pan/volume events are set to mixer
	

Code seems to be divided into sections (first OSC, then FILTER, AMP...).
Envelopes seem to be computed just once per frame (they are not per-sample).

MCR p15	0	Rd	c7	c10	2	Clean D$ Line by Set/Way
MCR p15	0	Rd	c7	c10	4	Data Synchronization Barrier


Synth pointer:
--- DUMP AT 022489a0 ---
0000: 02087cec 02248a20 00004000 00000201 00000040 00007fd8 00008000 01ff8b40
0020: 00000040 00007fd8 00000002 00000075 02248c40 02248cc0 02248d20 02248d60
0040: 02248de0 02248ea0 02248f40 02248fa0 00000000 00000000 00007fff 00007fff
0060: 02248c20 02248980 00000220 00007fff 00000001 00000000 00000000 00000000

Things that need to be found:
- where does R4/SYN pointer come from
- what's the encoding for "frequency"
- dump structures in sampler
	=> routine which renders "samples" is short and effective (0x02072354)
	=> it does "drop sample" interpolation

That's it for today.

Sat Jun 10 20:22:20 CEST 2017
I'm trying to figure out how to trigger sample from keyboard. Memory write to
envelope register isn't enough, some code probably needs to be executed as
well to initialize the counters.

02248c44	00000000 attack
		00000001 sustain?
		00000103 release

Mon Jun 12 09:00:38 CEST 2017
Silly me - I know what I did wrong. I was writing "sustain" instead of
"attack" into the "envelope state" register. Now I can trigger the
last note from keyboard.  Although just 4/5 of key-presses go through...

I need to speed up debugging, writing breakpoints into emulator source just
doesn't cut it anymore.

MCR p15, 0, Rd, c7, c0, 4	Wait for interrupt

Trying to figure out what triggers the envelope:
IPC7 time=1168279224 send FIFO 0xC02400C6 size 000 (l 0x8501, tail 10) (r 0x8501, tail 13)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000001 lastipc=c02400c6
IPC9 time=1168298016 send FIFO 0x8226CA07 size 000 (l 0x8501, tail 13) (r 0x8501, tail 11)
irq proc=1 num=18
ARM9: trigger addr=022489a8 val=00000002 lastipc=c02400c6
ARM9: trigger addr=022489a8 val=00000003 lastipc=c02400c6
irq proc=1 num=4
irq proc=1 num=4
IPC7 time=1168377409 send FIFO 0x00004047 size 000 (l 0x8501, tail 11) (r 0x8501, tail 14)
irq proc=0 num=18
ARM9: trigger addr=022489a8 val=00000004 lastipc=00004047

Wed Jun 14 14:44:28 CEST 2017

I wonder how hard it must have been to reverse engineer something using just
gdb or a similar command-line debugger.
Anyway, still working on task 2 - I need to figure out how to trigger note-on
manually.

There are two suspicious places:
1. "keyjazz" area
	02248a20: 02070001 02077eff 	// note off
	02248a20: 00000002 02247f35 	// note on
2. envelope mem
	02248c40: 02248c80 00000001

Let's do the keyjazz area first
- the lowest byte goes from 29 (low F) to 40 (high E), FF when no note is
  pressed [this is midi note]
- it seems to be a pointer to memory (which itself is an array of pointers)
- it is written periodically (when in keyjazz mode) or every song-tick (when
  in song-play mode)
- the keyjazz is read from 01FF9EC8 and 01FF9ECC (seriously, what is that
  memory? is it ipc? print memory map!)
	- oh shit, that's stack! that explains a lot actually

ARM9:
	<02 00 0000	ITCM, mod 0x8000
	 04 __ ____	IO
	 02 __ ____	

- so I'm plotting more graphs and they are a thing of beauty...
  (the picture illustrates what a dispatcher looks like)
- I need to go over 4007 sound generation again to find out how these
  values are used
- the note (lowest byte) is read from 0230ADA9

0207025C	procedure, R2 => pointer to <event,keypress> pair

- ok, so this is solved: 0230ADA4
	[-- 01 00 -- | ff -- -- -- ] 	noteoff
	[-- 02 00 -- | ff 35 -- -- ] 	noteon
- so for instance, this plays a note:
	MMU_write32(0, 0x0230ADA4, 0x00000200);
	MMU_write32(0, 0x0230ADA8, 0x000035ff);
- now let's figure out how it gets interpreted by the synth code (aka
  who reads 02248a20)
- oh, now I know: it's not a "keyjazz" area, it's a list of events! and the
  trigger is "number of events pending"
- pitchctrl area:
02248d20: 001a0000 001a0000 00000000 00007fff 00000001 00000000 00000000 00000000
02248d40: 02248dc0 02248d00 00000080 00000000 00000000 00000000 00000000 00000000
02248d60: 02087bc0 00000000 00000000 00000000 02248c40 02248cc0 02248d20 00000000
02248d80: 00000040 00000040 00000040 00000040 00000000 00000059 014a07ca 01ee8052
02248da0: 00000004 00000004 00000004 00000002 00000002 00000002 00000000 00000000
02248dc0: 02248e80 02248d40 000000c0 00000000 00000000 00000000 00000000 00000000
- ok, this is it for today

Wed Jun 21 17:36:30 CEST 2017
Just a quick one for today: I'm looking for codepath that is responsible for
decoding touch operations into knob value.
Address of "PEAK" for SYN1: 02248ec0
Yeah, there's a parameter set procedure: 02075648 - seems like a big switch{}

Thu Jun 22 10:25:56 CEST 2017
I had a few ideas yesterday that I would like to implement today and tomorrow:
- watchpoints in emulator
	- to see what knobs are set and when
		=> ok, added knob table
- midi input to emulator: another thread that reads events from socket
	- feed it from sdlkeys
	- add support for knobs sdlkeys
		- fonts
		- drawing circles
		- same gui algorithm "midiseq" had
- reading keys on console in desmume:
	- something to enable/disable tracing into file
	- viewing memory
	- setting watchpoints etc.
- try
	- newer desmume
	- lua
	- gdb

Ok, so basic keyjazz is working:
	noteon:
                MMU_write_mem(0x02248a20, 0x00000002);
                MMU_write_mem(0x02248a24, 0x02247f39);
                MMU_write_mem(0x022489a8, 1);
	noteoff:
                MMU_write_mem(0x02248a20, 0x02070001);
                MMU_write_mem(0x02248a24, 0x02077eff);
                MMU_write_mem(0x022489a8, 1);
	and disable writes to the event registers.
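
The three writes above can be wrapped into a helper. A minimal sketch, with assumptions: the `write32_fn` hook and all function names are hypothetical (in the emulator the hook would wrap the MMU write), only slot 0 of the event area is used, and the upper bytes seen in the logged values (0x02247f.., 0x0207....) are assumed to be stale and ignored by the synth.

```c
#include <stdint.h>

#define SYN1_BASE   0x022489A0u
#define SYN_EVCOUNT 0x08u        /* SYN+8: number of pending events */
#define SYN_EVENTS  0x80u        /* SYN+80: event area, 8 bytes per event */

/* Hypothetical write hook; in the emulator this would wrap the MMU write. */
typedef void (*write32_fn)(uint32_t addr, uint32_t val);

/* Queue a single event in slot 0 and mark one event as pending. */
static void synth_post_event(write32_fn wr, uint32_t synth,
                             uint8_t type, uint8_t note)
{
    wr(synth + SYN_EVENTS,     type);
    wr(synth + SYN_EVENTS + 4, note);
    wr(synth + SYN_EVCOUNT,    1);
}

static void synth_note_on(write32_fn wr, uint32_t synth, uint8_t note)
{
    synth_post_event(wr, synth, 0x02, note);
}

static void synth_note_off(write32_fn wr, uint32_t synth)
{
    synth_post_event(wr, synth, 0x01, 0xFF);
}

/* Tiny recording backend so the sketch can be tried without an emulator. */
static uint32_t rec_addr[8], rec_val[8];
static int rec_n;

static void rec_write32(uint32_t addr, uint32_t val)
{
    if (rec_n < 8) { rec_addr[rec_n] = addr; rec_val[rec_n] = val; rec_n++; }
}
```

Calling `synth_note_on(rec_write32, SYN1_BASE, 0x39)` records writes to 02248a20, 02248a24 and 022489a8, the same addresses as in the noteon sequence above.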

Now let's implement "midi" via sdlkeys:
- we need some thread to open & read from "socket" with events (from sdlkeys)
	- maybe use midi instead?
		=> ok, midi it is (via snd-virmidi and open(2))
	- threads?
		SDL_CreateThread()
		SDL_WaitThread()

Fri Jun 23 13:58:10 CEST 2017
- midi input works, trying to figure out how to send velocity via ds10 event
- apparently, there's one additional block I haven't considered: the mixer
	- it gets driven by the sequencer
	- it has also inputs from "mixer panel"
- so far the components are:
	- sequencer - don't care
	- synth - memmap mostly done, mapped inputs and outputs
	- drums - TODO (albeit the code is quite short)
	- effects - TODO
	- mixer - TODO
- let's disassemble the drums
	- consult the diagram:
		- one block is 64 samples (= 128 bytes = half of dma buffer (double buffering))
		0x2073150	looks like mixer head (64 reps)
			=> in fact mixer and effect code
		0x2072354	looks like sampler head (4*64 reps)
			=> it is
		0x2018054	32*4 reps
	- the sampler code is fairly easy, but there is some weird stuff going on there as well
		- time to do audioprobe stuff!
		- ok, the "weird stuff" is linear interpolation (but it doesn't sound very interpolated)
		- question is, how to trigger drums
		- well that was easy:
			pitch = 0x8000 * pow(2, (note-60)/12.0);
			MMU_write_mem(0x02249620+0x34, pitch);
			MMU_write_mem(0x02249620+0x38, 0);
		- now i've got an idea: take sampler output and use it as waveform in syn1
- well, that's it for today, off to the theater
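
The drum trigger above computes the play speed with floating-point pow(). Since the STM32F103 from the project goals has no FPU, here is an integer-only variant of the same formula. This is a swapped-in technique, not something found in the ROM: a table of Q15 semitone ratios (precomputed as round(0x8000 * 2^(k/12))) plus an octave shift. The function name is hypothetical and the note is assumed to be a MIDI note in 0..127.

```c
#include <stdint.h>

/* round(0x8000 * 2^(k/12)) for k = 0..11 */
static const uint32_t semitone_q15[12] = {
    32768, 34716, 36781, 38968, 41285, 43740,
    46341, 49097, 52016, 55109, 58386, 61858
};

/* Play speed for DRUM+34: 0x8000 at note 60, doubling every octave. */
static uint32_t drum_play_speed(int note)
{
    int rel = note - 60;
    /* floor division by 12 so the remainder is always 0..11 */
    int octave = rel >= 0 ? rel / 12 : -((-rel + 11) / 12);
    int semi = rel - octave * 12;
    uint32_t base = semitone_q15[semi];
    return octave >= 0 ? base << octave : base >> -octave;
}
```

Results can differ from the pow() version by one unit in the last place because of rounding in the table.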


Tue Oct  3 23:17:25 CEST 2017
- bored, doing some static analysis
- TODO:
	- dump 02248CC0 (used by AMP to calc something)
	- figure out what is the range of "drive"
	- figure out AMP+2c, AMP+30 (some kind of integrator?)
	- need to find address of "knob_value -> amp structure values" routine
		- 02075650, trace_knob_move

AMP:
- amplifier precomputation
	- stored: 28, 2C, 30
	- loaded: 4, 8, 10, 14, 18, 1c, 24, 28

Wed Oct  4 09:40:05 CEST 2017
- trying to get gdb running
	./desmume-cli --arm9gdb 12345 ds10plus.nds
	arm-none-eabi-gdb
- watch what are envelopes doing: 0x2248c40
- NOTES:
	SMULL low,high,a,b
	R13=SP
	R14=LR
- how is amp balance computed?
	- the formula gives something like
		left=0.533*(1-bal^2)
		right=0.533*(1-(1-bal)^2)
	  which is reasonably good, but i cannot reason the approximation in any way
	- the correct way would probably be [cos(bal*pi/2), sin(bal*pi/2)]
		cos phi ~= 1 - phi^2/2
		... the constants are changed so it would work for [0..1] instead of radians and for bal=1 it would be 0
	- still not sure why the 0x4444 scaling factor
- TODO: try to enable extra distortion 02248f4c=1	
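
A small sketch of the balance formula noted above, to make the end points easy to check. Assumptions: `bal` is normalized to [0, 1], the function name is hypothetical, and the 0.533 factor is taken to be 0x4444/0x8000 (which is 0.5333...); the real code presumably works in fixed point.

```c
/* 0x4444 / 0x8000 = 0.5333..., presumably the 0.533 in the formula */
#define PAN_SCALE (0x4444 / 32768.0)

/* bal in [0, 1]; formula as written in the notes above */
static void pan_gains(double bal, double *left, double *right)
{
    double inv = 1.0 - bal;
    *left  = PAN_SCALE * (1.0 - bal * bal);
    *right = PAN_SCALE * (1.0 - inv * inv);
}
```

At bal=0 the right gain is exactly 0 and at bal=1 the left gain is exactly 0; in the middle both channels get 0.75 of the scale factor.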


Fri Jun  8 14:15:06 CEST 2018
- almost a year later, I've got a sudden urge to continue
- the plan is now: pick some minimalistic ARM emulator, dump main memory (from within
  desmume) and try to run it (and generate some sound)
- the minimalistic emulator in question could be this one:

	https://code.9front.org/hg/plan9front/file/b974afb1648d/sys/src/cmd/5e/arm.c


Sat Jun  9 08:50:46 CEST 2018
- the emulator from 9front sort-of works, although I needed to add support for
  some instructions that were tested/used/emitted by plan9 ARM compiler (BLX?)
- I sort-of understood how the mainloop after IRQ handler invocation goes
  through all synths (2 main, 4 drum synths) and calls their handlers, then
  goes through mixer, effect, etc.
- plan for today is:
	- compile desmume from git
	- dump ITCM (01FFxxxx range), because it contains buffers and
	  interrupt handler
	- fix loading of binary segments in ds10emu
	- implement DIVCNT IO register in ds10emu (might require io-register
	  handler ... or that trick with handler pagefault)

[...]

- So I did most of that, manually triggering synth1 from ds10emu - and I've
  got some audio output! The bad news is, the output is severely garbled,
  oscillator not oscillating properly and filter filtering-out nothing.
  Envelope works though.
- another try. this time I've got a fault on DIVCNT (hardware division
  accelerator), which I forgot to implement.

- ok, turns out I had a bug in a playback routine that didn't advance the
  address when reading the buffer (thus the envelope being ok). the bad news
  is, that the output is even more garbled than before.

- added audio probe (just a trigger: R15==addr => play(R[2])) and oscillator
  works fine
- filter doesn't, output is severely garbled. my guess is: broken SMULL
  instruction
- doesn't seem to be the case... let's try to do a filter bypass... nope,
  still shit


0207526C:
	LDR R2, [R7, #5C]
	LDR R8, [R7, #34]
	LDR R11, [R7, #38]

- looks like coefficients that were saved previously (R8, R11)
	- they get saved later in highpass/lowpass section
	- R8, R11 are later mostly just clamped values
	- these values look very clamped in output of ds10emu
- R7 is kept in register pointing to filter section throughout the code
- ok, trying to compare various states to be able to check results 
	- filter coefficient address:
		02248EA0+34 = 02248ED4
		02248EA0+38 = 02248ED8
	- ds10emu:
		0207526c R8:00babd4b R11:ff793abc
	- 02000000.bin:
		$ hexdump -C 02000000.bin | grep -i "^00248ed0"
		00248ed0  00 00 00 00 4b bd ba 00  bc 3a 79 ff 00 00 00 00  |....K....:y.....|
		... so matches ds10emu values ...
	- desmume
		- although the state is same, it doesn't match. got to debug
		  who changes it
		- ok... the savestate is from middle of rendering stuff, I
		  have to get a better savestate
- again:
	- desmume:
		0207526C R8:00EC9573 R11:00E224D1
	- bin:
		00248ed0  00 00 00 00 73 95 ec 00  d1 24 e2 00 00 00 00 00 |....s....$......|
	- ds10emu:
		0207526c R8:00ec9573 R11:00e224d1
- woot! now I can just do a register diff between desmume and my emu
- perfect, there's a difference in the second computed sample
- hah! fucking arithmetic shift right didn't do sign extension
	E1A0584 MOV R7, R3, LSL #1
	- instruction encoding:
		cccc 00Io oooS nnnn dddd ssss ssss ssss
		____ 0011 101S 
	- hmm the instruction encoding is less obvious than I thought

either it's immediate (1<<25)

or it isn't
	0 0 0 0  0 0 0 0  Rm		register
	i i i i  i 0 0 0  Rm		SHL by immediate
	r r r r  0 0 0 1  Rm		SHL by register
	i i i i  i 0 1 0  Rm		LSR by immediate
	r r r r  0 0 1 1  Rm		LSR by register
	i i i i  i 1 0 0  Rm		ASR by immediate
	r r r r  0 1 0 1  Rm		ASR by register
	i i i i  i 1 1 0  Rm		ROR by immediate
	                                RRX if i=0
	r r r r  0 1 1 1  Rm		ROR by register
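
The table above maps directly onto a decoder for the low 12 bits of a data-processing instruction with I=0. A minimal sketch with hypothetical type and function names; the special cases for an immediate amount of 0 (LSR/ASR meaning a shift by 32, ROR meaning RRX) follow the ARM architecture manual.

```c
#include <stdint.h>

typedef enum { SH_LSL, SH_LSR, SH_ASR, SH_ROR, SH_RRX } shift_kind;

typedef struct {
    shift_kind kind;
    int by_register;   /* 1: amount comes from register Rs */
    unsigned amount;   /* immediate amount, or the Rs number if by_register */
    unsigned rm;       /* register being shifted */
} shifter_op;

/* Decode bits 11..0 of a data-processing instruction with I (bit 25) = 0. */
static shifter_op decode_shifter(uint32_t insn)
{
    shifter_op op;
    op.rm = insn & 0xFu;
    op.kind = (shift_kind)((insn >> 5) & 3u);  /* 00 LSL, 01 LSR, 10 ASR, 11 ROR */
    op.by_register = (int)((insn >> 4) & 1u);
    if (op.by_register) {
        op.amount = (insn >> 8) & 0xFu;        /* Rs */
    } else {
        op.amount = (insn >> 7) & 0x1Fu;       /* imm5 */
        if (op.amount == 0) {
            if (op.kind == SH_ROR)
                op.kind = SH_RRX;              /* ROR #0 encodes RRX */
            else if (op.kind != SH_LSL)
                op.amount = 32;                /* LSR/ASR #0 encode a shift by 32 */
        }
    }
    return op;
}
```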

- WORKS! the bug was in casting uint32_t to long before shifting (long is
  64bit on arm64, so the value was zero-extended and the right shift never
  saw the sign bit)
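
A minimal sketch of the ASR problem described above, using hypothetical helper names. The first function does the arithmetic shift without relying on the width of `long`; the second reproduces the effect of widening the unsigned value first (written with int64_t so it behaves the same on every host).

```c
#include <stdint.h>

/* Arithmetic shift right on a 32-bit register value. */
static uint32_t asr32(uint32_t value, unsigned shift)
{
    uint32_t sign = (value & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (shift >= 32)
        return sign;               /* all bits become copies of the sign */
    return (value >> shift) | (sign & ~(0xFFFFFFFFu >> shift));
}

/* What the bug amounts to on a 64-bit host: widening the unsigned value
 * first zero-extends it, so the shift never sees bit 31 as a sign bit. */
static uint32_t asr32_widened(uint32_t value, unsigned shift)
{
    return (uint32_t)((int64_t)value >> shift);
}
```

For the negative R11 value from the first ds10emu trace, `asr32(0xFF793ABC, 4)` gives 0xFFF793AB while the widened version gives 0x0FF793AB.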

[...]

- another mystery: I added a synth noteon trigger and now it jumps to address
  "000000004", which it fetched from stack
	ARM9 02075540   EBFFF299        BL 02071FAC
		-> R14 should be 02075544
	- ok, there's a bug that BX instruction overwrites R14 with PC
		- however! if I fix that, the synth stops working
		- lol, this was my fuckup, i did implement BLX

- ok, so now the synth triggers on noteon/noteoff

- it would be nice to have knobs so let's work on that now	
- some functions from the synth dispatch table:
	02087cec 0207448c
	02087cf0 02074590
	02087cf4 020746e0
	02087cf8 02074720 data render handler
	02087cfc 0207469c
	02087d00 02075930
	02087d04 02075648 knob movement handler
	02087d08 02075928
	02087d0c 02075900
	02087d10 0206ceb4
	02087d14 0207546c noteon event handling (called from data renderer)
	02087d18 02075640
- mixer dispatch table:
	02087c7c 02072ce0
	02087c80 02072d20
	02087c84 02072dac
	02087c88 02072e8c data render handler
	02087c8c 02072d68
	02087c90 02073888
	02087c94 020736c0
	02087c98 02073880
	02087c9c 02073858
	02087ca0 0206ceb4
	02087ca4 020735d0 noteon event handling (called from data renderer)
	02087ca8 020736b8
	02087cac 00000000
	02087cb0 00000000
- drum dispatch table:
	02087bcc 020721f4
	02087bd0 02072228
	02087bd4 02072278
	02087bd8 020722b8 render
	02087bdc 02072264
	02087be0 02072514
	02087be4 020724e0 ?
	02087be8 0206ba5c
	02087bec 02049744
	02087bf0 02072268
	02087bf4 020723dc noteon?
	02087bf8 020724d8
	02087bfc 00000000
	02087c00 00000000
	

Sun Jun 10 11:09:04 CEST 2018
- let's get the hardware knobs working

[...]

- ok, now I can control DS-10 from either keyboard or volca synth, with notes
  and knobs working as expected. it can finally be played like a normal synth!

- what does the TODOLIST/roadmap look like now?
	- importing patches from DS-10 save files
	- saving and loading patches in my synth-emulator
	- polyphonic synth!
	- figuring out how drum synth and mixer works (what are the
          knob-handlers etc.)
	- decompile synthesizer into C, try to map structures and recompile it
	  for a different architecture
	- make a hw DS-10-like controller (something like 60 knobs)
	- try to get the STM32 blue-pill working and outputting audio

- now: let's get the import working
	- the instruments seem to start around 0x38000 in DS-10+ save file
	- the instrument description is a string of the ~30 knob settings
	- in the trace_knob_move.txt trace (which is a trace of a synth state
	  being loaded) there's a loop that loads each of the knobs via the
	  knob-dispatch routine
	- only a few parameters are "bipolar"; in JUNKTT I have "VCO2pitch"
	  negative (id=8), so let's guess:

00038700  68 e0 a9 5d d9 3c 0c b5  1a f1 96 dd 3c a3 ad 99  |h..].<......<...|
00038710  4a 55 4e 4b 54 54 00 00  00 00 00 00 00 00 00 00  |JUNKTT..........|
00038720  00 00 00 03 00 00 01 02  e0 20 00 1b 4f 00 13 09  |......... ..O...|
00038730  01 00 03 69 46 42 7f 00  2d 00 01 00 00 00 00 00  |...iFB..-.......|
00038740  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

	- ok, starting offset is 0x20
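
Based on the dump above, reading one instrument record could look roughly like this. It is a sketch under assumptions: the struct and function names are hypothetical, the name is assumed to be a NUL-padded 16-byte field at record offset 0x10, and the knobs are assumed to be stored one byte per knob id (0..25 from the knob table), with bipolar knobs as signed bytes.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PATCH_NAME_OFF  0x10   /* "JUNKTT" sits at record offset 0x10 in the dump */
#define PATCH_NAME_LEN  16     /* assumption: NUL-padded 16-byte name field */
#define PATCH_KNOB_OFF  0x20   /* knob values start here, one byte per knob id */
#define PATCH_KNOBS     26     /* knob ids 0..25 from the knob table */

typedef struct {
    char   name[PATCH_NAME_LEN + 1];
    int8_t knob[PATCH_KNOBS];   /* bipolar knobs are read as signed bytes */
} ds10_patch;

/* Returns 0 on success, -1 if the record is too short. */
static int ds10_parse_patch(const uint8_t *rec, size_t len, ds10_patch *out)
{
    if (len < PATCH_KNOB_OFF + PATCH_KNOBS)
        return -1;
    memcpy(out->name, rec + PATCH_NAME_OFF, PATCH_NAME_LEN);
    out->name[PATCH_NAME_LEN] = '\0';
    for (int i = 0; i < PATCH_KNOBS; i++)
        out->knob[i] = (int8_t)rec[PATCH_KNOB_OFF + i];
    return 0;
}
```

Fed with the bytes from 0x38700, this reads the name "JUNKTT" and knob 8 (vco2pitch) as 0xe0, i.e. -32, which fits the guess that the VCO2 pitch is the negative one.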

- patch loading working. this is utterly awesome! I can now play all my
  instruments I made on DS-10! and in hi-fi! (because despite the information
  that the D/A converter is 16bit, I still think it's just filtered PWM, 12bit
  tops)

- let's dig a bit into the mixer
	0224a260: 02 fe xx 06  00 00 yy 07	; volume
		- xx is channel (0..5)
		- yy is volume (00..7f)
		: 02 35 xx 06  02 35 yy 0a	; panning
		- yy is pan (00..7f)
		: 00 00 xx 06  00 00 yy 07	; mute channel? oh! volume again...
- trap: who writes into mixer: ... mostly the "event handler" code that gets
  called by the mixer synth routine
- 
		
- note: current emulator eats about 35% on a single voice, 25% with -O2
- maybe a good idea would be to do a trace of event writes on all synths:
	1. find out event area
		syn1	
	2. trace writes to all event areas
		
022489A0	syn1
	+8	number of pending events (byte)
	+80 	event area
02248FE0	syn2
02249620	drum1
02249900	drum2
02249BE0	drum3
02249EC0	drum4
0224A1A0	mixer
	+8	number of pending events
	+c0	event area
0224A480	effects
		- no event area? probably merged with mixer

01ff87e0  022489a0 02248fe0 02249620 02249900
01ff87f0  02249be0 02249ec0 00000000 00000000
	
* DRUM1 trigger:
ARM9: cond=evqueue addr=0224a224 val32=00000004 lastipc=00004047
	^ this sets effect and is not a trigger
ARM9: cond=evqueue addr=022496a0 val32=00000202 lastipc=00004047
ARM9: cond=evqueue addr=022496a4 val32=02247f3c lastipc=00004047
ARM9: cond=evcount addr=02249628 val8=00000001 lastipc=00004047

Mon Jun 11 13:06:28 CEST 2018
- today I'm continuing with the event tracing

- as a part of ongoing experiments, I implemented a polyphonic DS-10 synth by
  just creating three separate arm-emulator contexts and distributing
  incoming notes amongst them (knob changes go to all of them). and it
  works surprisingly well. however it's too slow to run more than three
  voices, so I should probably work on the decompilation part more

- drum synth works (when mapped to keyboard)
- even loading custom samples to drum synth works now! woot!
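
Distributing notes among the three emulator contexts needs some voice allocation policy. The log doesn't say which one is used, so this is just one plausible sketch with hypothetical names: take a free voice if there is one, otherwise steal the voice that was started longest ago.

```c
#include <stdint.h>

#define NUM_VOICES 3     /* three emulator contexts, as in the experiment above */
#define NO_NOTE    0xFF

typedef struct {
    uint8_t  note[NUM_VOICES];   /* NO_NOTE when the voice is idle */
    uint32_t age[NUM_VOICES];    /* allocation timestamp, used for stealing */
    uint32_t clock;
} voice_pool;

static void pool_init(voice_pool *p)
{
    for (int i = 0; i < NUM_VOICES; i++) { p->note[i] = NO_NOTE; p->age[i] = 0; }
    p->clock = 0;
}

/* Returns the voice that should play `note`: a free one if possible,
 * otherwise the voice that was started longest ago. */
static int pool_note_on(voice_pool *p, uint8_t note)
{
    int pick = 0;
    for (int i = 0; i < NUM_VOICES; i++) {
        if (p->note[i] == NO_NOTE) { pick = i; break; }
        if (p->age[i] < p->age[pick]) pick = i;
    }
    p->note[pick] = note;
    p->age[pick] = ++p->clock;
    return pick;
}

/* Returns the voice that was playing `note`, or -1 if none. */
static int pool_note_off(voice_pool *p, uint8_t note)
{
    for (int i = 0; i < NUM_VOICES; i++)
        if (p->note[i] == note) { p->note[i] = NO_NOTE; return i; }
    return -1;
}
```

The returned index selects the emulator context that receives the noteon or noteoff event; knob changes would still be sent to every context.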



Tue Jun 12 09:20:57 CEST 2018
- trying retdec decompiler: https://retdec.com/
	- the ds memory image files should have probably gone through some
	  preprocessing that would leave only relevant code, because
	  retdec chokes on the 4MB image
	- idea: control flow extraction, and registers annotated with types
	  with type induction

A tool that I could really use right now is something that would extract code
chunks from the main image by walking down functions (with hints as to where
BX goes and how big are jumptables). I could then feed the code to either some
kind of "reassembler" (that would relocate the located code) or another tool
(like the retdec decompiler).

The simplest way to implement the above would be to make a tool that
understands jump instructions and descends the code recursively. Then I could
just extract the memory ranges by hand and check correctness with dynamic
execution. The problem with reassembly is that it would need to modify the
data, because function pointers are stored in synth data structures (and in
method tables).

This tool would also allow me, instead of decompiling the code, to recompile
the code to C (direct instr-to-C translation with indirect jumps solved by
static address->function translation table).

[...]

- Attempt to make effects work didn't work very well. I need to read some
  tracedumps to understand better how it works.

[...]

I have a plan how to do the decompiler, but not the strength to implement it
now, so I'll just sketch it here for later.

1. take instruction parser from libdisarm, use it to determine whether
   instruction is jump, data move, return, etc.
2. construct a call-graph incrementally: put start node into queue, then
   repeat:
	- take node from queue
	- decode instruction at that address
	- if it's conditional jump, make two edges (to next instruction, 
	  to destination)
	- if it's jump, make edge to destination
	- otherwise, make edge to next instruction
	- if it's computed jump (BX), take hints from "hintfile" (generated by
	  hand or by live tracing)
	- if it's instruction manipulating R15, do some kind of special-case
	  handling:
		- return (LDMIA) - mark as return
		- jumptable (ADDLS) - take hint from hintfile about how big
		  that jumptable is
		- literal (LD x, [R15]) - just assume the literal is constant
		  and fetch it from memory
3. this will construct a graph (use array of addresses to store node
   pointers). walk it to check it's sane graph: no jumps between different
   functions, etc.
4. decompile each function into one C function with variables like uint32_t
   R2, R3...;, expand each instruction, put labels in places where edges
   arrive and just goto there. do memory fetches and stores on array.
   This is of course more complicated than it sounds:
	- parameters need to be handled (this could be work for hintfile)
	- in/out/gen/kill sets need to be computed to see what gets passed
	  where
	- R14 needs to be managed to ensure we return at correct places
	- lots of small stuff
5. profit
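
Step 2 of the plan boils down to computing successor edges per instruction. Here is a minimal sketch for the direct-branch case only, with hypothetical names and without libdisarm: it handles B and BL, treats BL as "edge to the callee plus edge to the next instruction" (an assumption about how calls would be modelled), and ignores BX, BLX and anything else that writes R15, which would need the hintfile.

```c
#include <stdint.h>

/* Successor edges of one instruction, as used in step 2 of the plan. */
typedef struct {
    int      n;          /* number of valid entries in next[] */
    uint32_t next[2];
} successors;

/* Target of an ARM B/BL: PC reads as addr+8, offset is signed imm24 << 2. */
static uint32_t arm_branch_target(uint32_t addr, uint32_t insn)
{
    uint32_t off = (insn & 0x00FFFFFFu) << 2;
    if (off & 0x02000000u)          /* sign-extend the 26-bit offset */
        off |= 0xFC000000u;
    return addr + 8u + off;
}

static int arm_is_branch(uint32_t insn) { return (insn & 0x0E000000u) == 0x0A000000u; }
static int arm_is_link(uint32_t insn)   { return (int)((insn >> 24) & 1u); }
static int arm_is_always(uint32_t insn) { return (insn >> 28) == 0xEu; }

static successors arm_successors(uint32_t addr, uint32_t insn)
{
    successors s = { 0, { 0, 0 } };
    if (!arm_is_branch(insn)) {
        s.next[s.n++] = addr + 4;   /* ordinary instruction: fall through */
        return s;
    }
    /* conditional branches and BL both continue at the next instruction */
    if (!arm_is_always(insn) || arm_is_link(insn))
        s.next[s.n++] = addr + 4;
    s.next[s.n++] = arm_branch_target(addr, insn);
    return s;
}
```

As a sanity check against the trace earlier in the log, the BL at 02075540 (EBFFF299) decodes to target 02071FAC.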

Wed Jun 13 11:16:26 CEST 2018
- I couldn't resist and wrote some rudimentary disassembler/control flow
  extractor according to plan outlined yesterday.
- My current idea is not to try to guess much information, but instead
  translate the instructions one-to-one into a structure like this:

regjump: switch(R15) {
	/* labels generated twice: once for register jumps (the switch),
	   once for direct jumps (the label) */
	case 0x0204720:
	lab0204720:
		/* one stack array, maybe through some macro to check for overflows */
		STACK[--R13] = R3;
		STACK[--R13] = R4;
		/* operation on registers translated directly */
		R5 = R2 + R1;
		R6 = R7 * R9;
	lab020472c:
		/* flags generated on demand (set solver, see below) */
		C = R2 > R1;
		/* conditioned instruction with if prefix */
		if (C) R5 = 4;
		if (C) goto lab0204720;
		/* "function call": R14 holds the address to return to */
		R14 = 0x0204738;
		goto lab020490c;
	case 0x0204738:
	lab0204738:
		/* return from call */
		R14 = STACK[R13++];
		/* the decoder knows its R15, so it's not required to track that. */
		/* it would be nice to decode literals though */
		R15 = R14;
		goto regjump;
}

The "when should flags be generated" problem could be solved by
in/out/used/gen/kill sets, which could be computed by a bitmap solver.

Although, maybe the first version could do away with in/out/used... sets and
just generate them everywhere. The same for stack - let's just make it an
ordinary memory accessed with "vaddr".
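
For the "generate flags everywhere" first version, every CMP would expand into something like the following. A sketch with hypothetical names; the flag definitions are the standard ARM ones for a subtraction (carry means "no borrow").

```c
#include <stdint.h>

typedef struct { unsigned n, z, c, v; } arm_flags;

/* Flags as set by CMP a, b (i.e. a - b with the result discarded). */
static arm_flags arm_cmp(uint32_t a, uint32_t b)
{
    uint32_t res = a - b;
    arm_flags f;
    f.n = res >> 31;
    f.z = (res == 0);
    f.c = (a >= b);                          /* carry = no borrow */
    f.v = ((a ^ b) & (a ^ res)) >> 31;       /* signed overflow */
    return f;
}

/* A few condition codes expressed on top of the flags. */
static int cond_hi(arm_flags f) { return f.c && !f.z; }      /* unsigned >  */
static int cond_ls(arm_flags f) { return !f.c || f.z; }      /* unsigned <= */
static int cond_lt(arm_flags f) { return f.n != f.v; }       /* signed <    */
```

A later pass can then drop the flags that nothing reads, or replace a CMP plus condition pair with the plain C comparison.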

Thu Jun 14 23:40:09 CEST 2018
- working on implementation of alu function generator for arm->c translator

Fri Jun 15 13:23:31 CEST 2018
- implementing load/store instructions
- more ideas:
	- track "use" of cmp"xx" predicates, so we can emit them instead of flags
	- merge if(..){ } blocks

[...]

- so the code generator is outputting compilable code, I linked it to rest of
  the emulator AND IT RUNS! it doesn't produce any sound, but it correctly
  returns to the place it was called from, so that's something
- smull was mis-implemented (no sign extension for the second operand)
- WOW! it plays something very similar to what is actually supposed to! it's
  a bit garbled, but who cares. this is the code in question
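
On the smull bug in the list above: the following is not the generated code
itself, just a minimal sketch of what a correct SMULL helper could look like
in C. The function name and signature are hypothetical. The point is that
both operands have to be sign-extended to 64 bits before the multiply:

```c
#include <stdint.h>

/* SMULL RdLo, RdHi, Rm, Rs: signed 32x32 -> 64 bit multiply.
   Registers are kept as uint32_t, as a translator probably would.
   The uint32_t -> int32_t casts rely on two's complement wrap-around,
   which is what mainstream compilers do. */
static void smull(uint32_t rm, uint32_t rs, uint32_t *rdlo, uint32_t *rdhi)
{
	/* Sign-extend BOTH operands; casting only one of them and
	   leaving the other unsigned gives a wrong high word. */
	int64_t product = (int64_t)(int32_t)rm * (int64_t)(int32_t)rs;

	*rdlo = (uint32_t)product;
	*rdhi = (uint32_t)((uint64_t)product >> 32);
}
```

For example, 2 * -1 should give RdHi = 0xFFFFFFFF and RdLo = 0xFFFFFFFE. If
the second operand is zero-extended instead, RdHi comes out as 1.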

[...]

- ok, output is garbled. register trace compare matches. what to do? maybe
  store is buggy?

ARM9 02075410   E0C270B2        STRH R7, [R2], #2   

... yeah, that one was mis-implemented, because I borked post-increment (and
implemented it as pre-increment).
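
To make the difference concrete, here is a small sketch of the two addressing
modes, assuming a tiny flat little-endian memory array. The names mem,
store16, strh_post and strh_pre are illustrative only, not the translator's
real helpers:

```c
#include <stdint.h>

static uint8_t mem[16];  /* hypothetical flat memory */

/* Little-endian halfword store. */
static void store16(uint32_t addr, uint16_t v)
{
	mem[addr] = (uint8_t)(v & 0xFF);
	mem[addr + 1] = (uint8_t)(v >> 8);
}

/* STRH Rd, [Rn], #imm  (post-indexed):
   store at the OLD value of Rn, then add the offset. */
static void strh_post(uint32_t *rn, uint32_t rd, uint32_t imm)
{
	store16(*rn, (uint16_t)rd);
	*rn += imm;
}

/* STRH Rd, [Rn, #imm]!  (pre-indexed with writeback):
   add the offset first, then store at the NEW value of Rn. */
static void strh_pre(uint32_t *rn, uint32_t rd, uint32_t imm)
{
	*rn += imm;
	store16(*rn, (uint16_t)rd);
}
```

In a loop that writes samples with "STRH R7, [R2], #2", using the pre-indexed
version by mistake shifts every sample by one halfword and leaves the first
slot untouched, which would fit the garbled output.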

Sound synthesis (on synth1) now works decompiled!

What's the plan now?

- implement multiple input points to control-flow graph
	- try to decompile "knob", "drum" and "mixer"
		=> I can't do this now, I don't have knobs analyzed by hand
	- draw them with graphviz
- implement subroutine checking
	- a subroutine is something called with link
	- each subroutine has one entrypoint
- maybe try parameter detection and breaking the code into subroutines
- gen/kill/used sets computation
- implement value type and structures inverting

Decompiling the knob-setting-dispatch function. Looks like each synth component
has its own sub-dispatch function (e.g. the filter).

I need to rewrite the code-ordering function.

Latest decompilation attempt

Sun Jun 17 12:58:26 CEST 2018
- trying to fix the bitmap solver
	- fixed
	- now flags are output depending on what's used later in the code
- flags are outputted only when required (thanks to bitmap solver)
- instructions with the same if() condition are merged together
- now trying to replace "cmp ...; if (flags)" with the condition itself
	- works! at least in most cases
	- this is probably the furthest I can go without rewriting the
	  control-flow graph handling and manipulation. i would like to remove
	  all branches (why have a "b xxx" node? why not branch with the graph
	  itself?) and do some major overhaul to the decoder, because it's
	  really a pain in the ass to use.
	- although maybe I can implement a subroutine chopper before I do that.
- I tried to test mumu (which is a kind of "shell" that manages sound
  rendering, midi io etc.) with the latest code generated and the third noteon
  almost always faults the whole emulator, so I'll probably need to fix that
  first. First thing after I get back from vacation.
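
A sketch of why replacing "cmp ...; if (flags)" with the condition itself is
safe, assuming the flags come from a plain CMP a, b. It computes N/Z/C/V the
way CMP does and checks that a few ARM condition codes agree with the direct
C comparison. All names here are made up for the example:

```c
#include <stdint.h>

struct flags { int n, z, c, v; };

/* Flags as set by CMP a, b (a - b, result thrown away). */
static struct flags cmp_flags(uint32_t a, uint32_t b)
{
	uint32_t r = a - b;
	struct flags f;
	f.n = (int)(r >> 31);
	f.z = (r == 0);
	f.c = (a >= b);                            /* carry = no borrow */
	f.v = (int)(((a ^ b) & (a ^ r)) >> 31);    /* signed overflow */
	return f;
}

/* ARM condition codes written in terms of flags. */
static int cond_hi(struct flags f) { return f.c && !f.z; }
static int cond_ge(struct flags f) { return f.n == f.v; }
static int cond_lt(struct flags f) { return f.n != f.v; }
static int cond_eq(struct flags f) { return f.z; }

/* Returns 1 if the flag-based conditions match the direct comparisons
   that the code generator could emit instead. */
static int conditions_agree(uint32_t a, uint32_t b)
{
	struct flags f = cmp_flags(a, b);
	return cond_hi(f) == (a > b)
	    && cond_ge(f) == ((int32_t)a >= (int32_t)b)
	    && cond_lt(f) == ((int32_t)a < (int32_t)b)
	    && cond_eq(f) == (a == b);
}
```

So "cmp R2, R1" followed by "if (HI) ..." can become "if (R2 > R1) ..." on
unsigned registers, and GE/LT become signed comparisons. This only holds when
the flags really come from that cmp, which is presumably the "most cases"
part.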

[...]

- I fixed a bug in code generator ... it seems that the code generator works
  only by chance. I desperately need to rewrite it. Not necessarily change
  the whole concept, but just make sure the design makes sense.



Thu Jun 21 12:45:57 CEST 2018
- back
- trying to fix the c-code generator

[...]

- OK, I just rewrote the main decoding loop and I'm not sure it's an
  improvement. The only thing that's better is the order of the output code
  (it's sorted by function address and grouped by subroutine). The rest is...
  well not nice. The "link next" disappeared, but it was replaced by some
  spaghetti logic. The problem is that the goto/label/subroutine logic was
  mixed with if()/cc/conditional logic, and that was not a good idea. I need 
  to find a way to separate those two.

- In other news: I replaced the "vaddr" with direct memory access and it's
  twice as fast.
- hmm i forgot the vaddr also handles memory-mapped IO devices like DIV and
  SQRT accelerators... so it's faster, but it doesn't sound right
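
One way to keep most of the speed without losing the IO devices would be a
fast path for main RAM and a fallback for everything else. This is only a
sketch: the RAM base and size are the usual NDS main memory values, and
io_read32, demo_io_read32 and read32 are hypothetical names, not the
emulator's real interface:

```c
#include <stdint.h>

#define MAIN_RAM_BASE 0x02000000u
#define MAIN_RAM_SIZE 0x00400000u  /* 4 MB, assumed */

static uint8_t main_ram[MAIN_RAM_SIZE];

/* Slow path: whatever handles memory-mapped IO (DIV, SQRT, ...).
   Left as a function pointer so the emulator can plug in its own. */
typedef uint32_t (*io_read32_fn)(uint32_t addr);
static io_read32_fn io_read32;

/* Stand-in IO handler for trying the sketch out: returns a marker
   value that contains the low 16 bits of the address. */
static uint32_t demo_io_read32(uint32_t addr)
{
	return 0xD1500000u | (addr & 0xFFFFu);
}

static uint32_t read32(uint32_t addr)
{
	/* Unsigned subtraction also rejects addresses below the base. */
	uint32_t off = addr - MAIN_RAM_BASE;

	if (off <= MAIN_RAM_SIZE - 4) {
		/* Fast path: direct little-endian load from main RAM. */
		return (uint32_t)main_ram[off]
		     | (uint32_t)main_ram[off + 1] << 8
		     | (uint32_t)main_ram[off + 2] << 16
		     | (uint32_t)main_ram[off + 3] << 24;
	}
	/* Everything else (IO registers etc.) goes through the handler. */
	return io_read32 ? io_read32(addr) : 0;
}
```

The range check costs one subtract and one compare per access, so it should
still be a lot cheaper than going through the full "vaddr" lookup every time.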


Fri Jun 22 17:55:10 CEST 2018
- after yesterday's debacle I decided I need to stop dicking around and just
  get the emulator to work well enough so I can sequence it from ableton
- I have two ways: either make it into a VST, or make it run on my small
  laptop and sequence it via MIDI (would require an SDL gui for configuring the
  synth)
- first I'm gonna try the VST path. I found a nice tutorial (Martin Finke's
  "Making audio plugins"), downloaded Visual Studio and I'm now trying to put
  it all together

[...]

- so I used "SpaceBass" example as the basis for my synth, polyphony is
  working, one knob is working. what I need to do now is:
	- the filter is stepping. why is the filter stepping? how is it done
	  on real ds-10? I don't remember filter stepping. todo: dump all
	  events from knob when either turning it manually or sequencing from
	  kaoss x/y
	- design a gui with all the knobs
		- design "enum" knobs for all modulation possibilities
		- design background and knobs that are similar to ds-10
	- implement all knob events
	- implement polyphony switch

- design
	- need to allocate space for keyboard
		- how big is it?
	- need to test knobs
	- need to rip those knobs & switches
		- switch: 38x26
		- switch hole: 58
	- offset: -36

- knobs and switches working, more tomorrow


Sat Jun 23 08:03:38 CEST 2018
- got up early, working on vst
- oh crap, all globals are shared between instances! @(#&@#*&
	- need to rewrite this... :(
- TODO:
	* implement event handling
	* implement polyphony changing
	* implement mono voice stealing (priority=last)
	* get rid of all globals
	* implement resampler from 32 kHz to 44 kHz
	* implement volume (velocity) on noteon
	* optional "resampler"
	* implement presets
		* figure out how to get my old patches here
	* embed ds-10 image into dll
	* rename it
	* clean it, release it, release source
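
For the resampler item in the list above, the simplest thing that could work
is linear interpolation. This is a minimal sketch under that assumption, not
necessarily what ends up in the plugin. The function name and parameters are
made up, and there is no anti-aliasing filter:

```c
#include <stddef.h>

/* Resample in[0..in_len) from in_rate to out_rate by linear
   interpolation. Writes at most out_cap samples to out and returns
   how many were written. Stops before it would need a sample past
   the end of the input. */
static size_t resample_linear(const float *in, size_t in_len, double in_rate,
                              float *out, size_t out_cap, double out_rate)
{
	double step = in_rate / out_rate;  /* input samples per output sample */
	size_t n = 0;

	if (in_len < 2)
		return 0;

	while (n < out_cap) {
		double pos = (double)n * step;
		size_t i = (size_t)pos;
		if (i + 1 >= in_len)
			break;
		double frac = pos - (double)i;
		out[n] = (float)(in[i] + (in[i + 1] - in[i]) * frac);
		n++;
	}
	return n;
}
```

For real block-by-block use, the fractional position and the last input
sample would have to be carried over between calls, otherwise there is a
click at every buffer boundary.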

- well, it's done!
- goodbye


Sun Jun 24 21:29:39 CEST 2018
- I added resample feature
- fixed velocity curve
- posted to r/synthesizers just for kicks


TODO:
- decompiler
	- implement separate hintfile
- how to write a bitmap solver (draw graph and equations for each node)
- mumu
	- implement midi note priority logic
	- fix keyjazz (doesn't do noteoff)
	- fix "Data" data structure
	- make an SDL gui with knobs and keyjazz piano (with keyup/down)
- make a VST out of it (I have this ANSI C code - how do I make a VST with
  knobs)
- what is stored in patch saves other than the synth configuration?
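
For the midi note priority item in the list above, a sketch of last-note
priority for a mono voice, assuming a small list of held notes ordered from
oldest to newest. The struct and function names are hypothetical and not
taken from mumu:

```c
#include <stdint.h>
#include <string.h>

#define MAX_HELD 16

struct mono_voice {
	uint8_t held[MAX_HELD];  /* currently held notes, oldest first */
	int count;
};

/* Drop every occurrence of note from the held list. */
static void remove_note(struct mono_voice *v, uint8_t note)
{
	int j = 0;
	for (int i = 0; i < v->count; i++)
		if (v->held[i] != note)
			v->held[j++] = v->held[i];
	v->count = j;
}

/* Note on: the new note always wins. Returns the note to play. */
static int mono_note_on(struct mono_voice *v, uint8_t note)
{
	remove_note(v, note);
	if (v->count == MAX_HELD) {
		/* List full: forget the oldest held note. */
		memmove(v->held, v->held + 1, MAX_HELD - 1);
		v->count--;
	}
	v->held[v->count++] = note;
	return note;
}

/* Note off: fall back to the most recent note that is still held.
   Returns that note, or -1 if nothing is held (silence). */
static int mono_note_off(struct mono_voice *v, uint8_t note)
{
	remove_note(v, note);
	return v->count ? v->held[v->count - 1] : -1;
}
```

Holding 60, 64, 67 and then releasing 67 falls back to 64. Releasing a note
that is not the sounding one changes nothing audible.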