Allpro 88 CPU Description
-------------------------

By: Kevtris
V1.00 061616

Once the programmer was reverse engineered, I needed a way to drive the
programmer.  The obvious solution is to create a custom CPU, so this was done.

This CPU is a sorta-long instruction word CPU (SLIW) that has a 54 bit
instruction word, and is 2K deep.  (This depth might change in the future)

It allows for the reading and writing of the Allpro registers, and access to
the SDRAM, along with the on-FPGA registers of things like the fan controllers
and such.

A scripting language is "compiled" (more like translated) into these
instructions, which when executed, dumps a microcontroller, programs an EPROM,
tests a chip, or what have you.  The scripting language is described later in
this document.  But first, the CPU itself.

* * * *

There are 8 general purpose registers (each 32 bits wide), a 16 bit program
counter, 2K words of 54 bit wide ROM, an 8 level call stack, and an 8 level
data stack.

The instruction is long so that each part of the CPU can be independently
programmed in a single instruction word.  There's plenty of "weird" and "what
the hell were you thinking?" instructions possible, due to the way they are
encoded.  Just don't use those unless you want to be weird.

The CPU can read from IO space, pointed to by IO addresses.  This space is
specified by the source/destination registers (see below).  IO space has a
32 bit address, and is 8 bits wide.  When reading IO space, the lower 8 bits
will be copied over and the upper 24 bits will be read as 0's.

SDRAM, the Allpro registers themselves, and the FPGA's built in registers
reside in IO space.

Programmer's model:

 32 bits          16 bits
+-----+          +------+
| R7  |          |  PC  |
+-----+          +------+
| R6  |
+-----+
| R5  |          16 bits (call stack)
+-----+          +------+
| R4  |          |  S7  | stack level 7
+-----+          +------+
| R3  |          |  S6  | stack level 6
+-----+            ....
| R2  |          |  S1  | stack level 1
+-----+          +------+
| R1  |          |  S0  | stack level 0
+-----+          +------+
| R0  |
+-----+

 32 bits (data stack)
+-----+
| S7  | stack level 7
+-----+
| S6  | stack level 6
  ....
| S1  | stack level 1
+-----+
| S0  | stack level 0
+-----+

The instruction word is formed like so:

(bit #)
5  5    4    4    3    3    3    2    2    1    1    1
3  1    7    3    9    5    1    7    3    9    5    1    7    3  0
-------------------------------------------------------------------
xc dddd ssss SSSS cccc aaaa llll llll llll llll llll llll llll llll

x: not used, keep as 0 for now
c: 5 bit condition code (split between bits 52 and [39:36])
d: destination
s: source 1
S: source 2
a: ALU operation
l: 32 bit literal value

ALU operation:
--------------

The ALU can perform 16 different operations:

0 - SBIT  copy carry to bit (no flags affected)
1 - LBIT  copy bit to carry (C)
2 - TRI   trinary (no flags affected)
3 - MOVX  (no flags affected)
4 - ADD   (CNZ)
5 - ADC   (CNZ)
6 - SUB   (CNZ)
7 - SBC   (CNZ)
8 - MOV   (NZ)
9 - AND   (NZ)
a - OR    (NZ)
b - XOR   (NZ)
c - RSH   (CNZ)
d - ROR   (CNZ)
e - LSH   (CNZ)
f - ROL   (CNZ)

The functions are fairly self-explanatory except for the first 4.

There are 3 flags that can get updated based on the result of the operation.
They are:

C - carry flag
N - sign flag (1 = upper bit of the result is set, 0 = upper bit clear)
Z - zero flag.  set when the result is 0.

Operation 0 (copying carry to a bit) is useful for changing a single bit on a
register when building up i.e. the 8 data bits of ROM that is being read.

Operation 1 is the opposite- it will take a bit (say, the read data bit from a
pin driver) and put it into carry so it can then be transferred into another
register to build up i.e. the 8 bits of data from the ROM being read.

Operation 2 is a "trinary" operator, the classic ?: operator from C or
Verilog.  If carry is set, the upper 16 bits of the literal value will be put
into the destination.  If carry is clear, the lower 16 bits will be put into
the destination.  The upper 16 bits of the destination are cleared.
This is useful for setting the pin driver value to one of two states, i.e.
when writing the address lines of an EPROM and you wish to set it high or low
depending on the state of a register bit.

Operation 3 is a MOV, but does not affect flags at all.

Operation 8 (MOV) will move source 1 into destination, and source 2 is not
used.  The rotates and shifts are similar; only source 1 is used.

NOTE: The ALU performs a math function every instruction, no matter what.  For
instructions where you do not wish to perform a useful math operation, the
destination can be set to Fh, which is the literal 32 bit value in the opcode.
This will result in the destination being thrown away.  By selecting operation
0 as the instruction, the flags will not be updated either.  Otherwise, the
flags will be updated as usual even if the result is thrown away.  This is
useful for i.e. testing a register for 0 or whatever.

There are 16 unique sources or destinations for the ALU inputs and outputs.
These are:

0-7: R0-R7, fairly self explanatory.
8-A: (R0)-(R2), The IO address pointed to by R0, R1, or R2.
B:   IO address pointed to by the 32 bit literal word in the opcode.
C:   Data stack
D:   Data stack + increment pointer
E:   Data stack + decrement pointer
F:   literal value in the opcode.

This flexibility allows the programmer to read two different registers,
perform a function, and update a third.  The same register can be specified in
multiple places as well to perform a read/modify/write operation.

If IO space is being read or written, it will function as expected.  The
exception to this is if two different IO addresses are being used to read.
Only one can be read, and source 1's address will override source 2's.  The
write address can be different as well.

NOTE: If more than 1 part of the instruction is trying to use the literal word
at one time, it will work as expected, but since there's only 1 literal word
it will be used by ALL parts of the instruction.
This is a problem if you wish to perform a jump or call in the same
instruction as the literal is being added to a register or similar.  Keep that
in mind.

Condition code:
---------------

There's plenty of fun things that can be done to the program counter to
control program flow.  Each instruction can potentially affect program flow.
If the condition code does not affect the PC, it is simply incremented to the
next address.

00: NOP   never jump/skip/call/return (i.e. PC just increments to the next address)
01: SKIP  always skip the next instruction
02: SZ    skip if zero
03: SNZ   skip if nonzero
04: SC    skip if carry
05: SNC   skip if no carry
06: SNEG  skip if negative
07: SPOS  skip if positive

08:       never jump/skip/call/return
09: JMP   always jump (lower 16 bits of literal word is the new PC)
0A: JZ    jump if zero
0B: JNZ   jump if nonzero
0C: JC    jump if carry
0D: JNC   jump if no carry
0E: JNEG  jump if negative
0F: JPOS  jump if positive

10: OFF   add R7 to PC
11: RET   always return (pops top of stack into PC)
12: RZ    return if zero
13: RNZ   return if nonzero
14: RC    return if carry
15: RNC   return if no carry
16: RNEG  return if negative
17: RPOS  return if positive

18: (R7)  jump to (R7)
19: CALL  always call (jumps to address in literal word, pushes PC+1 on stack)
1A: CZ    call if zero
1B: CNZ   call if nonzero
1C: CC    call if carry
1D: CNC   call if no carry
1E: CNEG  call if negative
1F: CPOS  call if positive

IO space:
---------

The IO space is very large.  This was done because I could, and it made
decoding it much easier.  The space is broken up into four 30 bit chunks as
such:

00000000-00ffffff: SDRAM (16Mbyte used)
40000000-400007ff: Allpro registers (2K used)
80000000-800000ff: System registers (256 used)
C0000000-CFFFFFFF: Delay value in cycles (28 bits used)

The SDRAM is just general purpose RAM and can be read and written directly by
the PC host without CPU intervention.  When dumping data from chips, the data
is first written into this SDRAM, then the PC reads it when dumping is
complete.
Likewise, if programming a chip, data is first stored into the SDRAM by the
PC, then it is read out and programmed into the chip.

The Allpro registers are simply the 2K register space as presented in the
Allpro document.

Each set of addresses will stall the CPU a certain number of cycles.  The
delays are as follows:

SDRAM:       20 cycles
Allpro:      9 cycles
system regs: 5 cycles
delay:       N cycles (depending on address)

System registers are as follows:
--------------------------------

system regs (read):
-------------------

00 : buttons (lower 4 bits)
01 : SDA line from display (bit 0)
02 : bit 0 = RX ready, bit 1 = TX ready
03 : data from the UART (reading acks UART)
04 : bit 0 = target.  0 = SDRAM interface, 1 = CPU
05 : lower 8 bits of tach on fan 0
06 : upper 8 bits of tach on fan 0
07 : lower 8 bits of tach on fan 1
08 : upper 8 bits of tach on fan 1
09 : lower 8 bits of tach on fan 2
0a : upper 8 bits of tach on fan 2

system regs (write):
--------------------

00 : voltage setting for VP (digipot)
01 : VP enable (bit 0) 1 = on, 0 = off
02 : external clock reload value (lower 8 bits)
03 : external clock reload value (upper 8 bits)
04 : bit 0: SDA, bit 1 : SCL, bit 2 = display reset
05 : data to the UART
06 : bit 0/1 = aux outputs (not used normally)
07 : display brightness.  0 = 0%, FF = 100%ish
08 : writing here will reset the WDT, and unreset the allpro.  keep writing
     here to keep it out of reset.

VP is the programmer supply voltage.  Turning this on will bias the
programmer.

external clock: a clock generator that is usable by the pin drivers.

Delay value:
------------

Writing (or reading) here will not actually perform a read or write (and
reading will return whatever was read last).  However, it WILL stop the CPU
for the number of cycles specified in the address.  The CPU is being clocked
at 20MHz, so if you wish to delay 1ms, writing to address C0004E20 will delay
this amount.  This is calculated like so:

20000000 * .001 = 20000 cycles (4E20h).  So adding this to C0000000 equals
C0004E20.

Likewise, a 100ms delay is 20000000 * .1 = 2000000 cycles (1E8480h), which
gives address C01E8480.
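The delay address arithmetic can be sketched in a few lines of Python. This is only an illustration and not part of the actual toolchain: the function name delay_address and the constants are made up here, and it assumes the 20MHz clock and the 28 bit cycle count in the C0000000 region described above.

```python
CPU_HZ = 20_000_000       # CPU clock rate stated in the text (20MHz)
DELAY_BASE = 0xC0000000   # start of the delay region in IO space
DELAY_MAX = 0x0FFFFFFF    # 28 bits of cycle count are used


def delay_address(seconds: float) -> int:
    """Return the IO address that stalls the CPU for roughly `seconds`.

    Hypothetical helper: the cycle count is simply added to the base
    of the delay region, as described in the "Delay value" section.
    """
    cycles = round(seconds * CPU_HZ)
    if not 0 <= cycles <= DELAY_MAX:
        raise ValueError("delay does not fit in 28 bits of cycles")
    return DELAY_BASE + cycles


if __name__ == "__main__":
    print(f"1ms   -> {delay_address(0.001):08X}")
    print(f"100ms -> {delay_address(0.1):08X}")
```

Anything longer than 0FFFFFFFh cycles does not fit in the region, so the sketch rejects it rather than wrapping around.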
opcode format:
--------------

ALU source1,source2,destination,condition,literal value

sample opcode formats:

ADD R0,R1,R0
ADD R0,R1
ADD R0,R1,R2,JMP,address

If fields are omitted, they will be replaced with "don't care" fields if
possible.

i.e.

ADD R0,R1,R0

This will add R0+R1 and put the result into R0.  The condition code will be
00h which means the PC will just increment to the next instruction.  The
literal value is set to 00000000h as well.

ADD R0,R1

This will add R0+R1 and put the result into R1.  As before, the condition code
and literal will both be set to 0's.

ADD R0,R1,R2,JMP,address

This will perform R2 = R0 + R1, then jump to "address"

ADD R0,R1,R2,JC,address

This will perform R2 = R0 + R1, then if a carry is generated, jump to
"address"

* * * *

Scripting format:
-----------------

The scripting format is designed to make life easy.  Hopefully.

The script is written, and it is compiled and assembled on the fly and
uploaded to the FPGA and run.

The following commands are supported:

all the normal CPU instructions, listed above.
PINCOUNT   - sets device pincount
DEFNAME    - sets the name for serial dumping
DUMPSIZE   - size of the data in the device
PIN        - set a pin state depending on the selected values
TRUE       - value to use for a "1" bit when setting a pin (see PIN command)
FALSE      - value to use for a "0" bit when setting a pin (see PIN command)
PINVOLTS   - sets the voltage on a particular pin's DAC
DACUP      - updates all 89 DACs at once
READPIN    - reads a pin state and puts it into a register
CY         - set carry to the state of a pin
THRESHOLD  - sets the pin threshold voltage
PULLVOLTS  - sets the pullup supply voltage
TESTVOLTS  - sets the test supply voltage
TESTCURR   - sets the test supply current
VPROGVOLTS - sets Vprog voltage
SLEW       - sets the slew rate
CLOCKSPEED - sets the clock speed
BYPASS     - turns bypass caps on/off
SUPPON     - turns VPROG on
SUPPOFF    - turns VPROG off
INC        - increments register
DEC        - decrements register
WAIT       - waits a specified time
LCD        - print things to the LCD
READBUTS   - reads the buttons
SETFAN     - sets fan duty cycle
FANRPM     - reads fan RPM
DISPBRITE  - sets display backlight brightness

--

PINCOUNT
DEFNAME
DUMPSIZE

These three things form the declaration for a script, and must be present at
the top.  For example:

pincount = 16
defname 82S23_
dumpsize = 32

PINCOUNT defines how many pins this device has, and is used to calculate
values in all subsequent PIN/READPIN/PINVOLTS commands.

DEFNAME is the default filename that a dump will be saved to.  It will have
000.bin appended to it, then 001.bin, etc.  i.e. 82S23_000.bin

DUMPSIZE is the size of the final dump in bytes.

--

Reading and writing pins:

PIN:

There's several ways to use the PIN command:

The first is to directly set the pin's state.
PIN xx = pinstate

Pinstate is one of the following:

HIZ - high-Z
GND - ground
DAC - DAC selectable voltage, selected using PINVOLTS
TST - test voltage supply
HI  - logic high (4.5V or so)
PUP - pullup to pullup voltage
PDN - pulldown
LOW - logic low, 50 ohms to ground
CKP - positive clock pulses
CKN - negative clock pulses

Multiple states can be combined, but this can be dangerous and cause hardware
failures and overheating.  This shouldn't be used unless you know what you are
doing!

PIN 1 = GND

will set pin 1 to ground.

The other method of setting a pin is to use the "trinary" method, where
depending on the bit of a register, it will set the pin to the "true" or
"false" level.  i.e.

TRUE = HI
FALSE = LOW
PIN 1 = R0.0

This will set pin 1 to the state of register 0, bit 0.  If the bit is set, it
will be "HI".  If the bit is clear, it will be "LOW".

The TRUE and FALSE settings default to HIZ.  The PIN xx = Rx.x commands will
use the last value of TRUE and FALSE, thus multiple copies of these can be
used if needed.

PINVOLTS:

This sets a voltage on the DAC for a particular pin.  If pin 8 is VCC and
needs to be 5V, the following commands will do that:

PIN 8 = DAC
PINVOLTS 8 = 5.0V

Be sure to DACUP before turning the power on!

DACUP:

Used to update the DACs.  After a PINVOLTS command is used, or the pullup
voltage is changed, the DACUP command must be used to update the DAC settings.
This can be done once after all pins are set.  See one of the scripts for how
this is used.

READPIN:

This command will read a particular pin and put the result into a particular
register bit.  i.e.

READPIN 1 = R0.0

Will read pin 1, and if the voltage on pin 1 is higher than the threshold
voltage, will set bit 0 of register 0.  If the voltage on pin 1 is lower than
the threshold, it will clear bit 0.  The threshold voltage is set using the
THRESHOLD command.

CY:

Same as READPIN, except it puts the state of the pin into carry instead of a
register.

THRESHOLD:

Sets the threshold voltage for pin reading.
THRESHOLD = 2.5V

Will set it to 2.5V.

--

Setting the various voltages for the pin drivers:

PULLVOLTS:

This is the pullup voltage.  It connects to a pin through 2.7K of resistance
when PUP is selected.

PULLVOLTS = 5.0V

will set it to 5V

TESTCURR:
TESTVOLTS:

These two commands will set the current and voltage on the test supply,
accessible by setting a pin to TST.  The current MUST be set first, then the
voltage, because the two settings interact.  This will calculate them
properly.

TESTCURR = 50ma

will set it to 50mA.  The range is 0-255mA

TESTVOLTS = 5.0V

will set it to 5V

VPROGVOLTS:

This sets the output voltage of the programmable supply.  It should be set to
3V above the highest of the PINVOLTS, PULLVOLTS and TESTVOLTS voltages.  i.e.
if using 5V logic chips, then VPROGVOLTS should be set to 8.0V.

--

Other pin things:

SLEW:

Sets the slew (in V/uS) of the pin drivers.  By default, slew is the fastest
at 2.5V/uS.  It is settable from 0.5V/uS to 2.5V/uS.

CLOCKSPEED:

Sets the speed of the clock generator usable using CKP and CKN.  It can range
from around 40KHz to 3MHz or so.

BYPASS:

Turns the bypass capacitors on/off on any of the 48 primary pin drivers.

BYPASS 12 = ON

will turn bypassing for pin 12 on.

BYPASS 12 = OFF

will turn it off.

To properly bypass a chip, both power AND ground pins should be bypassed.
i.e. if 1 is ground and 8 is VCC, then both 1 and 8 should have bypassing
turned on.  Note that only chips up to 48 pins can have bypassing enabled.
Above this, some pins cannot be bypassed.

SUPPON:

Turns the main programmer supply on, and turns on the red LED.

SUPPOFF:

Turns the supply/LED off.

--

Other programmer things:

WDTRST:

Resets the watchdog timer.  Must happen at least every second.  If not, the
programmer will kill the supply and reset.

WDTFAIL:

Fails the watchdog, causing it to kill the supply and reset.  Used to finish
dumping/programming/testing.

READUART:

Reads the UART.  Not terribly useful and is used mainly by the dumping header.
WRITEUART:

Writes to the UART.  Similar to above.

SOCKET:

Returns socket ID in R0.  Generally not used but might be.

INC:
DEC:

Increments/decrements a register.  Zero is updated.  i.e.

INC R0

WAIT:

Waits a specified amount of time.  The maximum is a bit under 1 second.
Format is as follows:

WAIT 1S    - waits 1 second
WAIT 1mS   - waits 1 millisecond
WAIT 1uS   - waits 1 microsecond
WAIT 500nS - waits 500 nanoseconds.  Min wait is around 400nS or so.

--

Somewhat unimplemented things:

LCD:

Not implemented.

READBUTS:

Puts the 4 buttons into R7.  Not used.

SETFAN:

Sets the fan duty cycle.  Not used.

FANRPM:

Returns fan RPM in R0.  It is in RPM.

DISPBRITE:

Sets display brightness.  Not used.
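
--

Appendix sketch: packing an instruction word

To tie the instruction word layout from the CPU section to the opcode format, here is a minimal Python sketch of how one opcode might be packed into 54 bits. It is not the real assembler. The function name encode and the constant names are made up for illustration, and it assumes that bit 52 holds the top (5th) bit of the condition code, which the layout suggests but does not state outright.

```python
# ALU operation and condition code numbers taken from the tables in the text.
ADD = 0x4
JMP = 0x09
CALL = 0x19


def encode(alu, src1, src2, dst, cond=0x00, literal=0):
    """Pack one instruction into a 54 bit word (hypothetical helper).

    Field positions follow the bit diagram:
      bit 53     unused (kept 0)
      bit 52     condition code bit 4 (assumed to be the high bit)
      bits 51:48 destination
      bits 47:44 source 1
      bits 43:40 source 2
      bits 39:36 condition code bits 3:0
      bits 35:32 ALU operation
      bits 31:0  literal
    """
    fields = (("alu", alu, 0xF), ("src1", src1, 0xF), ("src2", src2, 0xF),
              ("dst", dst, 0xF), ("cond", cond, 0x1F),
              ("literal", literal, 0xFFFFFFFF))
    for name, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    word = ((cond >> 4) & 1) << 52
    word |= dst << 48
    word |= src1 << 44
    word |= src2 << 40
    word |= (cond & 0xF) << 36
    word |= alu << 32
    word |= literal
    return word


if __name__ == "__main__":
    # ADD R0,R1,R2,JMP,0x123  ->  R2 = R0 + R1, then jump to 0x123
    print(f"{encode(ADD, 0, 1, 2, JMP, 0x123):014X}")
```

Omitted fields default to 0 here, matching the "don't care" behavior described for the opcode format (condition 00h, literal 00000000h).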