NANDputer lives!

Yep, it’s been a long time since I’ve posted anything on here, so I hope to remedy that by posting some updates on my various projects.

First on the list is the NANDputer.  What is a NANDputer?  It’s a computer made out of nothing but NAND gates, of course!  I dunno why, but I thought it’d be fun to make this.  I first had to work out how the various parts of a CPU would be made out of NANDs, did a bunch of tests and went to town.

The design took about 2 months to come up with and make.  At the bottom of the post are a few statistics on gate usage and the count of each type (2 input, 3 input, 4 input, etc).  As I suspected, the quantity vs. gate input count follows a pretty steep curve, with most gates being 2 input ones, and the fewest being 13 input gates.

Everything on the design is made out of NAND gates, even the 7 segment decoding.   The last PCB though has a few non-NAND gate chips like an NES PPU and a serial chip and stuff, but it’s just a peripheral board and is not part of the NANDputer proper.  (Eventually I want to make a NAND UART and replace that peripheral board).

The basic architecture of the computer is actually fairly conventional.  There’s an accumulator, instruction skipping (like on PIC) for decision making, a full ALU (AND, add, OR, XOR, subtract, add with carry, subtract with borrow, set all bits, clear all bits, shifting), 8 bit registers, separate RAM/ROM areas (Harvard architecture), and bit setting/clearing.  There’s a 3 level stack, and even an interrupt!
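For anyone who hasn't met PIC-style instruction skipping, here is a minimal Python sketch of the general idea: a conditional instruction doesn't jump anywhere, it just hops over the next instruction. The mnemonics (LDA, DEC, SKZ, JMP, HLT) and the tuple encoding are made up for illustration; they are not the NANDputer's real instruction set, which isn't listed in this post.

```python
def run(program, max_steps=1000):
    """Interpret a toy accumulator machine that branches by skipping.

    Every opcode here is hypothetical, chosen only to show the idea.
    Returns (accumulator, number of instructions executed).
    """
    acc, pc, steps = 0, 0, 0
    while steps < max_steps:
        op, *args = program[pc]
        pc += 1
        steps += 1
        if op == "LDA":      # load an immediate into the accumulator
            acc = args[0] & 0xFF
        elif op == "DEC":    # decrement, wrapping at 8 bits
            acc = (acc - 1) & 0xFF
        elif op == "SKZ":    # skip the next instruction if acc is zero
            if acc == 0:
                pc += 1
        elif op == "JMP":    # unconditional jump
            pc = args[0]
        elif op == "HLT":    # stop
            break
    return acc, steps


countdown = [
    ("LDA", 3),   # 0: start at 3
    ("DEC",),     # 1: count down
    ("SKZ",),     # 2: once acc hits 0, hop over the JMP
    ("JMP", 1),   # 3: otherwise loop back to the DEC
    ("HLT",),     # 4
]
result = run(countdown)
```

The loop is built from an unconditional jump plus a skip, so the hardware never needs a conditional-branch instruction with its own target address.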

While the CPU architecture is fairly conventional, the way it is implemented isn’t.  I went with a bit-serial setup here to save gates.  The ALU, for example, is only 1 bit wide with a “latching” carry, so operations are performed a bit at a time on the 8 bit registers/memory.  The program counter is also bit-serial, and in the first YouTube video you can see the carry propagating while it increments.
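Here's a rough Python sketch of what bit-serial addition with a latched carry looks like. The full adder is the textbook nine-NAND arrangement, which is an assumption on my part for illustration and not necessarily how the real ALU board is wired; the function names are made up too.

```python
def nand(*inputs):
    """N-input NAND on 0/1 values."""
    return 0 if all(inputs) else 1


def full_adder(a, b, carry_in):
    """1-bit full adder using nine 2-input NANDs; returns (sum, carry_out)."""
    n1 = nand(a, b)
    half = nand(nand(a, n1), nand(b, n1))            # a XOR b
    n4 = nand(half, carry_in)
    total = nand(nand(half, n4), nand(carry_in, n4))  # half XOR carry_in
    carry_out = nand(n1, n4)                          # (a AND b) OR (half AND carry_in)
    return total, carry_out


def serial_add(x, y, width=8, carry=0):
    """Add two registers one bit per step, LSB first, through a single adder.

    `carry` plays the role of the latched carry: it is updated after every
    bit and fed back in on the next one.
    """
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


example = serial_add(0x5A, 0x3C)   # 90 + 60
```

One adder gets reused eight times instead of building eight adders, which is where the gate savings come from.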

The downside of course is that this is much slower than a parallel architecture, but this way takes vastly fewer gates.  It takes 96 clock cycles to run a single instruction:  there are 16 “T” states and 3 non-overlapping clocks generated using a 6 stage Johnson counter with some NAND decoding.  (The flipflops that form the Johnson counter are made from NANDs too.)  Thus, it’s 16*6 or 96 cycles per instruction.  The clock runs at 10MHz, so this is a bit over 100 KIPS (thousands of instructions per second).  This sounds really slow but it isn’t TOO slow.  It’s faster than a TMS1000, and it’s only 2-3x slower than a Commodore 64, which I estimate at 250-300 KIPS when it runs at 1MHz (3 and 4 cycle instructions being some of the more common ones).
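The speed arithmetic is easy to check in a few lines of Python. The constants come straight from the figures in this post (16 T states, 96 cycles per instruction, so 6 clocks per T state); the second clock value is the roughly 11MHz mentioned further down for the 3.6864MHz crystal running at 3x.

```python
T_STATES = 16
CLOCKS_PER_T_STATE = 6          # 96 cycles / 16 T states
CYCLES_PER_INSTRUCTION = T_STATES * CLOCKS_PER_T_STATE


def kips(clock_hz):
    """Thousands of instructions per second at a given clock frequency."""
    return clock_hz / CYCLES_PER_INSTRUCTION / 1000


nominal = kips(10_000_000)        # about 104 KIPS at a nominal 10MHz
overtone = kips(3 * 3_686_400)    # about 115 KIPS at 11.0592MHz
```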

I eventually want to load a text adventure game on it, then hook it up to the internet and let people telnet into it and play it!  So far, I have gotten a few test programs to run on it using my 8 word “bogoROM”:

8 word test ROM

This is made of 32 16-position rotary DIP switches, which form 8 words of ROM (program ROM is 16 bits wide).  The LED next to each row lights up when that row is being accessed.  This plugs into the ROM port.  It’s just 32 switches, 128 diodes, two 74HC245’s, a 74138, and a 74123 monostable multivibrator chip to add wait states (this is mainly for testing- I want to use some more exotic ROM some time).
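The switch arithmetic works out like this: each 16-position switch holds one hex digit (4 bits), so 4 switches make a 16 bit word and 32 switches make 8 words, or 128 bits total, which lines up with the 128 diodes (presumably one per bit). A small Python sketch of the packing, where the most-significant-digit-first ordering and the function name are assumptions for illustration:

```python
def words_from_switches(digits, digits_per_word=4):
    """Pack hex switch settings (0-15 each) into 16-bit ROM words.

    Assumes the leftmost switch of each row is the most significant digit;
    the real board's ordering isn't stated in the post.
    """
    if len(digits) % digits_per_word:
        raise ValueError("switch count must be a multiple of digits_per_word")
    words = []
    for i in range(0, len(digits), digits_per_word):
        word = 0
        for digit in digits[i:i + digits_per_word]:
            if not 0 <= digit <= 15:
                raise ValueError("each switch has only 16 positions")
            word = (word << 4) | digit
        words.append(word)
    return words


switches = [0x1, 0x2, 0x3, 0x4] * 8    # 32 arbitrary example settings
rom = words_from_switches(switches)    # 8 words, each 0x1234
total_bits = len(switches) * 4         # 128
```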

Quick overview of the various PCBs:

Timing board

First stop is the timing board.  It generates the 16 T state phases and has the Johnson counter to produce the three non-overlapping clock phases, denoted phi0 through phi2.  To latch data into a register, one of these clock phases is NANDed with one of the T states.  The crystal oscillator is on the timing board along with the single stepping and animate oscillator.  Interestingly, the crystal I selected was a 3.6864MHz one, but the NAND oscillator is slllightly overdriving it and it’s actually running at 3x this!  About 11MHz as shown on the frequency counter.  I will eventually change it out to see how fast it’ll go.  To quote photonicinduction, I will “Crank ‘er up till she pops”, i.e. until it quits functioning properly.  I might be able to get it up to 20MHz before the CPU malfunctions.
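To picture the latch strobes, here's a simplified Python model of NANDing one T state with one clock phase. It assumes the T state and phase lines are active high (so the resulting strobe is active low) and treats each phase as one time slot, ignoring the gaps between the non-overlapping clocks. The choice of T5 and phi1 is arbitrary, and the function names are hypothetical.

```python
def nand(*inputs):
    """N-input NAND on 0/1 values."""
    return 0 if all(inputs) else 1


def strobe_trace(t_select, phi_select, t_states=16, phases=3):
    """Sample an active-low latch strobe once per (T state, phase) slot
    over a single instruction."""
    trace = []
    for t in range(t_states):
        for phi in range(phases):
            t_line = 1 if t == t_select else 0        # selected T state is high
            phi_line = 1 if phi == phi_select else 0  # selected phase is high
            trace.append(nand(t_line, phi_line))
    return trace


trace = strobe_trace(t_select=5, phi_select=1)
low_slots = [i for i, level in enumerate(trace) if level == 0]
```

With this model the strobe dips low in exactly one slot per instruction, which is the whole point: every register gets its own unique moment to grab data.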

Program counter high
Program counter low

Next up is the program counter.  Each board handles 8 bits of it.  There are the basic program counter latches, the 1 bit half adder to increment it, and the 3 level stack.  The stack takes up most of the two boards.  There’s not much more to it.
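A half adder is enough here because incrementing is just adding a carry into bit 0. This Python sketch shows the register contents after each bit time; updating the register in place is an assumption made to mimic the way the address visibly "settles", and the names are made up for illustration.

```python
def half_adder(bit, carry_in):
    """Return (sum, carry_out) for one bit plus an incoming carry."""
    return bit ^ carry_in, bit & carry_in


def serial_increment(value, width):
    """Yield the register contents after each bit time of an increment."""
    carry = 1                      # "add 1" is a carry fed into bit 0
    for i in range(width):
        bit = (value >> i) & 1
        total, carry = half_adder(bit, carry)
        value = (value & ~(1 << i)) | (total << i)
        yield value


# 0b0111 passes through 0b0110, 0b0100 and 0b0000 before landing on 0b1000
steps = list(serial_increment(0b0111, width=4))
```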


ROM and misc. logic

This board contains the ROM, and a header for a cable (not in this picture).  The added header runs to the bogoROM board.  A bunch of the random logic is on here- interrupt handling and JSR instruction (jump to subroutine, aka “call”) stuff.  The EPROM is a 64K*16 bit model.  The NANDputer supports 64K words of program ROM, in 16 4K banks.  The program counter only increments the lower 12 bits, while the upper 4 are latched.  This is mainly due to running out of T states to increment all the bits.  If I extended the T state count, I could’ve incremented all 16.
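In other words, a ROM address is a 4 bit bank number glued onto a 12 bit offset, and only the offset counts. A quick Python sketch of that; the wrap back to the start of the same bank at the end of a bank is what I'd expect from the description (the carry out of bit 11 just gets dropped), but treat it as an assumption, and the function names are hypothetical.

```python
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1    # 0xFFF, i.e. 4K words per bank


def rom_address(bank, offset):
    """Combine the latched 4-bit bank with the 12-bit counting offset."""
    return ((bank & 0xF) << OFFSET_BITS) | (offset & OFFSET_MASK)


def next_address(address):
    """Increment only the low 12 bits; the bank bits stay latched."""
    bank = address >> OFFSET_BITS
    offset = (address + 1) & OFFSET_MASK
    return rom_address(bank, offset)


wrapped = next_address(0x2FFF)    # stays in bank 2: 0x2000, not 0x3000
```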

Indexer's bad hair day
Indexer done

Next is the indexer.  Its job is to perform relative addressing, for reading or writing arrays in memory using the index register (X).  In the first picture it is complete, but the wires have not been “dressed” yet to make it look nice and tidy.  It’s mainly some multiplexing and stuff.

RAM board top

The RAM board is next, and gets most of its inputs from the indexer.  I have an 8K*8 bit SRAM on here.  The empty spot on the board is for a RAM header to use external RAM devices.  I hope to use core memory or a delay line memory for RAM, eventually.


The ALU is after the RAM board.  Its job is fairly obvious.  It can add, subtract, rotate left, rotate right, increment, decrement, AND, OR, XOR, set all bits, clear all bits and set/clear individual bits.  I have not dressed the wires since I was still working on it.  I think I have it fully debugged.  This was the hardest part to debug and design due to the convoluted logic I employed.


IO Board (missing audio circuits)

The IO board isn’t very NAND-ey, but these are peripherals.  I don’t think making an audio or video chip would be terribly easy to do out of NAND gates.  I will probably eventually replace this with a board with a NAND made UART, however.  On this board are two 82C55 triple 8 bit parallel ports, 82C51 UART, 82C54 triple timer, 29F002 2Mbit 8 bit flash ROM (for storing data), RP2C02 NES PPU with 32K of SRAM, SP0256-AL2 speech chip, SN76489 sound chip, and a YM2413 FM chip.  There’s an AY-3-8912 sound chip, too.

All Nandputer boards installed into the backplane

A backplane hooks it all together.  All of the boards plug into it, and the display board plugs into this, too.

Front panel of the NANDputer with all the LEDs and controls

The display board plugs into the front of the backplane, and shows what’s going on.  The LED descriptions:

Top row is the program counter address and the 16 bit instruction word at this address.

The next three rows of LEDs (16 per row) are the 3 levels of the stack.  Under this is the halt LED (left) and the 16 T states.

The next row is 13 LEDs:  the first 12 are the RAM address (12 bits) and the last one is unused.

The bottom row is the accumulator (left 8 bits) and status bits (carry, sign, zero, interrupt and an extra).

The switches along the very bottom, left to right, are:  reset, instruction / T-state, run/stop, free-run/animate, and single step.  The pot adjusts the speed at which it animates (automatic single step).

An early video of it running the program counter (note how the address “settles” down as the carry propagates up the bits making up the program counter.)

The other video is running a small 8 step program that causes the accumulator to shift a bit back and forth in “Knight Rider” fashion.  The bogoROM is used to store the program.

Here’s the down and dirty on the gate and chip counts:

Gate and Chip Counts

Resource usage by chip type (columns are 74xx part numbers:  7400, 7410, 7420, 7430, 74133, 7403):

board:   00   10   20   30  133   03  total
display  10    1   10    3    0    8     32
timing   21   16    9    0    0    0     46
PCL      39   18    0    0    2    0     59
PCH      39   16    4    0    0    0     59
ROM      22    5    7    5    1    0     40
indexer  43    4   10    3    0    0     60
RAM      42    7    2    4    1    0     56
total   216   67   42   15    4    8    352

Resource usage by gate type:

board:    2-in  3-in  4-in  8-in 13-in  2-OC unused  total
display     40     3    20     3     0    32     -2     96
timing      84    48    18     0     0     0      0    150
PCL        156    54     0     0     2     0     -1    211
PCH        156    48     8     0     0     0     -1    211
ROM         88    15    14     5     1     0      0    123
indexer    172    12    20     3     0     0      0    207
RAM        168    21     4     4     1     0      0    198
total      864   201    84    15     4    32     -4   1196
%        72.00 16.75  7.00  1.25  0.33  2.67

unused gates

gate:   2-in  3-in  4-in  total
display   1     0     1     2
timing    0     0     0     0
PCL       1     0     0     1
PCH       0     1     0     1
ROM       0     0     0     0
indexer   0     0     0     0
RAM       0     0     0     0
total     2     1     1     4
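If you want to double check the numbers, the gate counts follow from the chip counts. This Python sketch assumes the chip columns are the usual 7400-series NAND parts (7400 quad 2-input, 7410 triple 3-input, 7420 dual 4-input, 7430 single 8-input, 74133 single 13-input, 7403 quad 2-input open collector); the chip and unused-gate figures are copied from the tables above.

```python
# Gates per package, assuming standard 7400-series NAND parts.
GATES_PER_CHIP = {"00": 4, "10": 3, "20": 2, "30": 1, "133": 1, "03": 4}

# Chip counts per board, in the same column order as GATES_PER_CHIP.
CHIPS = {
    "display": [10, 1, 10, 3, 0, 8],
    "timing":  [21, 16, 9, 0, 0, 0],
    "PCL":     [39, 18, 0, 0, 2, 0],
    "PCH":     [39, 16, 4, 0, 0, 0],
    "ROM":     [22, 5, 7, 5, 1, 0],
    "indexer": [43, 4, 10, 3, 0, 0],
    "RAM":     [42, 7, 2, 4, 1, 0],
}

# Unused gates per board, from the last table.
UNUSED = {"display": 2, "PCL": 1, "PCH": 1}


def gates_used(board):
    """Gates available on a board's chips minus the ones left unused."""
    available = sum(n * g for n, g in zip(CHIPS[board], GATES_PER_CHIP.values()))
    return available - UNUSED.get(board, 0)


totals = {board: gates_used(board) for board in CHIPS}
grand_total = sum(totals.values())       # 1196 gates in use
chip_total = sum(sum(row) for row in CHIPS.values())   # 352 chips
```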