Building an Intel 8080 Emulator from Scratch

June 18, 2026 · Yusuf Abdul-Mateen · ~12 min read

I wanted to understand computers at the lowest level — not just what a CPU does, but how it does it. The best way I know to learn something that deeply is to build one. Not a real CPU (I don't have a fab lab), but an emulator: a program that pretends to be a CPU and runs real machine code.

I chose the Intel 8080, the 8-bit microprocessor from 1974 that powered the original Space Invaders arcade cabinet. It's well documented, real ROMs are available, and the Emulator 101 tutorial walks through the entire process. I leaned on that tutorial heavily, but I didn't copy it code-for-code. I wanted to understand every decision and make my own choices about the design.

Full source on GitHub. The ROM is from cbeust/space-invade.rs — a single combined binary in the repo root.

How I Started

Emulator 101's tutorial is structured around implementing instructions one at a time: run the ROM, hit an unimplemented instruction, implement it, repeat. You only wire up what the game actually needs — about 50 out of 256 opcodes.

I took a different first step. Before running anything, I mapped all 256 opcodes into a table:

struct Opcode {
    int opcode;
    std::string instruction;
    int size;           // bytes: 1, 2, or 3
};

std::unordered_map<int, Opcode> opcodeMap;

opcodeMap.insert({
    {0x00, {0x00, "NOP", 1}},
    {0x01, {0x01, "LXI B,D16", 3}},
    {0xc3, {0xc3, "JMP adr", 3}},
    // ... all 256
});

Why do this first? Because every opcode has a known size (1, 2, or 3 bytes). If I know the size, I can disassemble any binary correctly from the start — even before I implement execution. The tutorial's disassembler is a separate function. Mine is just a loop that reads from the opcode map.

I cross-referenced each opcode against the Intel 8080 Programmer's Manual to get the mnemonics and sizes right. This took about an hour, but it meant I never had to guess how many bytes an instruction consumed.
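
A table like this is easy to check mechanically before trusting it. The following is a minimal sketch, not code from the repo: Entry and tableProblems are hypothetical names, and Entry keeps only the mnemonic and size fields of the Opcode struct.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical trimmed-down table entry: mnemonic plus size in bytes.
struct Entry {
    std::string mnemonic;
    int size;
};

// Returns every byte value 0x00..0xff that is missing from the table or
// whose size is outside 1..3. An empty result means the table is complete.
std::vector<int> tableProblems(const std::unordered_map<int, Entry> &table) {
    std::vector<int> bad;
    for (int op = 0; op <= 0xff; ++op) {
        auto it = table.find(op);
        if (it == table.end() || it->second.size < 1 || it->second.size > 3) {
            bad.push_back(op);
        }
    }
    return bad;
}
```

Running a check like this after filling in the table would flag a skipped opcode or a mistyped size before the disassembler ever sees it.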

The Disassembler Came First

Once the opcode table was built, writing a disassembler was trivial. Walk through the binary, print each instruction, advance by its size:

0x0000: NOP
0x0001: LXI SP, D16
0x0004: LXI H, D16
0x0007: MVI A, D8
0x0009: DCR A
0x000a: JNZ
...
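
The loop behind that output can be sketched in a few lines. This is an illustration under assumptions rather than the repo's code: OpInfo and disassemble are hypothetical names, operands are not printed, and a byte missing from the table is shown as "???" and treated as one byte long.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical table entry: mnemonic plus instruction size in bytes.
struct OpInfo {
    std::string mnemonic;
    int size;  // 1, 2, or 3
};

// Walks the ROM, emitting one line per instruction and advancing by the
// size recorded in the table.
std::vector<std::string> disassemble(
    const std::vector<std::uint8_t> &rom,
    const std::unordered_map<int, OpInfo> &table) {
    std::vector<std::string> lines;
    std::size_t pc = 0;
    while (pc < rom.size()) {
        auto it = table.find(rom[pc]);
        const std::string name = (it != table.end()) ? it->second.mnemonic : "???";
        const int size = (it != table.end()) ? it->second.size : 1;
        char addr[16];
        std::snprintf(addr, sizeof addr, "0x%04zx: ", pc);
        lines.push_back(addr + name);
        pc += static_cast<std::size_t>(size);
    }
    return lines;
}
```

Because the size comes from the table, a wrong size entry shows up immediately: every address after it drifts out of alignment with a known-good listing.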

I also wrote a hex dump utility (my_hexdump.cpp) to inspect the raw ROM bytes:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

Having both outputs let me verify that my opcode table was correct. I could pick any address in the hex dump, decode the instruction manually, and check it against my disassembler's output. This was my first debug layer, and it caught several typos early.
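
For reference, one row of that dump format can be produced as follows. This is a rough sketch that assumes the layout shown above (8-digit offset, sixteen bytes in two groups of eight); hexdumpRow is a hypothetical name and is not taken from my_hexdump.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Formats up to 16 bytes starting at `offset` as one hexdump-style row.
std::string hexdumpRow(const std::vector<std::uint8_t> &data, std::size_t offset) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%08zx", offset);
    std::string row = buf;
    for (std::size_t i = 0; i < 16 && offset + i < data.size(); ++i) {
        // Two spaces before each group of eight, one space otherwise.
        row += (i % 8 == 0) ? "  " : " ";
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(data[offset + i]));
        row += buf;
    }
    return row;
}
```

Calling it in a loop with offset stepping by 16 gives the full dump.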

Getting the ROM

The Space Invaders ROM is split into four chip images: invaders.e, invaders.f, invaders.g, and invaders.h. These are the original EPROM dumps from the arcade hardware. I found a combined binary in the space-invade.rs repo by Cédric Beust, which is a Rust-based Space Invaders emulator that also passes the cpudiag test suite.

Concatenated in the order h, g, f, e, the four images occupy 0x0000 through 0x1fff. The loader reads the combined file one byte at a time into memory starting at address 0x0000:

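std::size_t addr = 0;
// Test the read itself, so a failed read at end-of-file never stores a byte.
while (addr < file_memory.Mblock.size() &&
       file_ref.read(reinterpret_cast<char *>(&block), 1)) {
    file_memory.Mblock[addr++] = block;
}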

while (true) {
    executeNextInstruction(file_memory, file_cpu);
}

How I Structured the Emulator

The tutorial uses a single giant switch function (Emulate8080Op) that handles every opcode inline. I grouped instructions by category into separate handler functions. This was my main design departure.

The CPU state:

struct Cpu_frame {
    u_int8_t a, b, c, d, e, h, l;  // 7 registers
    u_int16_t sp, pc;               // stack pointer, program counter
    std::bitset<8> flags;           // Z, S, P, CY, AC
    u_int8_t int_enable;
};

struct Memory {
    std::array<u_int8_t, 0x10000> Mblock;  // 65,536 bytes; 0xffff would be one byte short
};

Memory is a flat 64KB array. Registers are separate variables (not an array), which made the code more readable. I liked being able to write cpu.a instead of state->registers[7].

The Dispatch Function

executeNextInstruction fetches the full instruction (opcode + operands), advances the PC, and dispatches by opcode:

void executeNextInstruction(Memory &memory, Cpu_frame &cpu) {
    auto inst = getNextInstructionArr(memory, cpu.pc);
    cpu.pc += inst.size();

    switch (inst[0]) {
    // Each case needs its own break, or control falls through into the next handler.
    case 0x00: break;                                                 // NOP
    case 0xc3: jumpInstruction(inst, cpu); break;                     // JMP
    case 0xc2: jumpInstruction(inst, cpu); break;                     // JNZ
    case 0xcd: callandretInstruction(inst, cpu, memory); break;       // CALL
    case 0xc9: callandretInstruction(inst, cpu, memory); break;       // RET
    case 0xfe: logicalInstruction(inst, cpu); break;                  // CPI
    case 0x06: moveInstruction(inst, cpu, memory); break;             // MVI
    case 0x24: incrementInstruction(inst, cpu); break;                // INR
    default: unimplemented_instruction(inst[0]); break;
    }
}

Each category gets its own switch on the opcode byte. For example, the jump handler handles unconditional jumps and all the conditional variants:

void jumpInstruction(const std::vector<int16_t> &instructions, Cpu_frame &cpu) {
    switch (instructions[0]) {
    case 0xc3: // JMP adr — unconditional
        cpu.pc = (instructions[2] << 8) | instructions[1];
        break;
    case 0xc2: // JNZ adr — jump if not zero
        if (!cpu.flags[Z])
            cpu.pc = (instructions[2] << 8) | instructions[1];
        break;
    }
}

This grouping made the code easier to reason about. All jump logic lives in one place. All stack operations in another. When I had a bug in a CALL instruction, I knew exactly which function to look at.
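
Grouping also makes it possible to fold the conditional variants into one test, because the 8080 encodes the condition in bits 5 to 3 of every conditional JMP, CALL, and RET opcode. The sketch below is illustrative only: conditionMet is a hypothetical helper, and the Flag values assume the flags are indexed by their bit positions in the 8080 flag byte, which the listings above do not confirm.

```cpp
#include <bitset>
#include <cstdint>

// Assumed numbering: bit positions from the 8080 flag byte.
enum Flag { CY = 0, P = 2, AC = 4, Z = 6, S = 7 };

// Bits 5..3 of a conditional opcode select the condition:
// 0 NZ, 1 Z, 2 NC, 3 C, 4 PO (parity odd), 5 PE (parity even),
// 6 P (plus, sign clear), 7 M (minus, sign set).
bool conditionMet(std::uint8_t opcode, const std::bitset<8> &flags) {
    switch ((opcode >> 3) & 0x07) {
    case 0: return !flags[Z];
    case 1: return flags[Z];
    case 2: return !flags[CY];
    case 3: return flags[CY];
    case 4: return !flags[P];
    case 5: return flags[P];
    case 6: return !flags[S];
    default: return flags[S];
    }
}
```

With a helper like this, JNZ (0xc2), JZ (0xca), JNC (0xd2), and the rest could share a single branch in the jump handler instead of one case each.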

Flags: The Tricky Part

The 8080 has five status flags: Zero, Sign, Parity, Carry, and Auxiliary Carry. Most instructions update some subset of these, but not all flags, and the rules vary per instruction.

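void updateFlags(u_int16_t answer, Cpu_frame &cpu) {
    cpu.flags[Z] = ((answer & 0xff) == 0);
    cpu.flags[S] = ((answer & 0x80) != 0);              // bit 7, the high bit
    cpu.flags[P] = !__builtin_parity(answer & 0xff);    // set on even parity
    cpu.flags[CY] = (answer > 0xff);
    cpu.flags[AC] = false;  // simplification: the real AC is the carry out of bit 3
}

One thing that tripped me up: the Sign flag mirrors bit 7 of the result (the high bit), not bit 3. Bit 3 matters only for Auxiliary Carry, which records a carry out of bit 3 and exists for BCD arithmetic (DAA). I had the mask wrong on my first pass and only caught it when the execution trace showed flags being set in unexpected ways. Parity has a similar trap: the P flag is set when the low eight bits contain an even number of 1s, while __builtin_parity returns 1 for an odd count, so the result has to be inverted.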
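
To make the flag rules concrete, here is a small self-contained sketch that computes all five flags for an 8-bit addition, including Auxiliary Carry. It is an illustration, not the emulator's updateFlags: the Flags struct and flagsAfterAdd are hypothetical names, and parity is counted by hand to avoid compiler builtins.

```cpp
#include <cstdint>

// Hypothetical plain-bool flag set for illustration.
struct Flags {
    bool z, s, p, cy, ac;
};

// Flags after the 8-bit addition a + b under the 8080 rules:
// S mirrors bit 7, P is set on an even number of 1 bits,
// CY is the carry out of bit 7, AC is the carry out of bit 3.
Flags flagsAfterAdd(std::uint8_t a, std::uint8_t b) {
    const unsigned sum = static_cast<unsigned>(a) + b;
    const std::uint8_t result = static_cast<std::uint8_t>(sum & 0xff);
    int ones = 0;
    for (int bit = 0; bit < 8; ++bit) {
        ones += (result >> bit) & 1;
    }
    Flags f;
    f.z = (result == 0);
    f.s = (result & 0x80) != 0;
    f.p = (ones % 2 == 0);
    f.cy = sum > 0xff;
    f.ac = ((a & 0x0f) + (b & 0x0f)) > 0x0f;
    return f;
}
```

For example, 0x0f + 0x01 gives 0x10: Zero, Sign, and Carry are clear, Parity is clear because the result has a single 1 bit, and Auxiliary Carry is set because the low nibble overflowed.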

Another subtlety: INR and DCR affect all flags except Carry. So you save and restore the carry flag around the update:

u_int16_t temp = cpu.h + 1;
cpu.h = temp & 0xff;
bool carryflag = cpu.flags[CY];
updateFlags(temp, cpu);
cpu.flags[CY] = carryflag;

This pattern repeats for every increment/decrement operation. It's easy to miss if you're copying the Intel manual's pseudocode, which doesn't explicitly call out this behaviour.
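
Since the pattern repeats, it can live in one helper that takes the register by reference. The following is a sketch under assumptions, not the repo's incrementInstruction: MiniFlags, evenParity, inr, and dcr are hypothetical names, and the Auxiliary Carry rule for DCR follows the common view of 8080 subtraction as addition of the complement.

```cpp
#include <cstdint>

// Hypothetical plain-bool flag set for illustration.
struct MiniFlags {
    bool z, s, p, cy, ac;
};

// True when `v` contains an even number of 1 bits.
bool evenParity(std::uint8_t v) {
    int ones = 0;
    for (int bit = 0; bit < 8; ++bit) {
        ones += (v >> bit) & 1;
    }
    return ones % 2 == 0;
}

// INR: adds 1 to `reg`, updates Z, S, P and AC, never touches CY.
void inr(std::uint8_t &reg, MiniFlags &f) {
    const std::uint8_t result = static_cast<std::uint8_t>(reg + 1);
    f.z = (result == 0);
    f.s = (result & 0x80) != 0;
    f.p = evenParity(result);
    f.ac = (result & 0x0f) == 0x00;  // low nibble wrapped from 0xf to 0x0
    reg = result;
}

// DCR: subtracts 1 from `reg`, updates Z, S, P and AC, never touches CY.
void dcr(std::uint8_t &reg, MiniFlags &f) {
    const std::uint8_t result = static_cast<std::uint8_t>(reg - 1);
    f.z = (result == 0);
    f.s = (result & 0x80) != 0;
    f.p = evenParity(result);
    f.ac = (result & 0x0f) != 0x0f;  // clear only when the low nibble borrowed
    reg = result;
}
```

Because neither function writes cy, there is nothing to save and restore, and INR B, INR C, and the others all reduce to a call such as inr(cpu.b, flags).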

How CALL and RET Work

These were the hardest instructions to get right, because a bug in CALL corrupts the return stack, and then RET goes to the wrong address, and suddenly you're executing the ROM's high-score table as code.

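CALL pushes the return address onto the stack and sets PC to the target. The return address is the CALL's own address plus 3, since CALL is a 3-byte instruction. Because the dispatcher has already advanced cpu.pc past the instruction, cpu.pc already holds that value: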

case 0xcd: // CALL adr
    memory.Mblock[cpu.sp - 1] = (cpu.pc >> 8) & 0xff;  // high byte
    memory.Mblock[cpu.sp - 2] = cpu.pc & 0xff;          // low byte
    cpu.sp -= 2;
    cpu.pc = (instructions[2] << 8) | instructions[1];
    break;

RET pops two bytes off the stack and writes them to PC:

case 0xc9: // RET
    cpu.pc = (memory.Mblock[cpu.sp + 1] << 8) |
               memory.Mblock[cpu.sp];
    cpu.sp += 2;  // popping moves SP back up; the stack grows downward
    break;

The 8080 stack grows downward. SP starts high (usually 0x2400 in Space Invaders) and decrements on PUSH/CALL. Get the direction wrong and the stack climbs out of work RAM into video memory, which begins at 0x2400, instead of staying below it.
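
One way to keep the direction straight is to route every stack access through a pair of helpers and test them as a round trip. This is a minimal sketch with hypothetical names (Machine, push16, pop16), not the code in the repo; the casts make SP arithmetic wrap inside the 16-bit address space.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal machine state: memory plus a stack pointer.
struct Machine {
    std::array<std::uint8_t, 0x10000> mem{};
    std::uint16_t sp = 0x2400;
};

// Pushes a 16-bit value: high byte at SP-1, low byte at SP-2, then SP -= 2.
void push16(Machine &m, std::uint16_t value) {
    m.mem[static_cast<std::uint16_t>(m.sp - 1)] = static_cast<std::uint8_t>(value >> 8);
    m.mem[static_cast<std::uint16_t>(m.sp - 2)] = static_cast<std::uint8_t>(value & 0xff);
    m.sp = static_cast<std::uint16_t>(m.sp - 2);
}

// Pops a 16-bit value: low byte at SP, high byte at SP+1, then SP += 2.
std::uint16_t pop16(Machine &m) {
    const std::uint16_t lo = m.mem[m.sp];
    const std::uint16_t hi = m.mem[static_cast<std::uint16_t>(m.sp + 1)];
    m.sp = static_cast<std::uint16_t>(m.sp + 2);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

CALL then becomes push16 followed by a jump, RET becomes pop16 into PC, and a push followed by a pop must leave SP exactly where it started.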

What I Learned Going Through This

Build the disassembler first

This is the single best decision I made. Being able to see what the ROM is doing — in human-readable mnemonics — made every subsequent debugging session faster. The tutorial does this too, but I'd recommend spending extra time here. Verify that your disassembler matches known-good output for the first 100 instructions before writing a single line of execution code.

Map all opcodes, even if you don't implement them

The opcode table serves as documentation, a disassembler data source, and an execution reference. I found opcodes I'd never heard of (RST, XTHL, DAA) just by filling in the table. Knowing they exist meant I wasn't surprised when the ROM hit one.

Group instructions by category

The tutorial's flat switch works, but grouping by category helped me catch inconsistencies. For example, the register arithmetic instructions (ADD, ADC, SUB, SBB) all follow the same pattern, and each comes in eight variants that differ only in the source operand. Implementing them as a group meant I got eight opcodes for the mental effort of one.
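
The shared pattern is visible in the opcode bits: for 0x80 through 0x9f, the low three bits pick the source and bits 4 to 3 pick the operation. The sketch below shows that decoding under assumptions. Regs, AluResult, sourceOperand, and accumulatorOp are hypothetical names, and only the result and carry are computed, not the full flag set.

```cpp
#include <cstdint>

// Hypothetical register file and result type for illustration.
struct Regs {
    std::uint8_t a, b, c, d, e, h, l;
};

struct AluResult {
    std::uint8_t value;
    bool carry;  // carry out for ADD/ADC, borrow for SUB/SBB
};

// Low three bits select the source: 0 B, 1 C, 2 D, 3 E, 4 H, 5 L,
// 6 M (the byte at address HL, passed in by the caller), 7 A.
std::uint8_t sourceOperand(std::uint8_t opcode, const Regs &r, std::uint8_t memAtHL) {
    switch (opcode & 0x07) {
    case 0: return r.b;
    case 1: return r.c;
    case 2: return r.d;
    case 3: return r.e;
    case 4: return r.h;
    case 5: return r.l;
    case 6: return memAtHL;
    default: return r.a;
    }
}

// Bits 4..3 select the operation: 0 ADD, 1 ADC, 2 SUB, 3 SBB.
AluResult accumulatorOp(std::uint8_t opcode, const Regs &r,
                        std::uint8_t memAtHL, bool carryIn) {
    const int src = sourceOperand(opcode, r, memAtHL);
    const int kind = (opcode >> 3) & 0x03;
    const int extra = ((kind & 1) && carryIn) ? 1 : 0;  // ADC and SBB only
    const int raw = (kind < 2) ? r.a + src + extra : r.a - src - extra;
    return {static_cast<std::uint8_t>(raw & 0xff), raw < 0 || raw > 0xff};
}
```

One function covers all 32 opcodes in that range, which is the payoff of treating them as a group.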

Don't implement what you can't test

The tutorial gives this advice and it's spot on. I mapped all 256 opcodes but only wired up the ~25 the Space Invaders ROM actually hits in the first 50,000 instructions. The untested opcodes are in the table for when the game needs them.

Learn to read assembly listings

At some point, your emulator will do something wrong and you'll need to trace through the code manually. The Space Invaders assembly listing on Computer Archeology is an incredible resource. I used it to verify my understanding of the game's initialisation sequence.

Current State and What's Next

The emulator successfully loads the Space Invaders ROM, runs through the initialisation sequence, and gets to the infinite loop that waits for the first interrupt. About 25 opcodes are implemented across six categories:

Data transfer: MVI, MOV, LXI, LDAX, STA, LDA
Arithmetic: INR, DCR, INX, DCX, DAD
Logical: CPI, ANA, ORA, XRA, CMP
Branching: JMP, JNZ, JZ, CALL, RET
Stack: PUSH, POP
Other: NOP, HLT, XCHG

What I'm working on next:

Cycle counting — each 8080 instruction takes a known number of T-states. Adding cycle counts is the first step toward getting the game to run at the right speed.
I/O ports — Space Invaders uses IN/OUT instructions for input (coin switch, buttons) and output (shift register for bitmapped graphics).
Interrupts — the game hardware generates RST instructions at specific scanline positions to trigger the display update.
Full opcode coverage — the remaining ~230 opcodes exist in the table but aren't wired into the execution switch yet.
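
As a rough sketch of where the cycle counting and interrupt items are heading (none of this is implemented yet, and every name here is hypothetical): RST n behaves like a one-byte CALL to address 8 * n, and with a clock of roughly 2 MHz and a 60 Hz display there are about 33,333 cycles per frame to spend between interrupts.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal CPU core for illustration.
struct Core {
    std::array<std::uint8_t, 0x10000> mem{};
    std::uint16_t pc = 0;
    std::uint16_t sp = 0x2400;
    bool interruptsEnabled = false;
};

constexpr long kClockHz = 2000000;               // assumed round figure
constexpr long kCyclesPerFrame = kClockHz / 60;  // cycles in one 60 Hz frame
constexpr long kCyclesPerHalfFrame = kCyclesPerFrame / 2;

// Delivers RST n if interrupts are enabled: pushes PC, jumps to 8 * n,
// and disables further interrupts until the program re-enables them.
// Returns false when the interrupt was ignored.
bool raiseInterrupt(Core &c, int n) {
    if (!c.interruptsEnabled) {
        return false;
    }
    c.mem[static_cast<std::uint16_t>(c.sp - 1)] = static_cast<std::uint8_t>(c.pc >> 8);
    c.mem[static_cast<std::uint16_t>(c.sp - 2)] = static_cast<std::uint8_t>(c.pc & 0xff);
    c.sp = static_cast<std::uint16_t>(c.sp - 2);
    c.pc = static_cast<std::uint16_t>(8 * n);
    c.interruptsEnabled = false;
    return true;
}
```

A main loop built on this would run instructions until the half-frame budget is used up, raise RST 1, run the second half, raise RST 2, and then draw the frame.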

Try It Yourself

The emulator is a single C++ file with zero dependencies. Download the ROM, build it, and run it:

git clone https://github.com/ookaay/Intel8080emu.git
cd Intel8080emu
g++ m_to_assembly.cpp -o invaders
./invaders "invaders folder/invaders"

Running it with no arguments prints the entire opcode table (proof you've mapped all 256):

$ ./invaders
0x00: NOP (1 byte)
0x01: LXI B,D16 (3 bytes)
0x02: STAX B (1 byte)
...

Point it at a ROM and you get the hex dump, the disassembly, and then execution starts:

$ ./invaders "invaders folder/invaders"

# Hex dump:
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

# Disassembly:
0x0000: NOP
0x0001: LXI SP, D16
0x0004: LXI H, D16

# Execution:
PC=0x0000  A=00 B=00 C=00 D=00 E=00 H=00 L=00 SP=0000
Executing: NOP
...

If you're building your own emulator, I can't recommend the Emulator 101 tutorial enough. It walks through the entire machine — not just the CPU, but the display, interrupts, and controls. My implementation took a different path on some decisions, but that's the point: build it in a way that makes sense to you, and you'll learn more than if you copied anything verbatim.
