“Hello, World!” Part Two

22 Jan 2016

In the first post, we took a look at the C code that defines what our game does (which, for now, is just writing “Hello, World!” on the screen). This time, we’re going to explore the toolchain that turns our code into a runnable ROM and the startup code that initializes the NES into a known state.

Building the ROM

Taking a look at our Makefile, we see that we’re running a few different commands to build our ROM:

cc65 -Oi hello_world.c --add-source
ca65 hello_world.s
ca65 reset.s
ld65 -C hello_world.cfg -o hello_world.nes reset.o hello_world.o nes.lib

The first command invokes cc65, a C compiler, to compile our C code into 6502 assembly. The flag -Oi tells the compiler to optimize the generated code and to inline more aggressively, as we want to avoid making slow function calls. The flag --add-source includes our original C source as comments in the resulting assembly code (hello_world.s) so we can poke around if we’re curious.

Next, we invoke ca65, a 6502 assembler, to assemble the output from cc65 and our NES startup code into object files (6502 CPU machine code, though not yet executable).

Finally, we invoke ld65, a linker, which combines our object files with cc65’s implementation of the C standard library for the NES (notice the nes.lib in the command) into an executable NES ROM. The -o hello_world.nes argument tells the linker that we’d like our executable to be named “hello_world.nes”, while -C hello_world.cfg tells the linker to use a configuration file that defines how memory is laid out, amongst other things. We now have an executable NES ROM!

The Linker Configuration

We’ve created a configuration file for ld65 to do three things:

Define memory areas that represent ranges of addresses.
Define segments and lay out in which areas they belong.
Define symbols needed by our startup code.

Memory Areas

Defining the memory areas is as simple as defining a start address and a size and assigning it to a name:

ZP: start = $00, size = $100;
OAM: start = $0200, size = $100;
RAM: start = $0300, size = $300;
STACK: start = __STACK_START__, size = __STACK_SIZE__;

The first four areas we create (ZP, OAM, RAM, and STACK) define sections of the NES CPU RAM, while the rest define sections of the cartridge ROM chips. We’re defining ZP as an explicit area that represents the zero page we talked about in the last post. It starts at memory address 0x0000 and is 256 (0x100) bytes in size. We’re going to skip over the next two pages for now: page one is used for the 6502 CPU stack, so we leave it out of the configuration, and page two is reserved as the OAM area, which we will make specific use of in later tutorials. For the remaining five pages (0x500 or 1280 bytes), we will use three as general-purpose RAM and set aside two for the C stack. We define __STACK_START__ and __STACK_SIZE__ as symbols further down so that we can reference them in our startup code.
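To double-check the page arithmetic above, here is a small C sketch that runs on a development machine, not on the NES. The struct and function names are made up for illustration, and the stack values are the ones given for __STACK_START__ and __STACK_SIZE__ in the Symbols section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One named range of CPU addresses, like an entry in the linker config.
 * (Hypothetical type, used only for this check.) */
struct area {
    const char *name;
    uint16_t start;
    uint16_t size;
};

/* The four RAM areas, with the stack symbols replaced by their values. */
static const struct area ram_areas[] = {
    { "ZP",    0x0000, 0x100 },
    { "OAM",   0x0200, 0x100 },
    { "RAM",   0x0300, 0x300 },
    { "STACK", 0x0600, 0x200 },
};

#define RAM_AREA_COUNT (sizeof ram_areas / sizeof ram_areas[0])
#define NES_RAM_SIZE   0x0800u  /* 2 KiB of internal CPU RAM, i.e. 8 pages */

/* Returns 1 if the areas are in ascending order, do not overlap,
 * and all fit inside the internal RAM; returns 0 otherwise. */
static int ram_layout_is_valid(const struct area *areas, size_t count)
{
    uint32_t previous_end = 0;
    size_t i;

    assert(areas != NULL);
    for (i = 0; i < count; ++i) {
        uint32_t end = (uint32_t)areas[i].start + areas[i].size;
        if (areas[i].start < previous_end || end > NES_RAM_SIZE) {
            return 0;
        }
        previous_end = end;
    }
    return 1;
}

/* Bytes of internal RAM that no area claims. */
static uint32_t unclaimed_ram_bytes(const struct area *areas, size_t count)
{
    uint32_t claimed = 0;
    size_t i;

    assert(areas != NULL);
    for (i = 0; i < count; ++i) {
        claimed += areas[i].size;
    }
    return NES_RAM_SIZE - claimed;
}
```

Calling ram_layout_is_valid(ram_areas, RAM_AREA_COUNT) should report a valid layout, and unclaimed_ram_bytes should come out to 0x100: the single page we left alone for the 6502 CPU stack.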

HEADER: start = $0, size = $10, file = %O, fill = yes;

We’re going to set aside sixteen (0x10) bytes at the beginning of our ROM file to hold the iNES header expected by NES emulators at the beginning of .nes files. Later in this tutorial we’ll see where that is populated. The file = %O parameter tells the linker that the header will end up in the file passed in via the -o option, and fill = yes says that if we don’t use the entire area, the remainder of the space should be filled with zero bytes.
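For a feel of what those sixteen bytes contain, here is a hedged C sketch that builds an iNES header for a mapper-0 (NROM) cartridge. The function name is hypothetical, and the flag bytes are simply left at zero; the header in our actual reset.s may set them differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INES_HEADER_SIZE 16

/* Fills a 16-byte iNES header for a mapper-0 (NROM) cartridge.
 * prg_banks counts 16 KiB PRG ROM banks, chr_banks counts 8 KiB CHR ROM banks.
 * (Hypothetical helper; the tutorial writes these bytes in assembly.) */
static void ines_write_nrom_header(uint8_t header[INES_HEADER_SIZE],
                                   uint8_t prg_banks, uint8_t chr_banks)
{
    assert(header != NULL);

    /* Start from all zeroes, much like fill = yes pads the unused bytes. */
    memset(header, 0, INES_HEADER_SIZE);

    /* Magic number: the ASCII letters "NES" followed by 0x1A. */
    header[0] = 'N';
    header[1] = 'E';
    header[2] = 'S';
    header[3] = 0x1A;

    header[4] = prg_banks;  /* PRG ROM size in 16 KiB units */
    header[5] = chr_banks;  /* CHR ROM size in 8 KiB units */

    /* Bytes 6 and 7 carry the mapper number and flag bits.
     * Leaving them at zero selects mapper 0 (assumption: no other flags needed). */
}
```

For the ROM in this tutorial, the call would be ines_write_nrom_header(header, 1, 1): one 16 KiB PRG bank and one 8 KiB CHR bank.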

PRG: start = $8000, size = $3ffa, file = %O, fill = yes;

The NES CPU expects the PRG ROM in the cartridge to map to the addresses 0x8000 through 0xffff. This is 32 KiB of address space, though cartridges may have as little as 16 KiB of PRG ROM or as much as 1 MiB or more! To be able to make use of more than 32 KiB of PRG ROM, cartridges have a mapper that can control which chunks of PRG ROM are mapped to which CPU addresses. For this extremely simple game, we’re going to use the simplest mapper: NROM (more specifically the NES-NROM-128). This mapper will map a single 16 KiB bank of PRG ROM to CPU addresses 0x8000-0xbfff, which will also be mirrored at 0xc000-0xffff.
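Mirroring is easier to see in code than in prose. The following C sketch (a hypothetical helper, meant to run on a development machine) translates a CPU address into an offset inside the single 16 KiB PRG bank of an NROM-128 cartridge.

```c
#include <assert.h>
#include <stdint.h>

#define PRG_BANK_SIZE 0x4000u  /* one 16 KiB bank */

/* Maps a CPU address in 0x8000-0xffff to an offset within the PRG bank.
 * Masking off the upper bits makes 0xc000-0xffff land on the same
 * bytes as 0x8000-0xbfff, which is exactly what mirroring means. */
static uint16_t nrom128_prg_offset(uint16_t cpu_addr)
{
    assert(cpu_addr >= 0x8000u);  /* lower addresses are not cartridge PRG ROM */
    return (uint16_t)(cpu_addr & (PRG_BANK_SIZE - 1u));
}
```

With this mapping, nrom128_prg_offset(0x8000) and nrom128_prg_offset(0xc000) both give offset 0, so a byte near the end of the bank is visible to the CPU near 0xbfff and again near 0xffff.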

VECTORS: start = $fffa, size = $6, file = %O, fill = yes;

You’ll notice that we don’t make the PRG area take up the full 16 KiB (0x4000 bytes); it stops six bytes short of that, at 0x3ffa. We’re explicitly setting aside six bytes of space for three 16-bit addresses that the CPU expects at a fixed location at the end of the cartridge space (i.e. 0xfffa-0xffff). These addresses, which we define in the VECTORS area, are the interrupt vectors: each one points to an interrupt handler, the code that is run when the corresponding interrupt is triggered. The CPU expects three different types of interrupts. Two of them we don’t care about just yet, but the third is the reset vector, which will point to our startup code.
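As an illustration of how those six bytes are read, here is a hedged C sketch that pulls a vector out of a 16 KiB PRG bank held in memory. The function and macro names are made up for this example; the addresses and the little-endian byte order are those of the 6502.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed CPU addresses of the three 6502 interrupt vectors. */
#define NMI_VECTOR   0xFFFAu
#define RESET_VECTOR 0xFFFCu
#define IRQ_VECTOR   0xFFFEu

/* Reads one 16-bit vector from a 16 KiB PRG bank (NROM-128 layout).
 * The 6502 stores addresses low byte first. */
static uint16_t read_vector(const uint8_t *prg_bank, uint16_t vector_addr)
{
    /* With a mirrored 16 KiB bank, 0xfffa-0xffff are its last six bytes. */
    uint16_t offset = (uint16_t)(vector_addr & 0x3FFFu);

    assert(prg_bank != 0);
    assert(vector_addr >= NMI_VECTOR && (vector_addr & 1u) == 0);

    return (uint16_t)(prg_bank[offset] | (prg_bank[offset + 1] << 8));
}
```

If the startup code were linked at 0x8000 (an assumption for this example), the bytes at offsets 0x3ffc and 0x3ffd of the bank would be 0x00 and 0x80, and read_vector(bank, RESET_VECTOR) would return 0x8000.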

CHR: start = $0000, size = $2000, file = %O, fill = yes;

Finally, we create an area for our CHR ROM, which will contain our pattern tables. The CHR ROM will be mapped to PPU memory addresses 0x0000-0x1fff, though the cartridge may contain much more than 8 KiB (0x2000 bytes) of CHR ROM (and/or CHR RAM) which is mapped to these addresses by the mapper. Our simple NROM mapper just expects 8 KiB of CHR ROM which will map directly to the 8 KiB of PPU address space for pattern tables.

Segments

Now that our memory areas are set up, we can define segments and assign them to those memory areas. If multiple segments belong in the same memory area, they are added in the order in which they appear here. Segments are written to the output file in the order in which they appear as well, so we’ll write our header first:

HEADER: load = HEADER, type = ro;

Here we’re creating a segment named HEADER, telling the linker that it belongs in the HEADER memory area, and marking it as read-only (ro) initialized data. After the iNES header, emulators expect to find one or more 16 KiB PRG ROM banks, so we’ll create a few segments that belong in the PRG memory area (as well as the VECTORS memory area, which is part of PRG ROM):

STARTUP: load = PRG,            type = ro,  define = yes;
CODE:    load = PRG,            type = ro,  define = yes;
RODATA:  load = PRG,            type = ro,  define = yes;
DATA:    load = PRG, run = RAM, type = rw,  define = yes;
VECTORS: load = VECTORS,        type = ro;

The segments CODE, RODATA, and DATA are used by the C compiler to store read-only code, read-only data, and read/write data, respectively. The STARTUP segment is used for storing the read-only code that we’ve written in reset.s to initialize the NES, and the VECTORS segment will store pointers to our interrupt handlers.

Because the DATA segment is read/write (rw), and our ROM is read-only, anything in DATA has to live in RAM while the game runs. We use the run = RAM attribute to tell the linker that, even though the initial contents of DATA are stored in the PRG memory area, references to anything in DATA should use addresses from the RAM area. The initial values therefore have to be copied from ROM into RAM before our C code runs.
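The split between load and run addresses can be sketched in C. With define = yes, ld65 exports symbols such as __DATA_LOAD__, __DATA_RUN__, and __DATA_SIZE__, and the cc65 runtime ships a copydata routine that uses them. Whether our reset.s relies on that routine is an assumption here; the hypothetical function below only models the idea with ordinary arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies the initial values of the DATA segment from where they are
 * stored (the load address, in read-only PRG ROM) to where the program
 * reads and writes them (the run address, in RAM).
 * (Hypothetical model of what a startup routine has to do.) */
static void copy_data_segment(uint8_t *run_addr,
                              const uint8_t *load_addr,
                              size_t size)
{
    assert(run_addr != NULL && load_addr != NULL);
    memcpy(run_addr, load_addr, size);
}
```

After the copy, the program can modify the RAM copy freely, while the bytes in ROM keep the initial values for the next reset.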

After our PRG ROM banks, emulators will expect to find CHR ROM banks:

CHARS: load = CHR, type = ro;

The CHARS segment is the last that will end up in our output file, though we still need to define a few more segments:

ZEROPAGE: load = ZP,  type = zp;
BSS:      load = RAM, type = bss, define = yes;

We saw the ZEROPAGE segment in use in the last post when using #pragma bss-name in our C code to define zero page variables. The BSS segment is used by the C compiler for uninitialized read/write data. Since both of these have uninitialized types (zp and bss), they are not written to the output file.
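Putting the file-backed areas together, we can predict where each one lands in hello_world.nes. This C sketch (the names are made up for illustration) adds up the sizes of the areas marked file = %O, in the order they were defined, relying on fill = yes to pad each area to its full size.

```c
#include <assert.h>
#include <stdint.h>

/* The memory areas that are written to the output file, in order. */
enum { AREA_HEADER, AREA_PRG, AREA_VECTORS, AREA_CHR, AREA_COUNT };

/* Sizes taken from the linker configuration. */
static const uint32_t file_area_sizes[AREA_COUNT] = {
    0x10,    /* HEADER  */
    0x3FFA,  /* PRG     */
    0x6,     /* VECTORS */
    0x2000,  /* CHR     */
};

/* Returns the file offset at which the given area starts.
 * Passing AREA_COUNT returns the total size of the file. */
static uint32_t file_offset_of(int area)
{
    uint32_t offset = 0;
    int i;

    assert(area >= 0 && area <= AREA_COUNT);
    for (i = 0; i < area; ++i) {
        offset += file_area_sizes[i];
    }
    return offset;
}
```

Under these assumptions the PRG bank starts at offset 0x10, the vectors at 0x400a, and the CHR data at 0x4010, for a total file size of 0x6010 (24592) bytes: a 16-byte header, 16 KiB of PRG ROM, and 8 KiB of CHR ROM.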

Symbols

All that’s left for us to do is tell the linker to export a couple of symbols we’ll need to set up the C stack:

__STACK_START__: type = weak, value = $0600;
__STACK_SIZE__:  type = weak, value = $200;

Startup Code

The last piece of the puzzle is our scary assembly file that contains the NES startup code. We won’t explore this in great depth, but the file is liberally commented for the curious.

After importing a few symbols such as our C main function (notice the _ prepended by the C compiler) which we’ll need to call into, we define a few aliases for memory-mapped registers and plop our iNES header into the HEADER segment.

The start label is where we’ll point our reset interrupt handler, and it contains a bunch of assembly code to get the NES into a known state (e.g. all interrupts disabled, rendering off, sound off, RAM zeroed, all sprites off-screen, PPU stabilized). You’ll see at the very end of this label we make use of our __STACK_START__ and __STACK_SIZE__ symbols to set up the C stack pointer before calling into our C main with jmp _main.
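The stack setup boils down to a single addition. cc65’s C stack grows downward, so startup code typically points the stack pointer just past the end of the stack area. The sketch below assumes reset.s follows that convention; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_START 0x0600u  /* value of __STACK_START__ in our config */
#define STACK_SIZE  0x0200u  /* value of __STACK_SIZE__ in our config */

/* Computes the initial C stack pointer for a downward-growing stack:
 * the first address past the end of the stack area. */
static uint16_t initial_c_stack_pointer(uint16_t start, uint16_t size)
{
    /* The stack must fit inside the 2 KiB of internal RAM. */
    assert((uint32_t)start + size <= 0x0800u);
    return (uint16_t)(start + size);
}
```

For our configuration, initial_c_stack_pointer(STACK_START, STACK_SIZE) gives 0x0800, the top of the internal RAM.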

The remainder of the STARTUP segment defines the other two interrupt handlers, which for now do nothing but rti (return from interrupt).

Finally, we set the addresses of our interrupt handlers in the VECTORS segment and include our pattern tables (via .incbin, which tells the assembler to load the “sprites.chr” file as-is and not to assemble it) into the CHARS segment.

That’s it! We’ve covered everything that goes into making a simple NES ROM from scratch.

Next Time

In the next post, Color in Motion, we’ll create a more interesting game loop and explore using palettes to add color to our creation.