Writing a 6502 Assembler
So I wrote a 6502 assembler that uses the same syntax you would find in most online tutorials like this one. I found that syntax simple and straightforward for what I needed, while the documentation for many other assemblers is fairly shoddy and each has its own unique symbols for things. Not being able to write something like define thing $9F in other assemblers is fairly frustrating. For example, if you copy the snake game directly from the above tutorial link, paste it into a file, and run it through my assembler, it will assemble to the exact same hex code as the site. Some additions are required (such as the program address offset), but we’ll cover those further down this page.
JMP
- Some background info and goals
- Trying it out
- Setting the program offset
- Special instruction DCB
- Special symbols (#< and #>)
- Instruction table
Some background info and goals
I wanted to make sure this assembler would produce code that works on any machine that runs 6502 machine code. So before writing this page I tested it by assembling a program and running it in the VICE Commodore 64 emulator. For this I just quickly grabbed some code from the internet and assembled it using my assembler. From there I dropped the generated .prg file onto VICE to run it and it worked out great. Below is an image of VICE running the program and the source code I used.
*=$0800
DCB $01 $08 $0b $08 $01 $00 $9e $32 $30 $36 $31 $00 $00 $00
LDX #0 ; X = 0
loop:
TXA ; copy X to A
STA $0400,X ; put A at $0400+X
STA $d800,X ; put A as a color at $d800+x. Color RAM only considers the lower 4 bits,
; so even though A will be > 15, this will wrap nicely around the 16 available colors
INX ; X=X+1
CPX #27 ; have we written enough chars?
BNE loop
RTS ; all done
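The DCB line at the top of that listing is worth unpacking. As far as I can tell it is the usual Commodore 64 header: a two-byte load address followed by a one-line BASIC program that does a SYS to the machine code. The Python sketch below decodes those 14 bytes. It is only an illustration: the function name and the field names are made up here and are not part of the assembler.

```python
def decode_basic_stub(stub: bytes) -> dict:
    """Split a PRG load address plus a one-line BASIC 'SYS' stub into fields."""
    load_address = stub[0] | (stub[1] << 8)   # PRG header, low byte first
    next_line = stub[2] | (stub[3] << 8)      # pointer to the next BASIC line
    line_number = stub[4] | (stub[5] << 8)    # BASIC line number
    token = stub[6]                           # $9E is the BASIC token for SYS
    end = stub.index(0, 7)                    # the line text ends at a zero byte
    sys_target = int(stub[7:end].decode("ascii"))
    return {
        "load_address": load_address,
        "next_line": next_line,
        "line_number": line_number,
        "token": token,
        "sys_target": sys_target,
    }


# The bytes from the DCB line in the listing above.
STUB = bytes([0x01, 0x08, 0x0B, 0x08, 0x01, 0x00, 0x9E,
              0x32, 0x30, 0x36, 0x31, 0x00, 0x00, 0x00])

fields = decode_basic_stub(STUB)

# The load address itself is not stored in memory, so the machine code
# starts right after the remaining 12 bytes of the stub.
first_code_address = fields["load_address"] + len(STUB) - 2

print(f"load=${fields['load_address']:04X} "
      f"SYS {fields['sys_target']} "
      f"code=${first_code_address:04X}")
```

The SYS target (2061, or $080D) lines up with the address of the first instruction after the stub, which is why the program starts when BASIC runs line 1.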
Trying it out
I know that this assembler is not really something anyone wants to try out or mess with, but if you have any interest in it, feel free to check it out. The C# project is open source on GitHub. I’ll probably add more to the assembler as I need it, but the main goal was to (1) learn all about 6502 assembly and (2) create an assembler that is bare bones and works with the syntax I know so far. Yes, yes, I know there are a ton of assemblers out there that I can just download and use (and I have), but that takes all the fun out of things; what can I say, I’m curious.
Setting the program offset
Setting the program offset is fairly standard across the assemblers I’ve looked at, so I followed the same syntax. Below is an example of how you can set the program starting address in hexadecimal. Note that decimal is not supported because I don’t see the necessity for it at the moment. Most computers like the Commodore 64 or Commander X16 will tell you the program starting address in hex anyway.
*=$6502
You can use this to change, on the fly, the address that labels are resolved against. This is useful if you use a feature of the system to set something up but your program code starts at a different location (see below). The short version is that when the assembler fills in the address for any branch or jump, it uses this address to know what value to replace labels with. Normally, when you type something like JMP my_label the assembler will replace it with JMP $0608 and use that as the absolute address. For this example, imagine the address was set to *=$0600. If we then changed it to *=$0800, the JMP my_label would be turned into JMP $0808.
*=$06FF ; Set relative address
; Do some instructions that use the above relative address
*=$0600 ; Set back to where program code is
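To make the label arithmetic concrete, here is a minimal Python sketch of the idea, not the assembler's actual C# code. It assumes a label's address is simply the origin plus the number of bytes emitted before it. The names resolve_labels and encode_jmp are hypothetical.

```python
def resolve_labels(lines, origin):
    """Map each label to origin + its byte offset in the output.

    `lines` is a list of (label_or_None, size_in_bytes) pairs.
    """
    labels = {}
    address = origin
    for label, size in lines:
        if label is not None:
            labels[label] = address
        address += size
    return labels


def encode_jmp(target):
    """Encode JMP absolute: opcode $4C, then the address low byte first."""
    return bytes([0x4C, target & 0xFF, (target >> 8) & 0xFF])


# Eight bytes of code come before my_label in this made-up program.
program = [(None, 8), ("my_label", 1)]

at_0600 = resolve_labels(program, 0x0600)["my_label"]
at_0800 = resolve_labels(program, 0x0800)["my_label"]

print(f"*=$0600 -> JMP ${at_0600:04X}")
print(f"*=$0800 -> JMP ${at_0800:04X}")
```

The same label lands at $0608 or $0808 depending only on the origin, which matches the JMP example above.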
Special instruction DCB
Something you don’t see in the linked tutorial is an instruction named DCB. This is because it is not an instruction on the CPU; it is an instruction for the assembler. It tells the assembler to put bytes directly within your program code. So if you had the following assembly code:
DEX
player_data:
DCB $99 $84 $F3 $1F
Then your program will contain the following bytes (see the DEX entry in the instruction table to see where the CA comes from):
CA 99 84 F3 1F
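A rough Python sketch of what DCB does with hex operands might look like the following. It assumes operands are space-separated and each starts with $, as in the example above. The function name assemble_dcb is made up for illustration.

```python
def assemble_dcb(operands: str) -> bytes:
    """Turn space-separated '$XX' operands into raw bytes."""
    return bytes(int(token.lstrip("$"), 16) for token in operands.split())


DEX_OPCODE = 0xCA  # from the DEX row of the instruction table

output = bytes([DEX_OPCODE]) + assemble_dcb("$99 $84 $F3 $1F")
listing = " ".join(f"{byte:02X}" for byte in output)
print(listing)
```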
But wait, there’s more! The DCB command also handles strings. I currently have it mapped to the Commodore 64 characters, but that is easy to change to any set (I will add a configuration file later). To do this, wrap your string in back-quote/back-tick ` characters. See the below example:
DCB `This is some text!`
During the line analysis phase at the beginning, the assembler checks each line for a string and, if one is found, converts each character to its corresponding byte value. It does this by actually altering the text contents of the line, so everything from there on processes as it normally does.
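The line-rewriting step can be sketched in a few lines of Python. This is an assumption-laden illustration: it uses plain ASCII codes as a stand-in for the Commodore 64 character mapping, and expand_string is a hypothetical name, not a function from the project.

```python
import re


def expand_string(line: str) -> str:
    """Replace a back-tick string in a source line with '$XX' byte operands."""
    def to_hex(match):
        # ASCII stand-in; the real assembler maps to Commodore 64 characters.
        return " ".join(f"${ord(ch):02X}" for ch in match.group(1))

    return re.sub(r"`([^`]*)`", to_hex, line)


rewritten = expand_string("DCB `Hi!`")
untouched = expand_string("DCB $99 $84")
print(rewritten)
```

Because the result is an ordinary DCB line with hex operands, the later stages need no special handling for strings.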
Special symbols (#< and #>)
While developing in 6502 assembly you are going to want to get the high byte and low byte of a label’s address: #> gives the high byte and #< gives the low byte. This will help you store jump addresses within the zero page of memory so you can essentially pass a label as an argument to a routine. Though this is primarily useful for labels, you could also just use the standard address syntax here as well. Below is an example of how it is used and what it will produce.
LDA #>try_something
STA $00
LDA #<try_something
STA $01
NOP
try_something: ; For learning, assume the address for this label is $18F3
TAX
;...
The following is what the assembler will turn your code into:
LDA #$18
STA $00
LDA #$F3
STA $01
NOP
try_something: ; For learning, assume the address for this label is $18F3
TAX
;...
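The substitution itself is just a shift and a mask. Here is a small Python sketch of it, assuming the label table is already known. The helper names are hypothetical and this is not the assembler's own code.

```python
def high_byte(address: int) -> int:
    """Upper 8 bits of a 16-bit address."""
    return (address >> 8) & 0xFF


def low_byte(address: int) -> int:
    """Lower 8 bits of a 16-bit address."""
    return address & 0xFF


def expand_byte_operator(operand: str, labels: dict) -> str:
    """Rewrite '#>label' or '#<label' into an immediate hex literal."""
    if operand.startswith("#>"):
        return f"#${high_byte(labels[operand[2:]]):02X}"
    if operand.startswith("#<"):
        return f"#${low_byte(labels[operand[2:]]):02X}"
    return operand  # anything else is passed through unchanged


labels = {"try_something": 0x18F3}
hi_operand = expand_byte_operator("#>try_something", labels)
lo_operand = expand_byte_operator("#<try_something", labels)
print(hi_operand, lo_operand)
```

One thing to watch if the two bytes are meant for an indirect jump such as JMP ($0000): the 6502 reads the low byte from the lower address, so in that case the low byte would go in $00 and the high byte in $01.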
Instruction table
Below is a table of all the instructions for the assembler. Note that it includes a DCB entry; this is because the table is completely auto-generated from the assembler source code using reflection. Just below is another table that describes the columns of the big table.
Column | Description |
---|---|
Mnemonic | The instruction name |
Argument | The type of argument provided to the instruction |
OpCode | The hex code that is written to identify this instruction |
Flags | The flags that are affected by the invocation of this instruction |
Clock | The number of clock cycles this instruction takes to execute |
SkipClock | The clock cycles used when a branch is skipped (really only used on branch instructions) |
BoundsClock | The additional clock cycles required if this instruction passes a page boundary during execution |
ADC / AND / ASL / BCC / BCS / BEQ / BIT / BMI / BNE / BPL / BRK / BVC / BVS / CLC / CLD / CLI / CLV / CMP / CPX / CPY / DCB / DEC / DEX / DEY / EOR / INC / INX / INY / JMP / JSR / LDA / LDX / LDY / LSR / NOP / ORA / PHA / PHP / PLA / PLP / ROL / ROR / RTI / RTS / SBC / SEC / SED / SEI / STA / STX / STY / TAX / TAY / TSX / TXA / TXS / TYA
Mnemonic | Argument | OpCode | Flags | Clock | SkipClock | BoundsClock |
---|---|---|---|---|---|---|
ADC | #09 or #$F9 | 0x69 | N O Z C | 2 | 0 | 0 |
ADC | $F9 | 0x65 | N O Z C | 3 | 0 | 0 |
ADC | $F9,X | 0x75 | N O Z C | 4 | 0 | 0 |
ADC | $0200 | 0x6D | N O Z C | 4 | 0 | 0 |
ADC | $0200,X | 0x7D | N O Z C | 4 | 0 | 1 |
ADC | $0200,Y | 0x79 | N O Z C | 4 | 0 | 1 |
ADC | ($09),X | 0x61 | N O Z C | 6 | 0 | 0 |
ADC | ($09),Y | 0x71 | N O Z C | 5 | 0 | 1 |
AND | #09 or #$F9 | 0x29 | N Z | 2 | 0 | 0 |
AND | $F9 | 0x25 | N Z | 3 | 0 | 0 |
AND | $F9,X | 0x35 | N Z | 4 | 0 | 0 |
AND | $0200 | 0x2D | N Z | 4 | 0 | 0 |
AND | $0200,X | 0x3D | N Z | 4 | 0 | 1 |
AND | $0200,Y | 0x39 | N Z | 4 | 0 | 1 |
AND | ($09),X | 0x21 | N Z | 6 | 0 | 0 |
AND | ($09),Y | 0x31 | N Z | 5 | 0 | 1 |
ASL | | 0x0A | N Z C | 2 | 0 | 0 |
ASL | A | 0x0A | N Z C | 2 | 0 | 0 |
ASL | $F9 | 0x06 | N Z C | 5 | 0 | 0 |
ASL | $F9,X | 0x16 | N Z C | 6 | 0 | 0 |
ASL | $0200 | 0x0E | N Z C | 6 | 0 | 0 |
ASL | $0200,X | 0x1E | N Z C | 7 | 0 | 0 |
BCC | $0200 | 0x90 | | 1 | 1 | 1 |
BCS | $0200 | 0xB0 | | 1 | 1 | 1 |
BEQ | $0200 | 0xF0 | | 1 | 1 | 1 |
BIT | $F9 | 0x24 | N O Z | 3 | 0 | 0 |
BIT | $0200 | 0x2C | N O Z | 3 | 0 | 0 |
BMI | $0200 | 0x30 | | 1 | 1 | 1 |
BNE | $0200 | 0xD0 | | 1 | 1 | 1 |
BPL | $0200 | 0x10 | | 1 | 1 | 1 |
BRK | | 0x00 | | 7 | 0 | 0 |
BVC | $0200 | 0x50 | | 1 | 1 | 1 |
BVS | $0200 | 0x70 | | 1 | 1 | 1 |
CLC | | 0x18 | C | 2 | 0 | 0 |
CLD | | 0xD8 | D | 2 | 0 | 0 |
CLI | | 0x58 | I | 2 | 0 | 0 |
CLV | | 0xB8 | O | 2 | 0 | 0 |
CMP | #09 or #$F9 | 0xC9 | | 2 | 0 | 0 |
CMP | $F9 | 0xC5 | | 3 | 0 | 0 |
CMP | $F9,X | 0xD5 | | 4 | 0 | 0 |
CMP | $0200 | 0xCD | | 4 | 0 | 0 |
CMP | $0200,X | 0xDD | | 4 | 0 | 1 |
CMP | $0200,Y | 0xD9 | | 4 | 0 | 1 |
CMP | ($09),X | 0xC1 | | 6 | 0 | 0 |
CMP | ($09),Y | 0xD1 | | 5 | 0 | 1 |
CPX | #09 or #$F9 | 0xE0 | N Z C | 2 | 0 | 0 |
CPX | $F9 | 0xE4 | N Z C | 3 | 0 | 0 |
CPX | $0200 | 0xEC | N Z C | 4 | 0 | 0 |
CPY | #09 or #$F9 | 0xC0 | N Z C | 2 | 0 | 0 |
CPY | $F9 | 0xC4 | N Z C | 3 | 0 | 0 |
CPY | $0200 | 0xCC | N Z C | 4 | 0 | 0 |
DCB | $F9 | 0xFF | | 0 | 0 | 0 |
DEC | $F9 | 0xC6 | N Z | 5 | 0 | 0 |
DEC | $F9,X | 0xD6 | N Z | 6 | 0 | 0 |
DEC | $0200 | 0xCE | N Z | 6 | 0 | 0 |
DEC | $0200,X | 0xDE | N Z | 7 | 0 | 0 |
DEX | | 0xCA | | 2 | 0 | 0 |
DEY | | 0x88 | | 2 | 0 | 0 |
EOR | #09 or #$F9 | 0x49 | N Z | 2 | 0 | 0 |
EOR | $F9 | 0x45 | N Z | 3 | 0 | 0 |
EOR | $F9,X | 0x55 | N Z | 4 | 0 | 0 |
EOR | $0200 | 0x4D | N Z | 4 | 0 | 0 |
EOR | $0200,X | 0x5D | N Z | 4 | 0 | 1 |
EOR | $0200,Y | 0x59 | N Z | 4 | 0 | 1 |
EOR | ($09),X | 0x41 | N Z | 6 | 0 | 0 |
EOR | ($09),Y | 0x51 | N Z | 5 | 0 | 1 |
INC | $F9 | 0xE6 | N Z | 5 | 0 | 0 |
INC | $F9,X | 0xF6 | N Z | 6 | 0 | 0 |
INC | $0200 | 0xEE | N Z | 6 | 0 | 0 |
INC | $0200,X | 0xFE | N Z | 7 | 0 | 0 |
INX | | 0xE8 | | 2 | 0 | 0 |
INY | | 0xC8 | | 2 | 0 | 0 |
JMP | $0200 | 0x4C | | 3 | 0 | 0 |
JMP | ($0200) | 0x6C | | 5 | 0 | 0 |
JSR | $0200 | 0x20 | | 6 | 0 | 0 |
LDA | #09 or #$F9 | 0xA9 | N Z | 2 | 0 | 0 |
LDA | $F9 | 0xA5 | N Z | 3 | 0 | 0 |
LDA | $F9,X | 0xB5 | N Z | 4 | 0 | 0 |
LDA | $0200 | 0xAD | N Z | 4 | 0 | 0 |
LDA | $0200,X | 0xBD | N Z | 4 | 0 | 1 |
LDA | $0200,Y | 0xB9 | N Z | 4 | 0 | 1 |
LDA | ($09),X | 0xA1 | N Z | 6 | 0 | 0 |
LDA | ($09),Y | 0xB1 | N Z | 5 | 0 | 1 |
LDX | #09 or #$F9 | 0xA2 | N Z | 2 | 0 | 0 |
LDX | $F9 | 0xA6 | N Z | 3 | 0 | 0 |
LDX | $F9,Y | 0xB6 | N Z | 4 | 0 | 0 |
LDX | $0200 | 0xAE | N Z | 4 | 0 | 0 |
LDX | $0200,Y | 0xBE | N Z | 4 | 0 | 1 |
LDY | #09 or #$F9 | 0xA0 | N Z | 2 | 0 | 0 |
LDY | $F9 | 0xA4 | N Z | 3 | 0 | 0 |
LDY | $F9,X | 0xB4 | N Z | 4 | 0 | 0 |
LDY | $0200 | 0xAC | N Z | 4 | 0 | 0 |
LDY | $0200,X | 0xBC | N Z | 4 | 0 | 1 |
LSR | | 0x4A | N Z C | 2 | 0 | 0 |
LSR | A | 0x4A | N Z C | 2 | 0 | 0 |
LSR | $F9 | 0x46 | N Z C | 5 | 0 | 0 |
LSR | $F9,X | 0x56 | N Z C | 6 | 0 | 0 |
LSR | $0200 | 0x4E | N Z C | 6 | 0 | 0 |
LSR | $0200,X | 0x5E | N Z C | 7 | 0 | 0 |
NOP | | 0xEA | | 2 | 0 | 0 |
ORA | #09 or #$F9 | 0x09 | N O Z | 2 | 0 | 0 |
ORA | $F9 | 0x05 | N O Z | 3 | 0 | 0 |
ORA | $F9,X | 0x15 | N O Z | 4 | 0 | 0 |
ORA | $0200 | 0x0D | N O Z | 4 | 0 | 0 |
ORA | $0200,X | 0x1D | N O Z | 4 | 0 | 1 |
ORA | $0200,Y | 0x19 | N O Z | 4 | 0 | 1 |
ORA | ($09),X | 0x01 | N O Z | 6 | 0 | 0 |
ORA | ($09),Y | 0x11 | N O Z | 5 | 0 | 1 |
PHA | | 0x48 | | 3 | 0 | 0 |
PHP | | 0x08 | | 3 | 0 | 0 |
PLA | | 0x68 | | 4 | 0 | 0 |
PLP | | 0x28 | | 4 | 0 | 0 |
ROL | | 0x2A | N Z C | 2 | 0 | 0 |
ROL | A | 0x2A | N Z C | 2 | 0 | 0 |
ROL | $F9 | 0x26 | N Z C | 5 | 0 | 0 |
ROL | $F9,X | 0x36 | N Z C | 6 | 0 | 0 |
ROL | $0200 | 0x2E | N Z C | 6 | 0 | 0 |
ROL | $0200,X | 0x3E | N Z C | 7 | 0 | 0 |
ROR | | 0x6A | N Z C | 2 | 0 | 0 |
ROR | A | 0x6A | N Z C | 2 | 0 | 0 |
ROR | $F9 | 0x66 | N Z C | 5 | 0 | 0 |
ROR | $F9,X | 0x76 | N Z C | 6 | 0 | 0 |
ROR | $0200 | 0x6E | N Z C | 6 | 0 | 0 |
ROR | $0200,X | 0x7E | N Z C | 7 | 0 | 0 |
RTI | | 0x40 | N O - B D I Z C | 6 | 0 | 0 |
RTS | | 0x60 | N O - B D I Z C | 6 | 0 | 0 |
SBC | #09 or #$F9 | 0xE9 | N O Z C | 2 | 0 | 0 |
SBC | $F9 | 0xE5 | N O Z C | 3 | 0 | 0 |
SBC | $F9,X | 0xF5 | N O Z C | 4 | 0 | 0 |
SBC | $0200 | 0xED | N O Z C | 4 | 0 | 0 |
SBC | $0200,X | 0xFD | N O Z C | 4 | 0 | 1 |
SBC | $0200,Y | 0xF9 | N O Z C | 4 | 0 | 1 |
SBC | ($09),X | 0xE1 | N O Z C | 6 | 0 | 0 |
SBC | ($09),Y | 0xF1 | N O Z C | 5 | 0 | 1 |
SEC | | 0x38 | C | 2 | 0 | 0 |
SED | | 0xF8 | D | 2 | 0 | 0 |
SEI | | 0x78 | I | 2 | 0 | 0 |
STA | $F9 | 0x85 | | 3 | 0 | 0 |
STA | $F9,X | 0x95 | | 4 | 0 | 0 |
STA | $0200 | 0x8D | | 4 | 0 | 0 |
STA | $0200,X | 0x9D | | 5 | 0 | 0 |
STA | $0200,Y | 0x99 | | 5 | 0 | 0 |
STA | ($09),X | 0x81 | | 6 | 0 | 0 |
STA | ($09),Y | 0x91 | | 6 | 0 | 0 |
STX | $F9 | 0x86 | | 3 | 0 | 0 |
STX | $F9,Y | 0x96 | | 4 | 0 | 0 |
STX | $0200 | 0x8E | | 4 | 0 | 0 |
STY | $F9 | 0x84 | | 3 | 0 | 0 |
STY | $F9,X | 0x94 | | 4 | 0 | 0 |
STY | $0200 | 0x8C | | 4 | 0 | 0 |
TAX | | 0xAA | | 2 | 0 | 0 |
TAY | | 0xA8 | | 2 | 0 | 0 |
TSX | | 0xBA | | 2 | 0 | 0 |
TXA | | 0x8A | | 2 | 0 | 0 |
TXS | | 0x9A | | 2 | 0 | 0 |
TYA | | 0x98 | | 2 | 0 | 0 |