Wenton's Commodore Assembly Language Page

by
Wenton L. Davis

One of the first things anyone ever asks me about assembly language is, "Why? Why do it? Why go to all that extra work when I can just use (fill in the blank)?" And generally, my answer is that you should use whatever language you want, provided that it does what you want. In many cases, yes, (fill in the blank) language will work just fine. However, there are some things that assembly will do that others can't...

Speed of execution - NO language will execute as fast as a well-written assembly program. Assembly is the native language of the computer, and although 'C' will generally produce efficient code, it will still fall short of the speed assembly can offer. Many modern machines are fast enough that even interpreted languages (PERL, Python, etc.) seem to run sufficiently fast. Still, if you run a program in an interpreted or compiled language and run the same program in assembly, again assuming the assembly is well written, the assembly language program will be done long before the compiled or interpreted one. (Keep in mind that when a program makes heavy use of a library such as libc.a, the compiled version and the assembly version will appear to run at about the same speed, because both spend most of their time in the same library code.)
Register access - With the exception of the register modifier for variables available in C/C++, high-level languages offer no way to access the registers. Generally, this is not an issue, but some processors have registers whose existence the compiler is unaware of. These are typically special-purpose registers, and compilers tend to use only general-purpose registers. As a result, the compiler has no idea how to access these other registers. This is one of the ways a good assembly language program can completely out-class a compiled program.

That being said....

Portability - Programs written in assembly language are terribly unportable. They may work great on one computer, but will not run at all on a computer using a different processor. In these cases, porting a program from one computer to another effectively means rewriting the entire program. (A program written for a 32-bit Windoze machine may or may not run on a 64-bit Windoze machine, largely depending on the libraries used. Similar issues exist in Linux.) Even sharing a processor is not enough: a program written for the Commodore 64 could execute on an Atari 2600, because the processors are (more or less) the same, but it will not appear to run because the I/O and display memory are completely different.
Maintainability - Almost any program that is written in any language will eventually be modified at some future point, based on business needs or application changes. It is nearly impossible to anticipate the future needs of users when a program is first being designed, so programs need to be written anticipating only that changes will eventually be needed. In assembly language, this is a particularly difficult component of design; modifications typically have much larger impacts on assembly language than they have on other languages. A change that really is "minor" in a higher-level language can easily become far more complicated in assembly. (I personally would also argue that there are so few good assembly programmers that this becomes a significant part of the issue.)

This good versus bad argument can go on for days... but I'm more interested in getting on with how to do assembly, so here we go...

Step one: The Build Environment

Before even looking at assembly (or any programming), setting up a cohesive development environment is necessary. This topic could be its own page, very easily. I will simplify this by stating that there are a few different approaches.

Native: Normally, developing software on the same system it is being designed to run on is preferred. The issues with "cross-compilers" (see below) can be daunting, and transferring from one system to another can be complex. As a result, developing natively tends to have many benefits. For development on the PETs, VIC-20, and Commodore 64/128, the most common method is to use "monitor" programs. These monitors allow assembly, disassembly, direct control of memory data, saving to and loading from tape/disk, and that is about it. They provide a method to write programs, but only at a very primitive level. The idea of "jump to a different location in memory" requires the developer to know in advance what that memory location is. That is not always readily available, so it becomes necessary to go back and "fix" the memory location later. Use of monitors is covered to some degree in the "Commodore 64 Programmer's Reference Guide." Several monitors were available in the shareware market. Additionally, there were cartridges from Human Engineered Software (HES) for the VIC-20 and the Commodore 64/128 called "HESmon." These were a bit more advanced than the freeware/shareware monitors that had to be loaded from tape or disk.

A better option, if you can find them, is the "Super-PASCAL" or "Super C" software from Abacus Software. Both of these packages included a macro assembler, in which the developer could use labels, directives, and all kinds of functionality that worked almost as effectively as GNU's assemblers. These assemblers are much easier to use, and make code development far more efficient, especially for larger projects.

Cross-development: A more comfortable method of development for many people is the use of a larger computer such as a Windoze-based or Linux-based computer. File management and things like that tend to be much smoother, and even version control and other more advanced techniques can be implemented far less painfully. I'm sure there are more options than I will describe, but these three are the ones I like best:

cc65 - Available as source code, fairly simple to "git" from sourceforge.org. This tool most closely resembles the GNU toolchains for other microprocessors. Daily (?) snapshots of documentation and binaries for Windoze are available.

xa - This one is a little harder to find, and if you are going to use the ViCE emulator, it is actually required to build the emulator from source. Try floodgap for the source code and manpages for the manual page. This is less powerful than the cc65 set, but seems to be a little smoother and more comfortable.

WDC tools - In many ways, this is the smart choice. WDC owns the rights to the 65xx and 65xxx processors. The WDC tools are released by the manufacturer, and probably represent the most well-supported and up-to-date development environment for the entire set of processors. Their only drawback is that this is a Windoze-only solution. I do like the simulators that come with the tools, and the simulators also integrate directly with WDC's development boards for the 65256 and 65816 (and presumably others), but I still prefer to do my work in Linux.

At least for the time being, I am using the xa option because it required the least amount of work to set up and integrate.

Step one: Transport from development to target.

The really nice thing about developing code in a native environment is that you don't have to figure out how to get the final product from your development platform to the running platform. One option is to transfer it over a serial port (RS232), and although both Windoze and Linux have many tools available that are capable of accomplishing this, the Commodores are not so well-equipped for this. That's not to say it is impossible.... just that it is not a readily-available option for most people. I might look into this as an option in the future, but for now, I think I'll leave this one crossed off of my list.

Another option would be to burn the program into a UV EPROM such as the 27256 or an EEPROM such as the 28256. When the computer is reset, the Commodores look at the beginning of the ROM for "CBM" and a start address. This is probably the most efficient method, provided that a chip programmer is available. I do have a really nice programmer from EEtools that I have had a lot of success with, but this is still a relatively pricey solution. The biggest drawback of this method is the fact that when you move a chip back and forth too many times, you tend to end up with broken or bent pins, and contacts tend to become somewhat flaky.

The third option, and the easiest one to recommend, is to use the floppy disk. The opencbm project allows a person to develop code on a Windoze or Linux computer, and write it to a disk image (usually a .D64 file), or even directly to a disk in a Commodore 1541 (or 1571 or 8040) disk drive. This does require a little extra hardware, fairly readily available online for nominal fees, and of course, the disk drive itself. This method is probably the best balance between efficiency and cost-effectiveness. It also allows a nice, clean way of writing to a disk image that can be read by the ViCE emulator; development can be done without even owning a physical machine.

For the sake of simplicity, from this point forward, I will be using the xa assembler and I'll employ the .D64 files for my development...

Step Two: Simple first test

Probably the easiest jumping-off point for writing assembly code on the Commodore 64 is to fit it into that little 4K window from $C000 to $CFFF. This is pretty easy to set up, except there is one little "trick" to remember. When the Commodores are loading a file, the first two bytes read from the file are the starting address to load into. BASIC programs start at $0801. (No, I have no idea WHY they skip over $0800.) So whether the program is in BASIC or in assembly, the first two bytes tell the computer the address at which to start loading. Therefore, if we want to write code that will start at $C000, we have to trick the assembler into thinking that we actually want two bytes to appear before the actual code. This is accomplished by setting the origin (the .ORG statement of other assemblers, written * = in xa) two bytes early. Since we want our code to begin at $C000, we tell the assembler we want to begin at $BFFE, and the first thing we give it is the load address of $C000:

;This is our example program
  * = $BFFE       ;the "star" (*) means "current location"
  .WORD $C000
  
;Now for the code:

Now, just for simplicity, let's create a function that will clear the screen and return:

sample_cls:
    LDA #147    ;character to clear the screen
    JSR $FFD2   ;Call the print character subroutine
    RTS         ;Return back to BASIC

And there is our first program. By convention, I would save this file as sample.a65 or something similar... just ending with .a65. (NOTE that in the GNU world, the common file extension for assembly language programming is .s, but we are not really in the GNU world.) We can now assemble the program:

dev6502@development> xa -o sample.out -l sample.list sample.a65

and, if all went well, we should see no messages. We can look at the listing file to see if our program is located in the correct position. Our check, here, is to see if the label sample_cls is located at $C000:

sample_cls, 0xc000, 0, 0x0000

And we can look at the hexadecimal output generated using hexdump -C sample.out:

00000000  00 c0 a9 93 20 d2 ff 60                           |.... ..`|
00000008

Notice that the first two bytes (00 C0) are the low-byte followed by the high byte of C000, the address that the program will load into from the disk.
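
To make the two-byte header a little more concrete, here is a small Python sketch (not part of the xa toolchain; the helper names make_prg and split_prg are just my own for illustration) that builds the same eight bytes by hand and then splits them apart again. It only assumes what the hexdump above shows: a little-endian load address followed by the machine code.

```python
import struct

def make_prg(load_address, code):
    """Prefix machine code with the 2-byte little-endian load address."""
    return struct.pack("<H", load_address) + code

def split_prg(prg):
    """Return (load_address, payload) from the raw bytes of a program file."""
    (load_address,) = struct.unpack_from("<H", prg, 0)
    return load_address, prg[2:]

# LDA #147 / JSR $FFD2 / RTS, the same bytes that appear in the hexdump
code = bytes([0xA9, 0x93, 0x20, 0xD2, 0xFF, 0x60])
prg = make_prg(0xC000, code)

print(prg.hex(" "))                 # 00 c0 a9 93 20 d2 ff 60
address, payload = split_prg(prg)
print(hex(address), len(payload))   # 0xc000 6
```

Running split_prg on a file you assembled yourself is a quick way to confirm the load address before the file ever gets near a disk.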

Step Three: Loading onto the Disk

The next major accomplishment will be to load the file sample.out onto the disk. Guess what - there are a few ways of accomplishing this.

The first is the most straightforward. It makes the assumption that we have opencbm installed and working. Here, we can just write the program directly to the disk:

Now we just move the disk out of the drive connected to the PC, and into a drive connected to the Commodore 64 or 128. Then we can load the program into the computer and execute it using a SYS command:

LOAD "SAMPLE.OUT",8,1
SYS 49152

The program should clear the screen and display the familiar READY prompt.

The second method is a bit more complicated, but only a bit. It also assumes you have installed ViCE, because the c1541 tool is part of that software:

  c1541 -format disklabel,wd d64 mydisk.d64

If the file mydisk.d64 does not exist, it will be created. Either of the two commands will work; the only difference is that the second command will also generate the label on the disk, "assembly,64." You can also give it a new label later if you wish.

Next, you can write the file to the disk image:

  c1541 mydisk.d64 -write sample.out

Now for fun, we can list what is on the disk:

  c1541 mydisk.d64 -list

should look like:

0 "disklabel       " wd 2a
1     "sample.out"        prg 
663 blocks free.
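
If you are wondering where 663 comes from, the arithmetic can be sketched in a few lines of Python. This is only an illustration and assumes the standard 35-track 1541 layout (21, 19, 18, or 17 sectors per track depending on the zone, with track 18 reserved for the directory) and 254 data bytes per 256-byte sector; the names below are my own, not anything from c1541.

```python
import math

# (tracks, sectors per track) for a standard 35-track 1541 disk
ZONES = [(range(1, 18), 21), (range(18, 25), 19),
         (range(25, 31), 18), (range(31, 36), 17)]
SECTOR_SIZE = 256
DIRECTORY_TRACK = 18   # holds the BAM and directory, not counted as free

def sectors_on(track):
    """Number of sectors on the given track."""
    for tracks, sectors in ZONES:
        if track in tracks:
            return sectors
    raise ValueError("track out of range")

def blocks_needed(file_size):
    """Each sector carries a 2-byte link to the next one, leaving 254 data bytes."""
    return max(1, math.ceil(file_size / 254))

total = sum(sectors_on(t) for t in range(1, 36))
free_when_empty = total - sectors_on(DIRECTORY_TRACK)

print(total * SECTOR_SIZE)                  # size of the .d64 file in bytes
print(free_when_empty)                      # blocks free on a freshly formatted disk
print(free_when_empty - blocks_needed(8))   # after writing the 8-byte sample.out
```

The last line works out to 663, which matches the listing: the tiny sample.out still costs one whole block.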

Now that the .D64 disk image is set, it can be written to a regular floppy disk. This is assuming that you have openCBM installed. If so, the command

  d64copy mydisk.d64 8  (or whichever drive number you need)

will write the disk image to the floppy disk. From here, you move the disk to a drive connected to your Commodore 64/128. Alternatively, if you have ViCE installed, you can launch the '64 or '128 emulator, and attach the disk image in the emulator.

Now that you have the program on disk connected to a Commodore 64/128 (or emulator), you can load the program and run it:

LOAD "SAMPLE.OUT",8,1
SYS 49152

It really doesn't do much; it just clears the screen, but at least you have the tools to write more interesting programs.

Now, if you want to get really fancy, you can also build an assembly program in a way that it works more nicely with the BASIC environment. As an example, I took the program from the Commodore 64 User's Guide, page 65, and rewrote it in assembly language instead of BASIC. I also made it in a way that allows the program to be loaded and run in a method that is more familiar for most people who have programmed in BASIC. In order to understand this method, it is necessary to understand how a BASIC program is stored in memory in the Commodore 64/128 (and the VIC-20).

Each line of a BASIC program begins with a hidden two-byte field that stores the address of the beginning of the next line of BASIC. After that, there is a two-byte integer value that is the line number of the current BASIC line. After that, the BASIC commands are stored, each one encoded ("tokenized") as a single byte. Every token is $80 or larger, so the upper bit of the byte effectively states, "I am a command." The SYS command is encoded as the byte $9E. Finally, the end of a BASIC program is marked by one more entry that is never displayed: its first field (pointing to the address of the next line) stores the number 0 to indicate, "this is the end of the program."
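
Here is a rough Python sketch of that layout, just to show how the link fields chain together. It is not taken from any Commodore tool: the function names are hypothetical, the two-line program is made up, and I am assuming each stored line ends with a zero byte (which is what the ,0 after each string in the listing further down provides). $9E is the SYS token and $8F is the REM token.

```python
import struct

SYS_TOKEN = 0x9E
REM_TOKEN = 0x8F

def encode_line(address, number, body):
    """Encode one stored BASIC line that will sit at `address` in memory."""
    # 2-byte link + 2-byte line number + body + terminating zero byte
    next_address = address + 4 + len(body) + 1
    return struct.pack("<HH", next_address, number) + body + b"\x00"

def walk_lines(memory, start):
    """Follow the link fields from `start` and collect the line numbers."""
    numbers = []
    address = start
    while True:
        (link,) = struct.unpack_from("<H", memory, address - start)
        if link == 0:              # a zero link marks the end of the program
            break
        (number,) = struct.unpack_from("<H", memory, address - start + 2)
        numbers.append(number)
        address = link
    return numbers

start = 0x0801
line10 = encode_line(start, 10, bytes([SYS_TOKEN]) + b" 49152")
line20 = encode_line(start + len(line10), 20, bytes([REM_TOKEN]) + b" HELLO")
memory = line10 + line20 + b"\x00\x00"

print(walk_lines(memory, start))   # [10, 20]
```

This is essentially what LIST does when it steps through a program: read the link, read the line number, print the line, and jump to wherever the link points.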

Although the BASIC program "space" starts at $0800, the actual program data begins at $0801. I have no idea why, but that's just the way it is. Recall that earlier, when we wanted our program to start at $C000, we told the assembler to start at $BFFE so we could add two bytes for the load address. We will do the same thing here, but since we want the data to be loaded starting at $0801, we will tell the assembler to begin two bytes earlier, at $07FF.

The complete program looks like this:

;CUG_pg65.a65 - Assembly version of page 65 of Commodore User's Guide
;-------------------------------------------------------------------|

* = $07FF

        .word $0801                     ;load address
        .word Bline20                    ;address of next BASIC line
        .word 10                        ;BASIC line 10
        .byte $9E                       ;SYS BASIC token
        .byte " 2088", 0                ;2088 = $0828
Bline20: .word Bline30                   ;address of next BASIC line
        .word 20                        ;BASIC line 20
        .byte "WENTON L. DAVIS, 2021",0 ;copyright
Bline30: .word 0                         ;end of program

;===========================================================================
; Definitions not using memory space                                       |
;===========================================================================
address = $AC             ;zero-page address used to store the screen vector

;===========================================================================
; Main program                                                             |
;===========================================================================

main:   LDA #147      ;CLS
        JSR $FFD2     ;print it

line20:
        ldx #7        ;color
        stx 53280     ;Border color
        ldx #13
        stx 53281     ;Background color
line30:
        lda #1        ;initialize X and Y
        sta X
        sta Y

line40:
        sta DX
        sta DY

line50:
        jsr calc_offset
        CLC               ;Clear Carry
        LDA offset+1      ;High byte of offset
        ADC #$04          ;Add high byte of $0400
        STA address+1     ;Save high byte of address
        LDA offset        ;Low byte of offset
        STA address       ;Low byte of address
        LDA #81           ;Screen code for a ball
        LDX #0            ;Clear for indirect,x
        STA (address,x)   ;Write to screen

line60:
        LDA	#10           ;loop 10 times
loop60top:
        JSR	delay_FF      ;waste time
        SEC
        SBC	#1
        BPL	loop60top


line70:
        LDA	#32           ;screen code for space
        STA	(address,x)   ;Write to screen

line80:
        CLC
        LDA	X             ;Get X
        ADC	DX            ;X=X+DX
        STA	X

line90:
        LDA X             ;Get X
        CMP #39           ;X>=39?
        BPL line90chs     ;Yes, branch
        CMP #1            ;X<=0? (X-1 is negative)
        BMI line90chs     ;Yes, branch
        JMP line100       ;go to next line
line90chs:
        CLC
        LDA DX
        EOR #$FF          ;one's complement
        ADC #1            ;two's complement
        STA DX            ;DX=-DX

line100:
        CLC
        LDA	Y             ;Get Y
        ADC	DY            ;Y=Y+DY
        STA	Y

line110:
        LDA Y             ;Get Y
        CMP #24           ;Y>=24?
        BPL line110chs    ;Yes, branch
        CMP #1            ;Y<=0? (Y-1 is negative)
        BMI line110chs    ;Yes, branch
        JMP line120       ;go to next line

line110chs:
        CLC
        LDA DY
        EOR #$FF          ;one's complement
        ADC #1            ;two's complement
        STA DY            ;DY=-DY

line120:
        JMP	line50


calc_offset:
        LDA	#0            ;Clear ACC
        STA	offset+1      ;Clear high byte
        LDA	Y             ;Get Y value
        STA	offset        ;Set low byte

        ASL	offset        ;Multiply by 2
        ROL offset+1

        ASL offset        ;Multiply by 4
        ROL	offset+1

        CLC               ;Clear Carry
        LDA	Y             ;Get Y
        ADC	offset        ;Multiply by 5
        STA	offset
        LDA	offset+1      ;high byte
        ADC	#0            ;Factor in carry flag
        STA	offset+1

;at this point, offset is Y * 5
        ASL	offset        ;Multiply by 10
        ROL offset+1

        ASL	offset        ;Multiply by 20
        ROL offset+1

        ASL	offset        ;Multiply by 40
        ROL offset+1

        CLC
        LDA	X             ;Get X
        ADC	offset        ;40*y + x
        STA	offset
        LDA	offset+1
        ADC	#0
        STA	offset+1

        ;at this point, offset holds 40*y+x
        RTS

;delay about 1291 clock cycles, not counting 6 from JSR to call
;(assumes the BNE does not cross a page boundary)
delay_FF:
        PHA               ;Save A             (3)
        LDA #$FF          ;Largest value      (2)
        SEC               ;No borrow for SBC  (2)
d_FF_0:
        SBC #1            ;Decrement          (2)
        BNE d_FF_0        ;Continue until 0   (3)(2 not taken)
        PLA               ;Restore A          (4)
        RTS               ;                   (6)



X       .byte   0         ;Variable x
Y       .byte   0         ;Variable y
DX      .byte   0
DY      .byte   0
offset  .word   0         ;offset = X+40*y
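
The calc_offset routine above never uses a multiply instruction (the 6502 does not have one); it gets to 40*y by shifting and adding. If the chain of ASL/ROL pairs is hard to follow, this Python sketch mirrors the same steps with 16-bit arithmetic and checks them against a plain multiplication for every screen position. It is only a model of the routine, assuming the default screen memory at $0400 that the listing uses.

```python
SCREEN_BASE = 0x0400   # default C64 screen memory; the listing adds $04 to the high byte

def calc_offset(x, y):
    """Mirror the shift-and-add steps of calc_offset with 16-bit arithmetic."""
    offset = y
    offset = (offset << 2) & 0xFFFF   # two ASL/ROL pairs:   y * 4
    offset = (offset + y) & 0xFFFF    # add y once more:     y * 5
    offset = (offset << 3) & 0xFFFF   # three ASL/ROL pairs: y * 40
    offset = (offset + x) & 0xFFFF    # add the column:      40*y + x
    return offset

# Check every position on the 40x25 text screen
for y in range(25):
    for x in range(40):
        assert calc_offset(x, y) == 40 * y + x

print(hex(SCREEN_BASE + calc_offset(39, 24)))   # 0x7e7, the bottom-right corner
```

The largest offset is 999 ($03E7), which is why the routine needs a two-byte result even though X and Y each fit in one byte.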

Now, assemble the code, write it to disk, and load it on the Commodore 64/128. When you load it using LOAD "cug_pg65.out",8 (or whatever you named it), you should not need the extra ,1 at the end (although it should have no effect). You can list the program, and you should see:

10 SYS 2088
20 WENTON L. DAVIS, 2021

The number 2088 is not arbitrary. When I assemble the program and examine the cug_pg65.lst file, I want the address of the beginning of the assembly language program, in this case marked with the label main. Looking at the listing file, I see a line telling me that main is located at $0828. I convert that to decimal (since Commodore BASIC has no way of dealing with hexadecimal) to get the number 2088. This is the address I want to give to the SYS command. In this program, the program flow gets "stuck" in the assembly language part of the program and it never returns back to BASIC. If it did return, BASIC would then continue with line 20. Obviously, BASIC has no idea how to deal with my name, and it would generate a ?SYNTAX ERROR message. If the byte $8F were prepended in front of my name, BASIC would interpret the $8F token as REM, and just about any text could be put in the place of my name. The 2088 number would have to be modified, but that is easy enough: just look up (in the cug_pg65.lst file) where the main label is located.
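
If you would rather predict the address than look it up, the bytes of the BASIC stub can simply be counted. The Python sketch below does that counting; stub_layout is a hypothetical helper of my own, and it assumes the layout used in the listing (two-byte link, two-byte line number, the line's bytes, a zero terminator, and a final two-byte zero link).

```python
def stub_layout(start, line_bodies):
    """Return (address of each BASIC line, address of the first byte after the stub)."""
    address = start
    line_addresses = []
    for body in line_bodies:
        line_addresses.append(address)
        address += 2 + 2 + len(body) + 1   # link, line number, body, zero terminator
    return line_addresses, address + 2     # skip the final two-byte zero link

bodies = [
    bytes([0x9E]) + b" 2088",     # line 10: SYS token plus the digits
    b"WENTON L. DAVIS, 2021",     # line 20: plain text
]
lines, main = stub_layout(0x0801, bodies)

print([hex(a) for a in lines])    # ['0x801', '0x80c']
print(hex(main), main)            # 0x828 2088
```

Note the slightly circular part: the digits after SYS are themselves counted, so if the stub grows enough that the address needs a different number of digits, the address shifts again. The listing file remains the final word.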

This way, a person can run the program using the familiar RUN command. The program will run, go to the location given in the SYS command, and continue.

Wenton's email (wenton@ieee.org)
