J5 Virtual Machine
Table of Contents
Overview
For a while now, I had wanted to implement simple VM. After reading a few books on interpreters and compilers I decided to give it a go. I named it J5 after the robot from the movie Blank Man.
J5 is a minimal bytecode VM that is designed to make small graphical demos. The VM renders a 32x32 pixel screen at 60 frames per second. It has the ability to do basic drawing commands like setting the pen color, drawing lines or copying a sprite to the screen.
I took a lot of inspiration after PICO-8, Chip-8 and the Atari 2600. I especially liked how with PICO-8 there is a VM implementation for desktop as well as the web. I wanted J5 to have a single binary format that I could run on either a desktop VM or in the browser.
Bytecode VM Spec
J5 is a implemented as a stack machine for simplicity. Most of the OpCodes are designed to make it easy to draw graphics.
Memory Layout
- 8 bytes of user RAM
- 1 byte to indicate the current color
- 1024 bytes for the 32 x 32 screen frame buffer
- 8 8x8 monochrome sprites
OpCodes
- 0x00,
constant
- pushes a single byte integer argument onto the stack - 0x01,
color
- uses the value at the top of the stack for the color - 0x02,
pixel
- takes the top 2 items from the stack them as x and y values - 0x03,
store
- stores the byte at the top of the stack in the argument address - 0x04,
load
- push the value in the argument address onto the stack - 0x05,
add
- add the top two values together on the stack and push the result - 0x06,
sub
- subtract the top two values together on the stack and push the result - 0x07,
mul
- multiply the top two values together on the stack, push the result - 0x08,
div
- divide the top two values together on the stack, push the result - 0x09,
greater
- compares the top two values, pushes 1 to stack if lower value is greater than higher - 0x0A,
lesser
- compares the top two values, pushes 1 to stack if lower value is lesser than higher - 0x0B,
equal
- compares the top two values, pushes 1 to stack if lower value is equal to the higher - 0x0C,
jumpif
- sets program counter to the argument address if the value on the top of the stack is 0 - 0x0D,
nop
- just increments the program counter - 0x0E,
jump
- sets the program counter to argument address unconditionally - 0x0F,
line
- pops 4 items off the stack (x1, y1, x2, y2) and draws line connecting them - 0x10,
clear
- sets the entire screen to the current color - 0x11,
enterframe
- set the frame program counter to the current value + 1 - 0x12,
getpixel
- pop 2 values off stack return color value at x and y location - 0x13,
rnd
- pop max value off stack, push random value between 0 and max - 1 - 0x14,
loadsprite
- pops sprite index off stack, then 8 more values off the stack, gets loaded into the 8x8 sprite at index - 0x15,
sprite
- pops index then the x and y coordinate off the stack, draws the sprite at index
VM Implementations
I wrote the C implementation and the compiler pretty much simultaneously. I would first implement the OpCode in C then write the Lisp s-expression macro for it. Once the C VM was working I would port that code over to the JavaScript VM. I wanted to make sure that both VMs could run the same bytecode binary program.
The VM program has two main entry points: init
, and frame
. The init
entry point is a code block that is run first. This was needed to setup any variables you might have. The frame
entry point is run after init
60 times per second.
C Implementation
The C VM is fairly simple. There is one large switch case statement that handles each different OpCode. Depending on the rules for that OpCode values are pushed or popped off of the stack. The C implementation uses SDL2 to render the VM.
JavaScript Implementation
The JavaScript VM is basically a straight port of the C one. I will say that compared to the C program it was much more difficult to debug in Firefox. I was surprised at how easy it was to crash the entire browser with an infinite loop in some JavaScript. Instead of using SDL2 for rendering, the JavaScript VM just uses a canvas tag. It uses the fetch API to load the J5 bytecode binary from the server.
Lisp Compiler
I wrote the compiler for J5 in Common Lisp and made extensive use of macros. Using Lisp I was able to build a sub langauge out of S-Expressions that allowed me to write J5 programs in a high level language. This was much easier than hand writing the bytecode binary.
Why Lisp?
After reading through Writing an Interpreter in Go I noticed that a large amount of code went toward parsing statements into an AST. While reading the book, I thought that just using S-Expressions and Lisp macros would make it much easier to implement a compiler. Since S-Expressions are basically already ASTs and I was familiar with Lisp syntax I went this route.
Wrapping Statements
I chose to use macros that start with $
character to make up the J5 language. For example ($add x y)
or ($frame)
. These macros would wrap statements with lambda functions that are called recursively at compile time. The underlying functions would be responsible for emitting the opcode corresponding to the function. For example calling ($add 1 2)
would emit OpCode 0x05 for add followed by two values. The tree structure of S-Expressions allowed for nesting without any additional work on the compiler. This would allow for statements like ($add ($add 1 2) 3)
. While this was an efficient way to implement a bytecode compiler, I encountered some bugs around wrapping variables took me quite a while to figure out.
Compiling blocks
When I needed to implement the jump and conditional jump OpCodes, I wasn't initally sure how to get program counter index to jump to. I am sure there is probably a more elegant solution but I opted to just run a nested compiler for the block. I would then count the number of bytes from the compiler output and add this to the initial program counter value. I ran into some macro hygiene issues when running a nested compiler but once I got those figured out it seemed to work fine. Using this technique I was able to add loops and conditionals to the compiler.