Thorsten Ball

Home Posts Talks Projects Contact My Book

A Virtual Brainfuck Machine In Go

04 Jan 2017

You’re a programmer and your product manager walks up to your desk, taps you on the shoulder and asks if you have a couple of minutes to spare. She needs to talk to you about something. You sit down together and she has a serious look on her face. Oh boy. Something’s up. “Do you have anything important on your plate right now? I need you to do something for me.” Here it comes… “I need you to write a Brainfuck interpreter for me. A fast one.”

Some people might say that this conversation will never, ever happen. Well, “better be prepared” is what I say.

Brainfuck

Brainfuck is a weird looking programming language and keeps every promise its name makes. Here is “Hello, World!” in Brainfuck:

++++++++[>++++[>++>+++>+++>+<<
<<-]>+>+>->>+[<]<-]>>.>---.+++
++++..+++.>>.<-.<.+++.------.-
-------.>>+.>++.

If you’re now thinking “Heck, I’d use that in production”, let me tell you that Brainfuck was conceived as a fun, teaching language. Its inventor Urban Müller wanted Brainfuck to be a language that’s easily implementable and thus make it the perfect choice for someone who wants to learn more about interpreters or compilers.

I think, he reached that goal. Implementing Brainfuck is an eye-opening experience. Even though it’s a tiny language, it’s perfectly well-equipped to illustrate a number of concepts behind programming language implementations.

But before we can build Brainfuck, we need to understand how Brainfuck thinks.

Views Of The World

One thing in which programming languages differ is their model of the world and how they make it accessible to their users.

Take C, for example. Leaving aside the multitude of abstractions that hide in the depth of the kernel and the hardware, when working with C you can peek behind the curtain and see the inner workings of your computer. You are pretty close to the hardware-supported stack and you can allocate and free memory on the heap. If you’re experienced and stare intently enough, you can see the actual machine code when looking at your C code. The same goes for C++.

In Forth you mainly work with a stack. You push, you pop, you swap and drop. Nearly everything you do happens on a stack. In Forth, the stack is the world.

In other languages, these underlying assumptions about the mechanics of the world are abstracted away. Even though the current version of the Ruby Virtual Machine has a stack, you won’t notice. You don’t push and pop, but send messages to objects. The same goes for Java. You have classes that inherit from each other and memory allocation only concerns you in so far as the garbage collector shows up on time.

Then there are some languages that explicitly tell you what their world looks like. Especially intermediate languages, which are not meant to be written by hand, but are representation of end-user languages and easier for computers to understand and optimize. WebAssembly, for example, represents the commands of a stack-based machine, that gets then emulated by a runtime (which will be a browser, most of the time). Java bytecode is a representation of Java code in the world of a stack machine.

Brainfuck Machines

And then there’s Brainfuck. Brainfuck doesn’t just tell you what its view of the world is, no, it smacks you over the head with it.

Brainfuck is based on the assumption that Brainfuck code will be executed by a Brainfuck machine. Just like the PUSH and POP operations in Java bytecode assume that the JVM manages a stack, the + and - in Brainfuck assume that there’s a Brainfuck machine which supports these two instructions.

So what does this Brainfuck machine look like? Not too complicated! It only has a few parts:

That’s it. Those are all the parts of a complete, working Brainfuck machine that can execute Brainfuck code. So let’s take a closer look at Brainfuck code.

The Instructions

Brainfuck is tiny. It consists of eight different instructions. These instructions can be used to manipulate the state of the Brainfuck machine:

That’s all of it, the complete Brainfuck language.

Even though these instructions look archaic, they’re just identifiers. Replace + with PLUS, - with SUB, . with PRINT and [ with LOOP and suddenly Brainfuck starts to look more like Brain-oh-wow-wait-a-second-I-can-actually-read-that.

Now that we know what the machine should look like and what it has to do, let’s get started with building it.

Building The Machine

The basic structure will be called - you guessed it - Machine and looks like this:

// machine.go

type Machine struct {
	code string
	ip   int

	memory [30000]int
	dp     int

	input  io.Reader
	output io.Writer
}

func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
	return &Machine{
		code:    code,
		input:   in,
		output:  out,
	}
}

As you can see, everything we’ve talked about is here: the code, the instruction pointer (ip), the memory, the data pointer (dp) and both the input and output streams.

Now we just need a method that can start this Machine and get it to execute code:

// machine.go

func (m *Machine) Execute() {
	for m.ip < len(m.code) {
		ins := m.code[m.ip]

		switch ins {
		case '+':
			m.memory[m.dp]++
		case '-':
			m.memory[m.dp]--
		case '>':
			m.dp++
		case '<':
			m.dp--
		}

		m.ip++
	}
}

Here we step through every instruction in m.code until we reach its end. In order to execute each instruction individually, we have a switch statement, that “decodes” the current instruction and manipulates the machine according to which instruction it is.

In the case of + and - we manipulate the current memory cell, incrementing and decrementing its value respectively. The current memory cell is pointed to by the data pointer, m.dp, and we can get to it with m.memory[m.dp]. And in order to change the data pointer itself, we have two case branches for > and <.

So far, so good. But we’re missing printing and reading, the . and , instructions. In order to implement support for those, we need to make a slight modification: we need to give our Machine a one-byte buffer slice.

// machine.go

type Machine struct {
// [...]
	buf []byte
}

func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
	return &Machine{
// [...]
		buf: make([]byte, 1),
	}
}

With that in place, we can add two new methods called readChar and putChar:

// machine.go

func (m *Machine) readChar() {
	n, err := m.input.Read(m.buf)
	if err != nil {
		panic(err)
	}
	if n != 1 {
		panic("wrong num bytes read")
	}

	m.memory[m.dp] = int(m.buf[0])
}

func (m *Machine) putChar() {
	m.buf[0] = byte(m.memory[m.dp])

	n, err := m.output.Write(m.buf)
	if err != nil {
		panic(err)
	}
	if n != 1 {
		panic("wrong num bytes written")
	}
}

readChar reads one byte from the input, which will be os.Stdin, and then transfers this byte to the current memory cell, m.memory[m.dp]. putChar does the opposite and writes the content of the current memory cell to the output stream, which will be os.Stdout.

It has to be said, that instead of doing proper error handling here, we just let the machine blow up by calling panic. That shouldn’t happen, of course, when we plan to use it in production (I dare you), so keep that in mind.

Using these two methods means adding new case branches to the switch statement in Execute:

// machine.go

func (m *Machine) Execute() {
	for m.ip < len(m.code) {

// [...]
		case ',':
			m.readChar()
		case '.':
			m.putChar()
// [...]

	}
}

And with that, our Brainfuck machine can read and print characters! It’s time to move on to the hairiest part of the implementation.

Looping

Brainfuck’s two control flow instructions are [ and ]. And they’re not quite like loops or other control flow mechanisms in “normal” languages. Expressed in some Go-like dialect of pseudo-code, what they do is this:

switch currentInstruction {
case '[':
  if currentMemoryCellValue() == 0 {
    positionOfMatchingBracket = findMatching("]")
    instructionPointer = positionOfMatchingBracket + 1
  }
case ']':
  if currentMemoryCellValue() != 0 {
    positionOfMatchingBracket = findMatching("[")
    instructionPointer = positionOfMatchingBracket + 1
  }
}

Note the two different conditions of the if-statements. They are the most important bits here, because they give both instructions separate meaning. Here’s an example to see how [ and ] can be used:

+++++   -- Increment current cell to 5
[       -- Execute the following code, if the current cell is not zero
->      -- Decrement current cell, move data pointer to next cell
+<      -- Increment current cell, move data pointer to previous cell
]       -- Repeat loop if current cell is non-zero

This snippet increments the current cell to 5 and then uses [ and ] to add the cell’s value to the next cell, by decrementing and incrementing both cells in a loop. The body of the loop will be executed 5 times until the first cell contains zero.

Of course, implementing the “does the current memory cell hold zero or not?” check is not the problem. Finding the matching brackets is what’s hairy about this, because brackets can be nested. It’s not enough to find the next ] when we encounter a [, no, we need to keep track of every pair of brackets we find. How are we going to do that? With a simple counter! Here is the pseudo-code from above turned into real Go code:

// machine.go

func (m *Machine) Execute() {
	for m.ip < len(m.code) {
		ins := m.code[m.ip]

		switch ins {
// [...]
		case '[':
			if m.memory[m.dp] == 0 {
				depth := 1
				for depth != 0 {
					m.ip++
					switch m.code[m.ip] {
					case '[':
						depth++
					case ']':
						depth--
					}
				}
			}
		case ']':
			if m.memory[m.dp] != 0 {
				depth := 1
				for depth != 0 {
					m.ip--
					switch m.code[m.ip] {
					case ']':
						depth++
					case '[':
						depth--
					}
				}
			}
		}

		m.ip++
	}
}

Let’s take a closer look at the case '[' branch.

Here we check whether the current memory cell’s value is zero and if it is, we try to set the instruction pointer, ip, to the position of the matching ]. In order to do that correctly in the face of nested bracket pairs, we use depth as a counter. With each [ we pass, we increment the counter, and with each ] we decrement it. Since it’s set to 1 initially, we know that we are sitting on our matching ] when depth is 0. And that means that m.ip is set to the correct position. The m.ip++ at the end of the for-loop does the rest and sets the instruction pointer to the instruction right after the matching bracket.

The case ']' branch is the mirrored version, where we walk backwards in the instructions, trying to find the matching [.

It’s time to flip the power switch on this machine.

Hello World

Here is a small driver, that reads in a file and passes it to our Brainfuck machine:

// main.go

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	fileName := os.Args[1]
	code, err := ioutil.ReadFile(fileName)
	if err != nil {
		fmt.Fprintf(os.Stderr, "error: %s\n", err)
		os.Exit(-1)
	}

	m := NewMachine(string(code), os.Stdin, os.Stdout)
	m.Execute()
}

That gives us the possibility to run Brainfuck programs on the command line:

$ cat ./hello_world.b
++++++++[>++++[>++>+++>+++>+<<
<<-]>+>+>->>+[<]<-]>>.>---.+++
++++..+++.>>.<-.<.+++.------.-
-------.>>+.>++.

$ go build -o machine && ./machine ./hello_world.b
Hello World!

It talks! Sweet! Our Brainfuck machine works!

So slow!

I have some good and some bad news. Our product manager said that the Brainfuck interpreter needs to be fast and, sadly, ours isn’t. That’s the bad news.

On my computer, our machine currently takes around 70 seconds to execute mandelbrot.b, a mandelbrot set fractal viewer written in Brainfuck by Erik Bosman, that’s often used as a benchmark for Brainfuck interpreters. That’s slow.

$ go build -o machine && time ./machine ./mandelbrot.b >/dev/null
 ./machine ./mandelbrot.b > /dev/null  68.24s user 0.18s system 99% cpu 1:08.60 total

The good news is, that there are a few things we can do to make it faster.

Take a look at the hello_world.b example from above or the mandelbrot.b program. See all those runs of + and -? There are a lot of instructions of the same type right behind each other in Brainfuck programs. And we have to read each one, check which one it is and then execute it.

The overhead of doing this is high. Consider this Brainfuck snippet: +++++. In order to execute it, we need five cycles of “fetch the next instruction”, “what instruction do we have here?” and “execute this!”. That turns into us incrementing the value of the current memory cell by one five times. It would give us a huge performance boost if we could just increase the current cell’s value by five directly.

The other thing that’s slowing us down is the way we handle [ and ]. Every time we stumble upon such a bracket, we go looking for its matching counterpart again. Scan the program, keep track of all the other brackets we pass and then modify the instruction pointer. The longer the program, the longer this will take. If we could do that just once for each bracket and remember the position of its matching counterpart, we wouldn’t need to rescan the program again and again.

And here’s the best of news: we can! We can do all of this before we even start up our Brainfuck machine. We can turn +++++ into something that says “increase by 5”. We can also do the same for -, >, <, ., and ,. And we can find and remember the positions of matching bracket pairs beforehand. All we need to do is create another representation of the original Brainfuck code that can include these optimizations and have our machine execute this instead.

A New Instruction Set

Up until now we’ve used a string to represent the code, that’s to be executed by the Machine. But in order to make optimizations, we need a new instruction set. Here is the Instruction type, that makes up the new set:

// instruction.go

type InsType byte

const (
	Plus          InsType = '+'
	Minus         InsType = '-'
	Right         InsType = '>'
	Left          InsType = '<'
	PutChar       InsType = '.'
	ReadChar      InsType = ','
	JumpIfZero    InsType = '['
	JumpIfNotZero InsType = ']'
)

type Instruction struct {
	Type     InsType
	Argument int
}

Each Instruction has a Type and an Argument. The Type can be one of the predefined constants defined at the top, where each constant has a corresponding Brainfuck instruction. The interesting part here is the Argument field. This field allows us to make our instruction set much more dense than the original Brainfuck code. We can put more information in less instructions. We’ll use Argument in two ways:

Now that we have our new Instruction type and know how this new instruction set is to be interpreted, we can modify our Machine to do exactly that. The first thing we need to do is to change its definition, so it doesn’t work with a string anymore, but with a slice of *Instruction:

// machine.go

type Machine struct {
	code []*Instruction
	ip   int

	memory [30000]int
	dp     int

	input  io.Reader
	output io.Writer

	readBuf []byte
}

func NewMachine(instructions []*Instruction, in io.Reader, out io.Writer) *Machine {
	return &Machine{
		code:    instructions,
		input:   in,
		output:  out,
		readBuf: make([]byte, 1),
	}
}

With that change made, the Execute method of the Machine now also needs to work with this new type of instruction set:

// machine.go

func (m *Machine) Execute() {
	for m.ip < len(m.code) {
		ins := m.code[m.ip]

		switch ins.Type {
		case Plus:
			m.memory[m.dp] += ins.Argument
		case Minus:
			m.memory[m.dp] -= ins.Argument
		case Right:
			m.dp += ins.Argument
		case Left:
			m.dp -= ins.Argument
		case PutChar:
			for i := 0; i < ins.Argument; i++ {
				m.putChar()
			}
		case ReadChar:
			for i := 0; i < ins.Argument; i++ {
				m.readChar()
			}
		case JumpIfZero:
			if m.memory[m.dp] == 0 {
				m.ip = ins.Argument
				continue
			}
		case JumpIfNotZero:
			if m.memory[m.dp] != 0 {
				m.ip = ins.Argument
				continue
			}
		}

		m.ip++
	}
}

That’s a lot cleaner than what we had before, right? And it’s faster, too! Well, I can’t prove it yet, because there’s still a piece missing: something that turns Brainfuck code into a slice of *Instructions.

Compiling Brainfuck

Wikipedia defines a compiler as:

a computer program (or a set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language)

That’s exactly what we need! A program that takes Brainfuck code and turns it into our new “language”, which is made up of our Instructions.

And that’s also a pretty clear definition of requirements, which allows us to define our Compiler:

// compiler.go

type Compiler struct {
	code       string
	codeLength int
	position   int

	instructions []*Instruction
}

func NewCompiler(code string) *Compiler {
	return &Compiler{
		code:         code,
		codeLength:   len(code),
		instructions: []*Instruction{},
	}
}

The Compiler is constructed with the original Brainfuck code as a string and has an empty instructions slice that will be filled. That’s the job of the Compile method:

// compiler.go

func (c *Compiler) Compile() []*Instruction {
	for c.position < c.codeLength {
		current := c.code[c.position]

		switch current {
		case '+':
			c.CompileFoldableInstruction('+', Plus)
		case '-':
			c.CompileFoldableInstruction('-', Minus)
		case '<':
			c.CompileFoldableInstruction('<', Left)
		case '>':
			c.CompileFoldableInstruction('>', Right)
		case '.':
			c.CompileFoldableInstruction('.', PutChar)
		case ',':
			c.CompileFoldableInstruction(',', ReadChar)
		}

		c.position++
	}

	return c.instructions
}

That looks remarkably close to the Execute method of the current and previous versions of our Machine. But there’s a huge difference: whereas the Machine executed the Brainfuck instructions directly, our Compiler now turns them into *Instructions, so they can be executed later. Here is what the CompileFoldableInstruction method does:

// compiler.go

func (c *Compiler) CompileFoldableInstruction(char byte, insType InsType) {
	count := 1

	for c.position < c.codeLength-1 && c.code[c.position+1] == char {
		count++
		c.position++
	}

	c.EmitWithArg(insType, count)
}

func (c *Compiler) EmitWithArg(insType InsType, arg int) int {
	ins := &Instruction{Type: insType, Argument: arg}
	c.instructions = append(c.instructions, ins)
	return len(c.instructions) - 1
}

Together with EmitWithArg the CompileFoldableInstruction method scans through the input code (the Brainfuck string code) to see if the current instruction is followed by other instructions of the same type. If that’s the case, it folds those Brainfuck instructions into one Instruction.

EmitWithArg is a helper method that creates a new *Instruction, adds it to the c.instructions slice of the Compiler and returns the position of this newly created instruction in c.instructions.

Returning the position of the newest instruction is an important detail, because we’re going to need it now. As you may have noticed, we didn’t add support for [ and ] to our Compiler yet. That’s because these are not foldable instructions (e.g.: we cannot turn [[[ into a single instruction), but need to do something more elaborate.

Compiling Loops

We have two loop instructions: [ and ]. And we want to turn them into JumpIfZero and JumpIfNotZero instructions, where the Argument field contains the position of the matching bracket. That is: the position of the matching counterpart Instruction in the final instructions slice.

That’s easier said than done, though. The problem is that when we encounter a [ we don’t know where in the final instructions slice the matching ] instruction will end up. Counting the instructions in between doesn’t work, because it’s possible that those will be folded together in the next compilation step and thus invalidate the position we got through counting.

Then there’s also the problem of remembering the position of the last JumpIfZero instruction, so it can be used as Argument when constructing the matching JumpIfNotZero instruction.

But here’s what we’re going to do, here’s how we’re going to solve these problems. First, we will emit a JumpIfZero instruction for each [ we encounter, with the placeholder value 0 in the Argument field. Later, when we have constructed the matching JumpIfNotZero instruction, we’re going to come back to this instruction and change its Argument to the real value.

In order to later be able to change them, we need to keep track of JumpIfZero instructions. And we’re going to use a stack to do that, implemented with a simple Go slice:

// compiler.go

func (c *Compiler) Compile() []*Instruction {
	loopStack := []int{}

	for c.position < c.codeLength {
		current := c.code[c.position]

		switch current {
		case '[':
			insPos := c.EmitWithArg(JumpIfZero, 0)
			loopStack = append(loopStack, insPos)
// [...]
		}

		c.position++
	}

	return c.instructions
}

loopStack, which acts as a stack onto which we can push elements and later pop them off, is just an empty slice. There’s not much to it. Interesting here is the case branch for the [ instructions. Just like we discussed, we emit a new JumpIfZero instruction with a placeholder Argument. Then comes the important part: we push the position of the new JumpIfZero position onto our loopStack.

That, in turn, allows us to correctly handle ] instructions:

// compiler.go

func (c *Compiler) Compile() []*Instruction {
// [...]
	case ']':
		// Pop position of last JumpIfZero ("[") instruction off stack
		openInstruction := loopStack[len(loopStack)-1]
		loopStack = loopStack[:len(loopStack)-1]

		// Emit the new JumpIfNotZero ("]") instruction,
		// with correct position as argument
		closeInstructionPos := c.EmitWithArg(JumpIfNotZero, openInstruction)

		// Patch the old JumpIfZero ("[") instruction with new position
		c.instructions[openInstruction].Argument = closeInstructionPos
// [...]
}

We pop the position of the last JumpIfZero instruction, the opening [, which still holds a placeholder 0 as Argument, off the stack, and use it as the correct Argument for a new JumpIfNotZero instruction.

And since we now have the position of the JumpIfZero instruction, we can access it in c.instructions and change its Argument from 0 to the correct position of the new JumpIfNotZero instruction!

Isn’t that neat? Now our Compiler takes this piece of Brainfuck code

+++[---[+]>>>]<<<

And turns it into these Instructions:

[]*Instruction{
  &Instruction{Type: Plus, Argument: 3},
  &Instruction{Type: JumpIfZero, Argument: 7},
  &Instruction{Type: Minus, Argument: 3},
  &Instruction{Type: JumpIfZero, Argument: 5},
  &Instruction{Type: Plus, Argument: 1},
  &Instruction{Type: JumpIfNotZero, Argument: 3},
  &Instruction{Type: Right, Argument: 3},
  &Instruction{Type: JumpIfNotZero, Argument: 1},
  &Instruction{Type: Left, Argument: 3},
}

All that’s left to do now is making use of it.

Starting The Faster Machine

In order to make use of our optimized Machine and our shiny new instruction set, we have to use our Compiler when we read in a file of Brainfuck code:

// main.go

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	fileName := os.Args[1]
	code, err := ioutil.ReadFile(fileName)
	if err != nil {
		fmt.Fprintf(os.Stderr, "error: %s\n", err)
		os.Exit(-1)
	}

	compiler := NewCompiler(string(code))
	instructions := compiler.Compile()

	m := NewMachine(instructions, os.Stdin, os.Stdout)
	m.Execute()
}

That’s looks a lot like our old driver. But instead of reading in a file and passing its content to our Brainfuck machine, we first compile the original Brainfuck code in the file to our new Instruction set. And these Instructions will then be executed by our Machine.

If we now run this with the mandelbrot.b benchmark we can see that our work paid off: what took 70s before now only takes 13s!

$ go build -o machine && time ./machine ./mandelbrot.b >/dev/null
./machine ./mandelbrot.b > /dev/null 13.43s user 0.04s system 99% cpu 13.496 total

Isn’t that something?

Taking A Closer Look

Yes, we’ve only implemented Brainfuck, a language with no syntax to speak of and only eight different instructions. You might be tempted to call our two Brainfuck machines toys. But let’s take a look at what we actually did.

The first thing we built is an interpreter that acts as a Brainfuck machine. It has all the necessary parts: memory cells, data and instruction pointers, input and output streams. The interpreter effectively tokenizes its input by processing it byte by byte. It then evaluates each token on the fly. It’s not much longer than 100 lines, but has all the essential parts of a fully-grown interpreter.

And then we’ve built a compiler! Sure, it doesn’t output native machine code and it’s really simple, but it’s a compiler nonetheless! It takes Brainfuck code as input and outputs instructions for a machine - our Brainfuck machine. That’s the basic idea behind compilers. We could also change the way our Instructions are stored and passed around, and then we’d realize that our Machine is now a virtual machine and is executing bytecode.

Now, that doesn’t sound like toys, does it? What we built is using the same blueprints a lot of other, mature and production-ready programming languages use. Once you’ve understood how and why they work, you start to recognize them in other languages, too, and in turn understand these languages better.

And that’s why I think implementing Brainfuck can be a rewarding and eye-opening experience.

You can find the complete code, including tests, for both versions of the Brainfuck machine here on GitHub.

Follow me on twitter: @thorstenball. Or send me an email to me@thorstenball.com. Or check out my book at interpreterbook.com.

I also maintain a mailing list, on which I sent out occasional updates about my book or this blog. I won't spam you and you can unsubscribe at any time.

Comments

comments powered by Disqus