How to start JIT-ting

01 November 2013
compilers

Premise #

Most developers heard about JIT compilers and how they can make slow interpreted languages run at a speed, comparable to native code. However, not many people understand how exactly this JIT thing works, and even less people could write their own compilers.

I think having at least, basic knowledge of compiler internals may greatly improve understanding of the code that is running on that software.

In this article, we'll visit some peaks of JIT-island, and probably even implement a compiler ourselves!

What we'll start with #

Knowing some compiler basics, we can assume that every compiler is transforming input in some format (usually, a source code) into the output in another or same format (usually, a machine code). JIT compilers are not an exception.

What really makes them exceptional, is the fact that they're running not ahead of time (like gcc, clang and others), but Just-In-Time (i.e. right before executing compiler's output).

To start developing our own JIT compiler we'll need to select the input language for it. Considering TOP GITHUB LANGUAGES FOR 2013 (SO FAR), JavaScript seems like a good candidate for implementing some limited subset of it with simplified semantics. Even more, we'll implement JIT compiler in the JavaScript itself. You can call it META-META!

AST #

Our compiler will accept JavaScript source code as its input, and produce (and immediately execute) machine code for the very popular X64 platform. But, while its pretty comfortable for humans to work with a textual representation, compiler developers are usually tending to create multiple Intermediate Representations (IR) before generating the final machine code.

Since we're writing simplified compiler, having only one IR should be enough for us, and I'll choose Abstract Syntax Tree (AST) representation for this purposes.

Getting AST out of JavaScript code is really easy nowadays, and we can choose any (of dozens) library we like: esprima, uglify-js, etc. Just to be on one page with me, I recommend you to choose esprima. It has a nice and well defined output format.

For example, this code: obj.method(42) will produce the following AST (using esprima.parse("...")):

{ type: 'Program',
  body:
   [ { type: 'ExpressionStatement',
       expression:
        { type: 'CallExpression',
          callee:
           { type: 'MemberExpression',
             computed: false,
             object: { type: 'Identifier', name: 'obj' },
             property: { type: 'Identifier', name: 'method' } },
          arguments: [ { type: 'Literal', value: 42 } ] } } ] }

Machine code #

Let's summarize: we have JavaScript source (check), its AST (check), and we want to get machine code for it.

If you're already familiar with assembly language then you can skip this chapter, as it contains only basic introductionary material on this topic. However, if you're new to it, reading next chapter may be hard without learning some basics first. So please stay here, it won't take too long!

Assembly language is the nearest textual representation of the binary code that your CPU(s) understand and is(are) able to run. Considering that processors are executing code by reading and running instructions one-by-one, it may seem logical to you that almost every line in assembly program represent an instruction:

mov rax, 1    ; Put 1 into the register named `rax`
mov rbx, 2    ; Put 2 into the register named `rbx`
add rax, rbx  ; Calculate sum of `rax` and `rbx` and put it into `rax`

This program's output (assuming you'll get it from rax register) is 3. And, as you've probably already figured out, it puts some data in some CPU slots (registers) and asks the CPU to calculate the sum of them.

Usually processors have enough registers to store results of intermediate operations, but in some situations you may want to store/load data (and work with it) from the computer's memory:

mov rax, 1
mov [rbp-8], rbx  ; Save rbx register into a stack slot
mov rbx, 2
add rax, rbx
mov rbx, [rbp-8]  ; Restore rbx register from a stack slot

Registers have names, memory slots have addresses. These addresses are usually written using [...] syntax. For example, [rbp-8] means: take the value of the rbp register, subtract 8, and access a memory slot using the resulting value as the address.

You can see that we're using rbp register here. rbp usually contains address at which on-stack variables storage (i.e. variables that are stored in current procedure's stack) starts; 8 is a size of rbx register (and any other register, prefixed with r), and since the stack is growing upwards, we need to subtract it from rbp to get a free address slot for our purposes.

There are many more nuances of programming at such a low level, and unfortunately I'm not going to cover all of them here. Also, please be aware that I gave you a very shallow description, and what actually happens here may sometimes be much more complex.

Knowing things mentioned above should be enough to proceed to the code generation.

Code generation #

Implementing the entire JavaScript is a rather complicated practice, so we'll implement only a simplified arithmetics engine for now. (Which should be as fun as getting to the whole thing later!)

The best and the easiest way to do it, is to traverse the AST using Depth First Search, generating machine code for each node. You might wonder how could you generate machine code in a memory-safe language like JavaScript. That's where I'm going to introduce you to jit.js.

It is a node.js module (and C++ addon, actually) capable of generating and execution of machine code, using assembly-like JavaScript syntax:

var jit = require('jit.js');

var fn = jit.compile(function() {
  this.Proc(function() {
    this.mov('rax', 42);
    this.Return();
  });
});
console.log(fn());  // 42

Let's write it #

Thus only one thing left now, a module to traverse the AST tree, generated by esprima. Thankfully, considering its structure and our minimalistic compiler design it should be pretty easy.

We're going to support:

Number literals ({ type: 'Literal', value: 123 })
Binary expression, with operators: +, -, *, /, % ({ type: 'BinaryExpression', operator: '+', left: ... , right: .... })
Unary expression, with the - operator ({ type: 'UnaryExpression', operator: '-', argument: ... })

All these operations are performed on integers, so don't expect it to work properly with values like 0.5, 0.66666, etc.

While processing expression, we'll be visiting each supported AST node of it, generating code that returns it's result in the rax register. Sounds easy, right? The only rule here is that we should keep all other registers clean after leaving the AST node. Which, in other words, means that we should save all registers that are used and restore them after they're not needed anymore. Fortunately, CPUs have two magic instructions push and pop that can help us with that task.

Here is the resulting code with descriptive comments:

var jit = require('jit.js'),
    esprima = require('esprima'),
    assert = require('assert');

var ast = esprima.parse(process.argv[2]);

// Compile
var fn = jit.compile(function() {
  // This will generate default entry boilerplate
  this.Proc(function() {
    visit.call(this, ast);

    // The result should be in 'rax' at this point

    // This will generate default exit boilerplate
    this.Return();
  });
});

// Execute
console.log(fn());

function visit(ast) {
  if (ast.type === 'Program')
    visitProgram.call(this, ast);
  else if (ast.type === 'Literal')
    visitLiteral.call(this, ast);
  else if (ast.type === 'UnaryExpression')
    visitUnary.call(this, ast);
  else if (ast.type === 'BinaryExpression')
    visitBinary.call(this, ast);
  else
    throw new Error('Unknown ast node: ' + ast.type);
}

function visitProgram(ast) {
  assert.equal(ast.body.length,
               1,
               'Only one statement programs are supported');
  assert.equal(ast.body[0].type, 'ExpressionStatement');
  visit.call(this, ast.body[0].expression);
}

function visitLiteral(ast) {
  assert.equal(typeof ast.value, 'number');
  assert.equal(ast.value | 0,
               ast.value,
               'Only integer numbers are supported');

  this.mov('rax', ast.value);
}

function visitBinary(ast) {
  // Preserve 'rbx' after leaving the AST node
  this.push('rbx');

  // Visit right side of expresion
  visit.call(this, ast.right);

  // Move it to 'rbx'
  this.mov('rbx', 'rax');

  // Visit left side of expression (the result is in 'rax')
  visit.call(this, ast.left);

  //
  // So, to conclude, we've left side in 'rax' and right in 'rbx'
  //

  // Execute binary operation
  if (ast.operator === '+') {
    this.add('rax', 'rbx');
  } else if (ast.operator === '-') {
    this.sub('rax', 'rbx');
  } else if (ast.operator === '*') {
    // Signed multiplication
    // rax = rax * rbx
    this.imul('rbx');
  } else if (ast.operator === '/') {
    // Preserve 'rdx'
    this.push('rdx');

    // idiv is dividing rdx:rax by rbx, therefore we need to clear rdx
    // before running it
    this.xor('rdx', 'rdx');

    // Signed division, rax = rax / rbx
    this.idiv('rbx');

    // Restore 'rdx'
    this.pop('rdx');
  } else if (ast.operator === '%') {
    // Preserve 'rdx'
    this.push('rdx');

    // Prepare to execute idiv
    this.xor('rdx', 'rdx');
    this.idiv('rbx');

    // idiv puts remainder in 'rdx'
    this.mov('rax', 'rdx');

    // Restore 'rdx'
    this.pop('rdx');
  } else {
    throw new Error('Unsupported binary operator: ' + ast.operator);
  }

  // Restore 'rbx'
  this.pop('rbx');

  // The result is in 'rax'
}

function visitUnary(ast) {
  // Visit argument and put result into 'rax'
  visit.call(this, ast.argument);

  if (ast.operator === '-') {
    // Negate argument
    this.neg('rax');
  } else {
    throw new Error('Unsupported unary operator: ' + ast.operator);
  }
}

You can try it by cloning it from github, running npm install in it's folder and then voila!

$ node ./main.js '1 + 2 * 3'
7

Thanks for reading up to this point! I'll talk about floating point operations and the heap in the next blog post!

Previous: DTrace and the little ustack helper that could
Next: Allocating numbers