This is a lab that walks you through one approach to the challenge of how we can have easy and high performance language virtual machines.
It's designed to come after Mario Wolczko's lecture on a Concise and Opinionated History of Virtual Machines, and illustrates practically some of the concepts and techniques introduced there.
This lab is about just one approach to easy and high performance VMs, and that is the Truffle and Graal approach. Truffle is a framework for developing self-specialising AST interpreters in Java, and Graal is a dynamic compiler, which Truffle uses on your behalf to give you a just-in-time compiler for your interpreter, using partial evaluation.
This lab of course won't teach you how to work on a production language or compiler in an hour and a half, but I hope it can concretely do is to show how Truffle makes it easy to implement a language, and how accessible the Graal compiler is, so you know you can reach for these tools if you need them in future.
- Chris Seaton
- chris.seaton@oracle.com
- https://twitter.com/ChrisGSeaton
-
Refresh the big ideas of self-specialising AST interpreters and partial evaluation
-
Set up for Truffle development
-
Tour a simple language interpreter in Truffle
-
Add new functionality to the simple language
-
Combining Truffle with Graal for partial evaluation
-
Tour Graal
-
Stretch ideas
You will need:
-
Internet access
-
Some basic development tools like Python 2 and Git
-
A Java 8 JDK and some Java ecosystem tools like Maven 3
-
An IDE like Eclipse (Neon and Oxygen version both work), or a plain text editor and
grepif you prefer -
Probably in practice you need me to talk you through the tasks - it's supposed to be interactive
Mario talked about two techniques used in implementing high-performance language implementations easily - AST interpretation and partial evaluation. We can demonstrate these two key ideas to ourselves in order to make sure that they're clear.
Here's a simple AST interpreter in Python.
I haven't written a parser, as we'll leave parsers as a solved problem for this lab, and instead we'll create AST nodes directly by using their constructors.
You can see the essential parts of the AST interpreter - the node classes,
and the execute methods. See how the execute method of the AddNode
calls execute on the left and right operands.
class ConstantNode:
def __init__(self, value):
self.value = value
def execute(self):
return self.value
class AddNode:
def __init__(self, left, right):
self.left = left
self.right = right
def execute(self):
return self.left.execute() + self.right.execute()
class SubNode:
def __init__(self, left, right):
self.left = left
self.right = right
def execute(self):
return self.left.execute() - self.right.execute()
print AddNode(ConstantNode(6),
SubNode(ConstantNode(4), ConstantNode(2))).execute()Tasks:
- Add
MulNodeandDivNodefor a multiply and divide operation.
This just means adding a new class, and then you can see how you add the logic
for the operation in the execute method. Notice how the code needed for the
new operations are contained within the new classes - we don't have to modify
a central execute method. This is an advantage of writing AST interpreters.
- Manually partially evaluate your interpreter for the expression
1 + 2 - 3
Partial evaluation is a technique for compiling or executing an AST interpreter. What it means in practice is that we take the code for a call to an execute method and inline that code.
AddNode(ConstantNode(1), ConstantNode(2)).execute()
ConstantNode(1).execute() + ConstantNode(2).execute()
1 + 2
3
This is effectively what the Graal partial evaluator we're going to be using does.
To look at a Truffle language we're going to need the latest release of GraalVM. You can download it for Linux from GitHub, or from OTN for macOS:
- https://github.com/oracle/graal/releases
- http://www.oracle.com/technetwork/oracle-labs/program-languages/downloads/index.html
After extracting it, set the JAVA_HOME environment variable to point to it (to
Contents/Home on macOS).
You then want to clone an example Truffle language, SimpleLanguage. SimpleLanguage is a bit like a simpler version of JavaScript and is heavily commented to show how Truffle is used to build a language. After cloning it you can build it using Maven - it's just a normal Java application.
$ export JAVA_HOME=`pwd`/graalvm-ee-1.0.0-rc3
$ git clone https://github.com/graalvm/simplelanguage.git
$ cd simplelanguage
$ mvn package
Tasks:
-
Compile SimpleLanguage
-
Run the SimpleLangauge
HelloWorld.sltest program
Look at the sl launcher program, and the simplelanguage/language/tests
directory.
- Run some more test programs that look interesting.
SimpleLanguage is designed to be easy to read. A good way to navigate it is to
also use the Eclipse IDE. Install the m2e and m2e-apt plugins from Help,
Eclipse Marketplace, and then do File, Import, Maven, Existing Maven
Projects to import SimpleLanguage.
Tasks:
- Find the AST node Java class that implements the addition operator
Use Navigate, Open Type to search for classes.
You'll see that it appears to have multiple execute methods, unlike our Python
example, and that the results of executing child nodes are passed in as
parameters to the method, rather than executing them yourself. Each overload
matches different pairs of types from the results from the children. This
reduces the number of if branches you need to write to match the types that
your operators support.
Read the comments in this file.
- Find where the parser emits the node
A companion class called SLAddNodeGen is created automatically by Truffle.
Look at SLNodeFactory to see the node being created. Note how the parser
directly creates a SLAddNode - there are no other intermediate
representations.
- Find the
ifnode, and where the parser emits it
The SLIfNode looks different to SLAddNode and doesn't have child values
passed in as parameters. Why do you think that is? What does this logic talking
about profiling do?
- Find the
returnnode - who catches the exception it throws?
SLReturnNode throws an exception. Why is it doing that if there's no error.
Who catches this exception? Use Eclipse's right click and Find References on
the class name to find out.
- Add support for the
<operator comparing two strings for lexical ordering
Can you get this program working by modifying SLLessThanNode? You will need
to run mvn package before running sl again.
function main() {
println("a" < "b");
println("b" < "a");
println("a" < "a");
}
- Run the SimpleLanguage
LoopCall.sltest program
LoopCall.sl runs in a loop, so we'd expect parts of it to get hot and get
compiled. Run with the -J-Dgraal.TruffleBackgroundCompilation=false -J-Dgraal.TraceTruffleCompilation=true options to see this happening. root add is the root node, of the add function, being compiled.
- Dump Graal graphs from
LoopCall.slto IGV
IGV is a graphical tool to understand what Graal and Truffle are doing. Download IGV from GitHub:
https://github.com/oracle/graal/releases
$ idealgraphvisualizer/bin/idealgraphvisualizer &
When you open IGV, click Remove State on the right-hand side, to make the graphs simpler.
Then run sl with the -dump option. Explore the tree on the left of the IGV
screen.
-
Find the Truffle AST graph in IGV
-
Find the add node in the Graal graph for the
addfunction in IGV
Look at After TruffleTier and see if you can make some sense of the compiler
nodes you see there. Can you relate them back to the Java code for the AST
interpreter?
-
Look at the graphics for the
loopfunction. How does Graal represent thewhileloop? How is it different to how thewhileloop is represented in the Truffle AST? -
Dump assembly from
LoopCall.sl, and find the relevantaddassembly instruction in the compiledaddfunction
Use the -disassemble option.
-
Why is the
addinstruction followed by ajoinstruction and what does this do? -
Where does the
joinstruction jump to?
To build Graal in order to open it in an IDE, we need to install a special JVM with a feature called JVMCI. You can get this for Linux from GitHub or OTN for macOS.
- http://www.oracle.com/technetwork/oracle-labs/program-languages/downloads/index.html
- https://github.com/graalvm/openjdk8-jvmci-builder/releases/tag/jvmci-0.46
Then clone our special build tool, mx, and then Graal (clone at depth 1 to
save time).
$ git clone https://github.com/graalvm/mx.git
$ export PATH=`pwd`/mx:$PATH
$ git clone --depth 1 https://github.com/oracle/graal.git
$ cd graal/compiler
$ JAVA_HOME=~/Documents/labsjdk1.8.0_172-jvmci-0.46/Contents/Home mx eclipseinit
Open Eclipse, and open the graal directory as a workspace.
File, Import, General, Existing Projects Into Workspace and select the
graal directory again.
- Find the Graal IR graph node Java class that represents the
addinstruction we had in theaddSimpleLanguage function
Like SimpleLanguage had SLAdd, Graal has a node in the compiler for
IntegerAddExactNode, that we saw when we used IGV. How does this class differ
from the AST node? What do you think the canonical method does?
- Find where this node is created from Java bytecode
Can you find where this node is created? Is is created by a parser? What kind of parser is it?
- Can you follow the code from
IntegerAddExactNodeto find where that jump on overflow branch is created, and what bytes get emitted?
You should end at the jcc method in AMD64Assembler. What is addPatchAt all
about?
-
What happens in
IntegerAddExactSplitNodeif the operation cannot overflow? -
Follow this code through to find what machine-code bytes are emitted for add-with-overflow on the AMD64 architecture
-
Find the Truffle JavaScript implementation, clone it and explore it
-
See if you can add a new language feature to JavaScript