Lizards are cool

Yesterday Janie and I spent the day in Fredricksburg, a quaint German town about an hour east of Austin. Its a great place for beer, and German food. While there we went to the National Museum of the Pacific War (the famous Admiral Nimitz was born in Fredricksburg).

While there we saw this old WWII tank:

On closer inspection, we saw something unexpected and rather cool, here it is blown up:

Its a small lizard, which had changed to be almost exactly the same color as the tank!

Swarm: A true distributed programming language

“Cloud computing” is becoming such a popular buzzword that it now even has its very own backlash.  Since I designed Freenet, I’ve been interested in the problem of building computing systems that can scale effectively.  While Freenet focused on the storage and retrieval of information, I’ve often wondered whether something similar could be created for computation in general.

The result of this is “Swarm”.  Before anyone gets too excited, note that Swarm is currently just a proof-of-concept prototype, not anywhere close to being suitable for actual use. That being said, there is working code.

Fundamentals

The fundamental concept behind Swarm is that we should “move the computation, not the data”.

The Swarm prototype is a simple stack-based language, akin to a primitive version of the Java bytecode interpreter.  I wanted the proof of concept to be quick to implement, while demonstrating that the concept could work for a popular runtime like the JVM or Microsoft’s CLR.

The Prototype

The prototype is implemented in Scala, and I will use snippets of Scala code below, but a knowledge of Scala won’t be required to understand the rest of this article.  I chose Scala because I wanted to learn it, and because its rich semantics tends to make coding easier and faster than Java (my normal language of choice).

As with the JVM, there are three places to store data in the Swarm VM: the stack, a local variable array, and the store.  The stack is used for intermediate values in computations, data here tends to be very short-lived.  In the prototype it is implemented as a List[Any].  The local variable array is for data that is used within a block of code (its implemented as a Map[Int, Any]).

The “Store”

The “store” is somewhat analogous to the JVM heap.  It is used for long-term storage of data, indeed, in an actual implementation it may be persistent, and/or transactional, but in the prototype it is in-memory.  The store contains “objects”, each of which is a list of key-value pairs.  The values may be references to other objects.  The store is implemented as a Map[Int, Map[String, Any]].

The store is the key to how Swarm is distributed.  A store can be spread across multiple computers, and objects in a store can point to objects on remote computers.  While the objects stored on different computers are different, all the computers contain the same computer program (or must have access to it).

When some code is running on a computer that wants to read or write to an object stored locally, this can be done as it would with any other computer, a very efficient operation.  But what if the code needs to access a remote object?  Well, in this case, rather than moving the remote object across the network to the local computer, we freeze the computation, and move it to the remote computer, where it resumes execution – now able to access the object directly.

Continuations

Freezing the code execution in this way is called a “continuation”.  A continuation is akin to using the “Save Game” feature in a computer game.  It allows you to start the game from the same place in the future, or even to email the saved game to a friend of yours and have them continue where you left-off.

Continuations are supported by some advanced programming languages like Haskell and Ruby, but not other more common languages like Java and Python (at least, not directly).

In practice, the continuation must contain the current stack, the current local variables, and the program counter.  These three things represent the “state” of a Swarm computer program.  If you can transmit these three things to another computer that also has access to the same computer program, then it can pick up where you left off, and the author of the computer program doesn’t need to know or care about this little bit of magic.

Of course, sending this state across the network, even though it will typically be quite a small amount of data (probably fitting in a single IP packet), will still introduce a significant delay relative to the normal speed of code execution.  For this reason, Swarm must try to ensure that objects are distributed among the computers in the cluster in such a way that over-the-network traffic is minimized.

Deciding where objects should go

Our goal is to minimise the number of times continuations must be packages up and sent across the network.  We can do this by ensuring that objects which are “tightly coupled”, meaning that the reference from one to the other are frequently used by the code, reside on the same computer.

For now we use a very simple approach to this.  When an object is created, we attempt to put it on the same computer as the last object accessed.  This is easy, because this is the computer that will be executing the code.  The only situation under which we don’t do this is if the current server has too many objects relative to the other servers, in which case we create the object on the least-full server.  This helps to balance objects across the servers.

Of course, this is a very simple approach, in a real-world implementation we’d want much more sophisticated load-balancing that monitored actual usage and moved objects as necessary.

The simulation

To test these ideas I wrote a simulation in Scala.  My goal was to create something (to quote Einstein) as simple as possible, but not simpler.

First I created a simple stack-based language.  It supports primitive control commands, Goto and If.  It also supports mathematical operations like addition, generation of random numbers, and greater than comparison.  Lastly, it supports commands to create, modify, and retrieve objects in the store.  Recall that each object is a list of key-value pairs, where the key is a string, and the value can be a number, string, or a reference to another object.  At the time of writing the language doesn’t support subroutines, but these could be added with relative ease.  You can find the definitions of these commands in Instruction.scala.

I used immutable datastructures for the state of the execution, a List for the stack (Scala’s Stack implementation is buggy – yay Scala!), and an IntMap for the local variables.

The simulated Store

Next I implemented the Store class.  This is a map of integers to objects.  Each object is itself a map of strings to arbitrary values, which could include integers, strings, ObjectRef objects, among others.

Overseeing the various Stores is a Cluster.  Its role is to tell us when we should create new objects, using its loadBalance() function.  It will generally create the object on the current computer (aka “node”), but will sometimes say that it must be created on another node to ensure objects are reasonably well distributed.

The load balancing algorithm is simple: No store is permitted to have less than half the free-space of the least-full store.  If it is, then the object is created on the least-full store instead.

A test program

Next I needed to write a simple program to see how the Swarm prototype performs.  I needed something simple, but which would hopefully be representative of the way real-world software creates and uses objects.

I decided to implement a simple binary-search tree, and then insert random numbers into it.  It took me a while, and was rather tedious, but eventually I got something working.  You can see the end result in Test2.scala.  Initially line numbers had to be specified explicitly, this was a royal PITA, so I later implemented a label mechanism, and a “compilation” pre-processing stage (implemented in Interpreter.scala) that assigned line-numbers to the labels.

The experiment

I created 5 virtual computers, each with a capacity of 20 objects.  Of course, this is a ludicrously small number of objects (in the real world, a node would be able to handle millions, even billions of objects), but I wanted to stress the system.

Results

Here is a graph of the resultant tree, the color indicates which of the computers the object is stored on – as you can see the root of the tree was stored on the pink computer.
g1.pdf (1 page)

In this experiment, I show the percentage of instructions that result in a continuation being transmitted between computers as the size of the tree is increased:

Microsoft Excel

As can be seen, it increases rapidly before leveling out around 3.25%.

Analysis

Even the simplistic load-balancing approach employed in the simulation seems to be reasonably effective, only 3.5% of instructions required some communication between nodes.  A number of factors conspired to make this a difficult task for it:

  • The insertion of data to random points in the tree make it more difficult for the algorithm to group branches of the tree together on the same computer
  • Each node had very limited capacity

Immediate future

Continuous re-allocation of objects between computers in response to actual usage is key.  This will need to be a trade-off between efficiency of the re-balancing process, and its effectiveness.  Near-term work on the simulation is likely to focus on this.

Long term

I guesstimate that turning this concept into an actual working piece of software would be a 5 man-year effort.  Among the major questions/challenges are:

Can we work with an existing virtual machine, or do we need a new language?

Ideally we would be able to make Swarm support a widely-used bytecode language such as the Java Virtual Machine, Microsoft’s Common Language Runtime, or the LLVM compiler infrastructure.  Swarm’s simple bytecode is a simplification of the Java Virtual Machine for this purpose.  However, it is possible that we will uncover reasons that we need a new language for this.  That would be unfortunate if the case, as it would create a significant barrier to entry.

What about persistence, transactions, and replication?

If we want Swarm to have the potential to replace databases, then it will need to be able to store objects persistently (ie. on disk).  It will also need to support transactions (so that we can be sure that the stored data is always in a consistent state).  We will also need a solution to ensure that it is fault-tolerant, meaning some kind of replication scheme.

Wrap up

Right now Swarm is still more of a thought-experiment for me, as I am still very-much focussed on my day job.  That being said, I have had some conversations with a number of people who have expressed an interest in providing funding to make Swarm a reality.  As I mentioned before, it would be a big task, probably requiring around 5 very smart people working full-time for at least a year.  The potential, however, is huge – just imagine being able to write massively scalable applications without having to think about scalability and persistence!

Source code

You can find the Scala source code in a public git repository here.  The code is not that well commented, and at the time of writing it will spit out a Graphviz file to produce the tree diagram above, its not really intended for public consumption – but it may be interesting for people to browse.

I will grant push access to the repository to anyone that thinks they can make a useful contribution, just email me at ian dot clarke at gmail dot com.

Note: Its been pointed out that Swarm is also the name of a multi-agent research platform. Swarm is just the name of the idea at this stage, if it ever turns into an actual programming language I’ll find a better name for it

Making sense of the economic crisis

Like most people, I’ve been trying to make sense of the economic crisis.  Yesterday NPR’s “This American Life” did a great radio show on the situation, it is really excellent.  Listen to the mp3:

If you would like a day-to-day update, NPR’s Planet Money show is excellent, subscribe to the podcast here.

The free-market, more accurate than polls?

HubDub is a great company based in Edinburgh (where I used to live), who apply the principles of the free market to predicting the outcomes of news stories. I’ve had a good chat with them, and they say that their market tends to be spookily accurate. I’ve heard anecdotal evidence that when this type of market is applied to predicting when internal projects will be completed within Microsoft, they were universally more pessimistic, and more accurate, than the project manager’s estimates.

Here is the current prediction for who will win the US Presidential election, right now it has Obama winning (phew!). It will be interesting to look back and see how accurate it was in retrospect:

The Ballmer Peak

Google’s Chrome on OSX

Feeling left out of the fuss over Google’s new Chrome web browser because you use a Mac or Linux?  Well, I’m using a Mac and writing this blog post in Chrome thanks to this unofficial port of Chrome to OSX by Codeweavers as a demonstration of their tech for making it easier to port apps between operating systems.

Its not perfect, scrolling and dragging the window are a little wobbly, but its a good way to get a feel for the new browser.  Unfortunately word has it that it will be quite a while before there is an official port to OSX :-(

Twitter clone Yammer wins TC50

Am I the only one to be a little surprised by the fact that the winner of the TechCrunch 50 competition, basically “American Idol” for tech startups, was a spectacularly unimaginative Twitter clone? I mean, even the idea to build a Twitter clone has been done to death (*cough* Pownce *cough*)!

Silicon Valley is starting to remind me of Hollywood, increasingly devoid of fresh ideas, money is being pumped into “Its like the Care Bares Movie meets Robocop!”-style unimaginative combinations of well-worn ideas.

If I had a dollar for every “Its like X, but for the enterprise!” business plan, I could probably start a VC fund myself.

Weather in Texas is rarely dull

Matt Damon on Sarah Palin

I normally don’t look to actors for political insight, but I think Matt is spot on with this one. I could live with McCain as President, at least he’s no Bush, but Palin would be worse than Bush in almost every way. She is more of a religious-right ideologue (even Bush has come around somewhat to global warming, apparently she hasn’t). She has far less leadership experience than he did (Texas > Alaska), and my impression is that she has less foreign policy experience (which is hard).

A McCain Presidency would be a kick in the teeth, but a Palin Presidency is truly a frightening prospect and we can’t afford even the chance that it would happen.

Here is what Matt had to say:

OtherInbox has launched (and I’ve got free invites for you!)

My good friend Josh Baer launched OtherInbox on Monday. I’ve been fortunate enough to be a beta tester for the last few months, and I can honestly say that it is one of a very small number of web tools that I use every single day (along with Reddit, and Google Calendar).

Josh gave an amazing talk at TechCrunch 50 describing OtherInbox, unfortunately WordPress.com won’t let me embed the video (grrr!) – so you’ll need to go here and check it out.

But don’t forget to come back here afterwards because the first 25 people to click this link will get a free invite!

Oh, one last thing, be sure to go and vote for OtherInbox here, they are currently winning but have some stiff competition!