Directions in Future Languages - Software Transactional Memory (30 Jan 2005)

STM is a method of handling concurrency in multithreaded systems. The light at the end of this tunnel is that we wish to able to write the following:

def add-child(x):
  global children-by-name, children-by-id
    children-by-name[] = x
    children-by-id[] = x

At no point would any other thread see the child in one map, but not the other. Traditionally this would be done with locks. The maps themselves would have locks inside them, of course, and would thus be `thread safe', but we would need to implement our own locking in order to achieve atomicity between them. An STM says that, when you enter an atomic block, nothing happens until you exit it and then it all happens at once. If another thread altered children-by-name while you were processing the block above the atomic block would be aborted and attempted again.

Lock-free techniques:

The design I'm outlining is is from the lock-free group at Cambridge and a good explaination can be found in Keir Fraser's PhD dissertation. So why am I reiterating it here? Partly because I'm probably going need to write something like this for a report of my own at some point, but also because digging into a PhD can be tough work and these ideas deserve to be better known.

So here's an STM linked list:

The first thing to notice is that nothing holds a direct pointer to anything - they are all via STM object headers. The object header holds the true pointer to the data.

When starting a transaction, a transaction context must be created. This holds the status of the transaction (undecided at the moment) and two lists; a read list and a write list. In this STM design, objects must be `opened' - either for reading or for writing. When you open an object you get a pointer to the true data and that object is recorded in the context.

Thus you open the objects you need (say, opening for read each list element in turn until you find one to delete, thus opening the previous one for write). When an object is opened for write, a copy is made and a pointer to that is returned. Thus you don't edit `live' objects.

So assume that you're removing the last element from this list. Thus you've read the first element and altered the second element (to put a null pointer in its next field). Your memory now looks like this:

Since you are finished altering the list you commit the transaction. At this point we need to introduce a hardware level primitive that we'll be using. In the litrature it's called CAS (Compare and Swap) and, on Intel systems, it's called Compare and Exchange (cmpxchg). It takes three arguments: (old value, new value, memory location) and replaces the contents of memory location with new value if, and only if, its current value is old value, returning the contents of memory location at the end of the instruction - and it does this atomically.

So the CAS operation allows us to update a value in memory, assuming that someone hasn't already beaten us to it. A transaction commit uses this operation to change the pointer in the object headers of objects in the write list, to point to the transaction context instead. This is called `aquiring' the header and it is done in order of increasing memory address.

So, assuming that no other transaction is comming and has beaten us to it, our memory now looks like:

But what if another thread is commiting a conflicting transaction? In that case we'll find a pointer to its transaction context when we do a CAS and we recursively help. Since we have a pointer to its context, we have all the information we need in order to perform its commit ourselves, so we do so. This is to ensure that some progress is always being made by the program as a whole. Unfortunately, it also means that our transaction is void so we abort and try it again.

Assuming that we have accquired all the object headers that we are writing we have to check that none of the objects that we read have been updated in the mean time. If everything looks good we update the pointers in the object headers to point to our shadow blocks and release them - the transaction is complete:

(Note, this is a simplification - see the paper for the full details. Gray objects are now garbage.)

There are several other tricks which can be added to the above. The same Cambridge group has some ideas in one of their recent papers about how IO can be included in a tranaction. IO is, of course, a side effectful operation so clashes if we ever need to `roll-back' a transaction. The Cambridge group have implemented IO in Java which can rebuffer and be included in a transaction.

More cool ideas are presented in this paper (which is talking about a Haskell based STM). There the author presents a primitive called retry which aborts the current transaction and only retries is when an object in the read-list is updated. Thus a queue can be implemented like this:

class Q:
  def get(self):
    if self.length == 0:
    return self.q.pop()

Thus no more missed condition variables resulting in a stuck thread.