«The undecidability of liveness is a corollary of the halting problem» — Richard Jones, Antony Hosking, Eliot Moss; The Garbage Collection Handbook.
A point in memory — a cell — is said to be alive if it will be accessed at any time in the future by the mutator — any of the threads of execution of your program. Unfortunately, deciding the liveness of a cell is equivalent to deciding whether a program will halt: it is not possible in general. We have known this since Turing and Church: it is the Halting Problem, a corollary of the Entscheidungsproblem. Computer science is full of things you will never be able to do because of it, and liveness is one of them.
But liveness can be nicely approximated by a property that is decidable: pointer reachability1. An object N can be reached from another object M if there is a chain of pointers starting at M that leads to N. Note that a program can hold a reference to an object that it will never access again: this is a case of a dead but reachable object, the price we pay for reachability being only an approximation of liveness.
It is easy to see that the set of all objects of a program, together with all pointers in the program, forms a directed graph: objects are the nodes, and pointers are the arrows. Reachability then induces a preorder2 over this directed graph: in mathematical terminology, a binary relation between elements of the set that is reflexive (a is reachable from a) and transitive (if c is reachable from b, and b is reachable from a, then c is reachable from a — just follow the pointers). Moreover, this relation need not be total (two distinct objects need not be reachable from each other), it is always finite (memory itself is a discrete set), and it can contain cycles.
Tracing Garbage Collection
We can do many things with this graph. The first thing to notice, as shown in the picture above, is that it can have disconnected components. This is important: if you can detect disconnected components, you may realize some of them are not necessary for your program. For this, we need to define the root set: the objects at the heart of the program, those whose lifetime is determined by the program itself. If you recall the memory segments — data, stack, and heap — then the roots are the data segment plus the stack segment. Anything that does not belong to the connected component where data and stack live is dead for sure, and can be deallocated.
This is the idea behind the most popular kind of garbage collection: tracing collection. It was created by McCarthy for his newborn language Lisp3, in what was called the Mark & Sweep algorithm, which goes as follows:
    cell* ref = allocate();
    if (ref == nullptr) {          // heap exhausted: collect, then retry
        mark(); sweep();
        ref = allocate();
        if (ref == nullptr)
            throw exception("Out of memory");
    }

    void mark() {                  // trace everything reachable from the roots
        for (auto& ref : ROOTS) {
            Address addr = ref.address();
            if (addr && !addr.is_marked())
                mark_recursively(addr);        // mark addr, then its children
        }
    }

    void sweep() {                 // walk the entire heap
        Address cursor = Heap.start();
        while (cursor < Heap.end()) {
            if (cursor.is_marked()) cursor.unmark();   // survivor: reset for next cycle
            else free(cursor);                         // unmarked: garbage
            cursor += cursor.size();
        }
    }
The idea is to trace all the nodes reachable from the roots, marking them as reachable, and then walk the entire heap, freeing everything that is not marked.
Reference Counting Garbage Collection
Another way to work with the memory graph is by keeping track of "how reachable" a node is. That is, we can narrow our point of view from the entire graph to individual nodes. Each node then needs to know whether it is still reachable from somewhere else; otherwise it can die in peace. This way, the responsibility for deallocation is much more fine-grained: nodes are responsible for themselves, instead of the entire heap acting as a node's nanny. Each node then needs to keep track of how many other nodes can see it: it needs to count the number of references towards it.
This is an idea from Collins4. In this view, each new reference increments the count, and each release decrements it: when the count reaches zero, the node is not reachable any more, so it can be deallocated, decrementing in turn the counts of the nodes it points to directly. This requires a barrier, an extra step taken any time a reference is copied. The algorithm goes like this:
    Cell* cell = allocate();
    cell->ref_count = 0;               // fresh cell: nothing references it yet

    void release(Cell* cell) {         // a reference to cell was dropped
        cell->ref_count = cell->ref_count - 1;
        if (cell->ref_count == 0) {    // unreachable: cascade, then free
            for (Cell* child : cell->Children())
                release(child);
            free(cell);
        }
    }

    Cell* copy_cell(Cell* cell) {      // injected at every reference copy
        cell->ref_count = cell->ref_count + 1;
        Cell* newCell = new Cell();
        newCell->data = cell->data;              // share the data...
        newCell->ref_count = cell->ref_count;    // ...and the (updated) counter
        return newCell;
    }
The function copy_cell is automatically inserted by the framework, at compile time or at run time, any time a reference to the heap is copied. This way, at all times we know how many times a cell is referenced. A cell is then a box with two indirected contents: the value of the cell, and the reference count. When copying a reference, we increment the shared ref_count, and copy the pointers to both the data and the counter.
Taking two worlds apart
When is one style preferred over the other? To answer this question, it is necessary to consider goals and analyse trade-offs. For example, if you have ever heard anything about garbage collectors, it was probably about pause times. This is a problem of tracing collectors, of which McCarthy was perfectly aware: at a specific moment — in McCarthy's model, when the allocator runs out of memory — the collector stops the world and executes long procedures to clean the heap. Hence we say that tracing collectors collect in batch style: collection is deferred until many things can be collected at once, incurring noticeable pauses.
Reference counting, in contrast, collects a node as soon as it becomes unreachable, and it collects that node only: there is no theoretical pause time, because the pause is just the time it takes to collect one node — plus whatever cascade of nodes becomes unreachable along with it — and never the entire heap. This way, ref_count-ing is real-time and incremental. But ref_counting has two important disadvantages. One is the write barrier: every copy requires updating a ref_count and, to make it even more expensive, updating it atomically (just try to picture the multi-threading problems if not), which lowers overall throughput. The second is perhaps even more painful: ref_counting does not work on cycles!
Imagine the following scenario:
B* b = A.getB(); // <-- ref_count for B is 1
b->c = new C();
b->c->d = new D();
b->c->d->e = new E();
b->c->d->e->b2 = b; // <-- here, a write barrier gets injected, and the ref_count for B is incremented to 2
delete (b->c); // <-- here, the ref_count for B is decremented to 1
See what happens? The cycle through b2 keeps B's count above zero. Had we instead dropped the external reference b first, B's count would only fall to 1 (held by e->b2), the counts of C, D and E would stay at 1 as well, and the whole cycle would never be deallocated. That's a problem.
Cycles are often solved by having a tracing collector run every once in a while — that is how Python works it out — or by careful handcrafting of weak references: a second, weak ref_count that does not keep the object alive by itself, but can be promoted to a strong ref_count if the object is still there — that is how smart pointers in modern C++ work.
One interesting thought to leave you with, in case none of this seems complicated and you want a more interesting challenge, is the following: notice how, in the memory graph, tracing effectively computes the live component, while ref_counting effectively computes its complement, the dead component. We can imagine that tracing starts with all nodes' ref_counts set to zero and computes the increments, while ref_counting starts with the ref_counts at their real values and computes the decrements. See the parallel? I'll tell you a bit more about that next time.
2. Another example of a preorder is the subtyping relationship: over the set of all types of a program, whether A is a subtype of B induces the same structure. Preorders can also be seen as thin categories.
3. John McCarthy, Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I, 1960.
4. George Collins, A Method for Overlapping and Erasure of Lists, 1960.