Automatic Memory Management

An example of an industry that needs automatic cleaning

«The unde­cid­ab­il­ity of live­ness is a corol­lary of the halt­ing problem»

Richard Jones, Antony Hosking, Eliot Moss; The Garbage Collection Handbook.

A point in memory — a cell — is said to be alive if it will be accessed at any time in the future by the mutat­or — any of the threads of exe­cu­tion of your pro­gram. Unfortunately, to decide the live­ness of a cell is equi­val­ent to decide wheth­er a pro­gram will halt: it is not pos­sible in the gen­er­al sense. We know since Turing and Church, it’s the Halting Problem, a corol­lary of the Entscheildungsproblem. Computer Sciences are full of things you will nev­er be able to do because of it, and live­ness is one of them.

But live­ness can be nicely approx­im­ated by a prop­erty that is decid­able: point­er reach­ab­il­ity1. An object N can be reached from anoth­er object M if it can be reached by fol­low­ing a chain of point­ers from M. Note that a pro­gram can have a ref­er­ence to an object that will nev­er access again: this is a case of a dead but reach­able object, which is undesir­able for reach­ab­il­ity to be a good approximation.

Graph. Reachability, a reflexive and transitive relation
A reflex­ive and trans­it­ive relation

It’s obvi­ous to notice that the set of all objects of a pro­gram, togeth­er with all point­ers in the pro­gram, form a dir­ec­ted graph: objects are nodes, and point­ers are the arrows. Reachability then induces a pre­order2 over this dir­ec­ted graph: in math­em­at­ic­al ter­min­o­logy, a bin­ary rela­tion between ele­ments of the set, that is spe­cially reflex­ive (a is reach­able from a), and trans­it­ive (if c is reach­able from b, and b is reach­able from a, the c is reach­able from a — just fol­low the point­ers). Moreover, this rela­tion­ship needs not be total (two dif­fer­ent objects need not be related, that is, not reach­able from each oth­er, it’s always finite (memory itself is a dis­crete set), and it can con­tain cycles.

Directed Disconnected Graph
Directed Disconnected Graph

Tracing Garbage Collection

We can do many things with this graph. The first thing to notice, as shown in the pic­ture above, is that it can have dis­con­nec­ted com­pon­ents. This is import­ant: if you can detect dis­con­nec­ted com­pon­ents, you may real­ize some of these com­pon­ents are not neces­sary for your pro­gram. For this, we need to define the root set: the set of objects of the graph at the heart of the pro­gram, those objects whose life­time is auto­mat­ic­ally determ­ined by the pro­gram itself. If you recall the memory seg­ments, where we had data, stack, and the heap, then the root is the data seg­ment plus the stack seg­ment. Then, any­thing that does­n’t belong to the con­nec­ted com­pon­ent where data and stack live, is for sure dead, and can be deallocated.

Data Stack Heap Disconnected Components
data has a stat­ic A, the stack has a ref­er­ence to B through A.getB(), and the graphs start­ing in Y and X are not reach­able from the pro­gram any more.

This is the idea behind the most pop­u­lar kind of garbage col­lec­tion: tra­cing col­lec­tion. This was cre­ated by McCarthy for his new­born lan­guage Lisp 3, in what was called the Mark & Sweep algorithm, which goes as follows:

[cc_cpp]
cell* new_cell()
{
cell* ref = allocate();
if (ref == nullptr)
{
Worklist

worklist;
mark(worklist);
sweep(worklist);
ref = allocate();
if (ref == nullptr)
throw exception(“Out of memory”);
}
return ref;
}
[/cc_cpp]

[cc_cpp]
void mark(Worklist

& work­list)
{
for (auto& ref : ROOTS)
{
Address addr = ref.address();
if (addr && !addr.is_Marked())
{
addr.set_Marked();
worklist.add(addr);
mark(worklist);
}
}
}
[/cc_cpp]

[cc_cpp]
void sweep(Worklist

worklist)
{
Address curs­or = Heap.start();
while (curs­or Heap.end()) { if (worklist.Contains(cursor)) worklist[cursor].unset_Marked(); else free(cursor); cursor.next(); } } [/cc_cpp]

The idea is to trace all the nodes reach­able from the roots, mark them as reach­able, and then trace the entire heap, and free everything that is not marked as reachable.

Reference Counting Garbage Collection

Another way to work with the memory graph, is by keep­ing track of “how much reach­able” a node is. That is, we can reduce our point of view, instead of to the entire graph, to indi­vidu­al nodes. Each node then needs to know if he’s still reach­able from some­where else, oth­er­wise he can die in peace. This way, the respons­ib­il­ity of deal­loc­a­tions is much more fine-grained: nodes are respons­ible for them­selves, instead of the entire heap act­ing as a node’s nanny. Nodes then need to keep track of how many oth­er nodes can see him: he needs to count the num­ber of ref­er­ences towards him.

This is an idea from Collins 4. In this view, each new ref­er­ence incre­ments the count, and each free decre­ments it: when the count reaches zero, the node is not reach­able any more, so it can be deal­loc­ated and decre­ment the counts of the nodes he can reach in one level. This requires a bar­ri­er, an extra step to be taken any time a ref­er­ence is copied. The algorithm goes like this:

[cc_cpp]
Cell* new_cell()
{
Cell* cell = allocate();
cell->ref_count = 0;
return cell;
}
[/cc_cpp]

[cc_cpp]
void free_cell(Cell* cell)
{
cell->ref_count = cell->ref_count – 1;
if (cell->ref_count == 0)
{
for (Cell* child : cell->Children())
free_cell(child);
free(cell);
}
}
[/cc_cpp]

[cc_cpp]
Cell& copy_cell(Cell& cell) //copy barrier
{
Cell* newCell = new Cell();
newCell->data = cell.data;
newCell->ref_count = cell.ref_count;
*(newCell->ref_count)++;
return newCell;
}
[/cc_cpp]

The func­tion copy_cell is auto­mat­ic­ally inser­ted by the frame­work, at com­pile-time or at run-time, any time a ref­er­ence to the heap is copied. This way, at all times we know how many times a cell is ref­er­enced. A cell is then a box with two indir­ec­ted con­tents: the value of the cell, and the ref­er­ence count. In copy­ing a ref­er­ence, we incre­ment a shared ref_count, and copy the point­ers to both the data and the counter.

Taking two worlds apart

When is one style pre­ferred over the oth­er? To answer this ques­tion, it is neces­sary to con­sider goals and ana­lyse trade-offs. For example, if you’ve ever heard any­thing about Garbage Collectors, it prob­ably is about the pause times. This is a prob­lem of tra­cing col­lect­ors, for which McCarthy was per­fectly aware: at a spe­cif­ic moment — in McCarthy’s mod­el, when the alloc­at­or runs out of memory, the col­lect­or stops the world and executes long pro­ced­ures to clean the heap; hence we say that Tracing col­lect­ors col­lect in Batch style, col­lec­tion is deferred until many things are to be col­lec­ted at once, incur­ring in notice­able pauses.

Analogously, ref­er­ence count­ing col­lects a node as soon as it becomes unreach­able, and it col­lects that node only: there’s no the­or­et­ic­al pause time, as the pause is the time it takes to col­lect one node — with the par­tic­u­lar case of maybe mak­ing anoth­er node unreach­able while col­lect­ing this node — only, and not the entire heap. This way, ref_count-ing is real-time, and incre­ment­al. But ref_count has two import­ant dis­ad­vant­ages: one is the write bar­ri­er, the fact that every copy requires updat­ing a ref_count, and to make it even more expens­ive, atom­ic­ally (just try to see the multi-thread­ing prob­lems if not), which lowers its over­all through­put. The second is per­haps even more pain­ful: ref_counting does not work on cycles!

Imagine the fol­low­ing scenario:

[cc_cpp]
// …
B* b = A.getB(); // – ref_count for B is 1 b->c = new C();
b->c->d = new D();
b->c->d->e = new E();
b->c->d->e->b2 = b; // – here, a write bar­ri­er gets injec­ted, and the ref_count for B is incre­men­ted to 2 // … delete (b->c); // – here, the ref_count for B is decre­men­ted to 1 // and.… [/cc_cpp]

Decrementing the ref_count of C doens’t deal­loc­ate any­thing, while the node is unreach­able: a memory leak!

See what hap­pens? That’s a problem.

Cycles are often solved by hav­ing a tra­cing col­lect­or run every once in a while — that’s how Python works it out —, or by care­ful hand­craft­ing of weak ref­er­ences, which mean there’s a second weak-ref_­count that does­n’t neces­sar­ily keep the object alive but might be pro­moted to a strong ref_count if the object is still there — that’s how smart point­ers in mod­ern C++ work.

One inter­est­ing thought to leave you with, in case none of this seems com­plic­ated and you want some more inter­est­ing chal­lenge, is the fol­low­ing: notice how, in the memory graph, tra­cing effect­ively com­putes the alive com­pon­ent, while ref_counting effect­ively com­putes its com­ple­ment, the dead com­pon­ent. We can ima­gine that tra­cing starts with all nodes“ ref_count set to zero, and com­putes their incre­ments, while ref_counting starts with a ref_count set to the real value and com­putes the decre­ments. See the par­al­lel? I’ll tell you a bit more about that next time.


  1. https://en.wikipedia.org/wiki/Reachability
  2. Another example of a pre­order is the sub­typ­ing rela­tion­ship: over the set of all types of a pro­gram, wheth­er A is a sub­type of B induces the same struc­ture. Preorders can also be seen as thin cat­egor­ies.
  3. John McCarthy, Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I; 1960
  4. George Collins, A Method for Overlapping and Erasure of Lists, 1960

One Reply to “Automatic Memory Management”

  1. […] saw the essen­tially two only ways to do Garbage Collection, and we saw their naive dis­ad­vant­ages. I say naive, because we saw the very simplistic […]

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.