Posts

Showing posts with the label reference counting

Background reading on the reference counting vs tracing garbage collection debate

Image
Eight years ago I answered a question on Stack Overflow about the suitability of OCaml and Haskell for soft real-time work like visualization: " for real-time applications you will also want low pause times from the garbage collector. OCaml has a nice incremental collector that results in few pauses above 30ms but, IIRC, GHC has a stop-the-world collector that incurs arbitrarily-long pauses " My personal experience has always been that RAII in C++ incurs long pauses when using non-trivial data (i.e. nested, structured, collections of collections of collections, trees, graphs and so on), non-deferred reference counting has the same problem for the same reason, tracing garbage collectors like OCaml work beautifully but there are many notoriously bad tools like Java that have given tracing garbage collection a bad name. Now that I am revisiting this issue I am surprised to find many individuals and organisations repeating exactly the same experimental tests that I did and coming...

Does reference counting really use less memory than tracing garbage collection? Mathematica vs Swift vs OCaml vs F# on .NET and Mono

Image
Our previous post caused some controversy by questioning the validity of some commonly-held beliefs. Specifically, the beliefs that reference counting (RC) always requires less memory than tracing garbage collection (GC) and that tracing GCs require 3-4x more memory in order to perform adequately. We presented a benchmark written in Swift and OCaml and noted that the RC'd Swift implementation ran over 5x slower and required over 3x more memory than the tracing GC'd OCaml implementation. This observation disproves these beliefs in their strong form and even brings into question whether there is even any validity in the weak forms of those beliefs. After all, we have never seen any empirical evidence to support these beliefs in any form. We received a lot of criticism for that post. The vast majority of the criticism was not constructive but two valid points did arise. Firstly, although our result was anomalous it would be more compelling to see the benchmark repeated across a w...

Does reference counting really use less memory than tracing garbage collection? Swift vs OCaml

Image
The way software developers cling to folklore and avoid proper experimental testing of hypotheses really bothers me. I think perhaps the worst branch of computer science still riddled with myths and legends is memory management. Despite over half a century of research on garbage collection showing that reference counting is inferior to tracing garbage collections algorithms (even when almost all GC research restricts consideration to Java when much better algorithms have been known for years) there are still many people who claim otherwise. C++ developers obviously still believe that reference counted smart pointers are superior to tracing garbage collection but now other people are doing it too. People who should know better. This situation recently reared its ugly head again when Apple released a promising new programming language called Swift that shuns tracing garbage collection in favor of reference counting. Now, there are logical reasons for Apple to have chosen reference count...
Another deleted answer of mine from Stack Overflow: As I can see, smart pointers are used extensively in many real-world C++ projects. True but, objectively, the vast majority of code is now written in modern languages with tracing garbage collectors. Though some kind of smart pointers are obviously beneficial to support RAII and ownership transfers, there is also a trend of using shared pointers by default, as a way of "garbage collection", so that the programmer do not have to think about allocation that much. That's a bad idea because you still need to worry about cycles. Why are shared pointers more popular than integrating a proper garbage collector like Boehm GC? (Or do you agree at all, that they are more popular than actual GCs?) Oh wow, there are so many things wrong with your line of thinking: Boehm's GC is not a "proper" GC in any sense of the word. It is truly awful. It is conservative so it leaks and is inefficient by design. See: http://flying...

Memory management myths: promptness

People often assert that scope-based reference counting such as shared_ptr in C++ collects garbage “promptly” and some people define this as “collects at the earliest possible point”. For example, at the time of writing the Wikipedia page about garbage collection says: “ Compared to tracing garbage collection, reference counting guarantees that objects are destroyed as soon as they become unreachable … ”  – Wikipedia Similar claims can even be seen in published research such as the paper “ Down for the Count? Getting Reference Counting Back in the Ring ”: “Of the two fundamental algorithms on which the garbage collection literature is built, reference counting has lived in the shadow of tracing. It has a niche among language developers for whom either performance or completeness is not essential, and is unused by mature high performance systems, despite a number of intrinsic advantages such as promptness of recovery and dependence on local rather than global state.” – Blackburn...

Memory management myths: determinism

Although the vast majority of programmers have now migrated to garbage collected languages and will probably never go back, there are still a few clinging to manual memory management. In most cases, the continued use of manual memory management is for good reason but some of these people are perpetuating myths in an attempt to justify avoiding garbage collection. Determinism can be a genuinely good reason to stick with manual memory management and is practically important in memory-constrained embedded devices. However, C++ programs are not as deterministic as people sometimes claim and, in particular, thread-safe reference counting using  shared_ptr  is non-deterministic. Specifically, threads holding references to shared reference-counted objects race to perform the final decrement and the thread that wins the race is responsible for destruction.

Herb Sutter's favorite C++ 10-liner has a memory management bug

In a recently-posted video , Herb Sutter (a prominent C++ expert) describes his favorite C++ 10-liner as “a thread-safe reference-counted object cache”: shared_ptr<widget> get_widget(int id) {   static map<int, weak_ptr<widget>> cache;   static mutex m;   lock_guard<mutex> hold(m);   auto sp = cache[id].lock();   if (!sp) cache[id] = sp = load_widget(id);   return sp; } This example is very interesting. Firstly, it manages to pull in reference counting, weak references and a mutex which are all very rare in modern programming. Secondly, it contains a memory leak that is difficult to fix in C++ because APIs are burdened with memory management details and this API is incapable of expressing deterministic cleanup because there is no facility for a widget's destructor to remove its entry in the map. Finally, the correct name for this data structure is a concurrent weak dictionary , specifically one with weak values. You'll find corre...

How do reference counting and tracing garbage collection compare?

Reference counting works by counting the number of references to each object. When the reference count drops to zero the object is definitely unreachable and can be recycled. The first advantage is simplicity. The second advantage is that decrements can occur as locals fall out of scope so collection can be deterministic and predictable. The third advantage is that the asymptotic complexity of reference counting does not depend upon the total size of the heap. The first problem is that cycles keep reference counts above zero even when they are unreachable, so reference counting alone cannot handle the general case. The second problem with reference counting is that incrementing and decrementing a counter in an object every time a reference to it is copied or dropped is very computationally expensive because it incurs a cache miss even if nothing else in the object is read or written. The third problem is that multithreaded reference counting is non-deterministic because increments and ...

Real garbage collector characteristics

Image
The trade-offs between tracing and reference counting garbage collectors are nicely demonstrated by systems like F# and Mathematica. F# inherits garbage collection from .NET which uses a conventional generational tracing garbage collector with three generations and a Large Object Heap (LOH). Mathematica uses reference counting with language semantics that make it impossible to create cycles in the heap. The following Mathematica program creates a balanced binary tree 25 levels deep that contains 2 25 branches stores it in a variable and then mutates the variable back to the empty list: Because Mathematica uses reference counting, the act of resetting the variable back to the empty list decrements the reference count for the root of our tree back to zero, causing an avalanche of reference counts in the branches down to the leaves also being decremented back to zero, reclaiming all of the space that was consumed by the tree that is now unreachable. Moreover, Mathematica is serial so all...

Classifying garbage collection algorithms

Richard Jones' excellent new book about garbage collection has rekindled my fascination with this important subject. The Wikipedia page about GC is disappointingly poor quality so it is productive to review the main points here. GC algorithms are categorized into four main families with production garbage collectors often combining algorithms from different families. The families are copying, mark-sweep, mark-compact and reference counting. The first three are all tracing collectors that work by tracing all reachable values by starting from the global roots (global variables and locals held on the stack). Copying collectors allocate into one space and then "evacuate" the reachable heap blocks (called "survivors") into another space before clearing the space they came from. The Cheney semi-space algorithm is the simplest copying collector: it uses two spaces and copies from one to the other and back again. The advantage of copying collection is that many heap-...

Boost's shared_ptr up to 10× slower than OCaml's garbage collection

Image
Our recent post about the merits of accurate garbage collection over reference counting prompted Tezka to ask for measurements demonstrating the performance differences between an accurate GC and the reference counted shared_ptr smart pointers provided by the Boost C++ library. The benchmarks we have been using to prototype allocators for our HLVM project provide the perfect setting for such an experiment. We already have C++ code that solves the n -queens problem using logic programming with a variety of different allocation strategies. Adding another version of our benchmark using Boost's shared_ptr gave the following remarkable results: These new results highlight just how slow reference counting can be. The C++ code using Boost's reference counted smart pointers is running up to 10× slower than OCaml! Note also the new "stack" section that is the first C++ to beat OCaml, albeit cheating by exploiting the fact that this implementation of this benchmark always hap...

Why GC when you have reference counted smart pointers?

Reference counted smart pointers are a simple form of garbage collection usable from the C++ programming language. A recent question on Stack Exchange asks why anyone would want anything more when reference counted smart pointers are already available. Other forms of garbage collection (most notably tracing GCs) have several advantages over reference counting: Accuracy: Reference counting alone leaks cycles so reference counted smart pointers will leak memory in general unless other techniques are added to catch cycles. Once those techniques are added, reference counting's benefit of simplicity has vanished. Throughput: Smart pointers are one of the least efficient forms of garbage collection, particularly in the context of multi-threaded applications when reference counts are bumped atomically. There are advanced reference counting techniques designed to alleviate this but tracing GCs are still the algorithm of choice in production environments. Latency: Typical smart pointer ...