
What is the cost of reference counting?

Reference counting is a common form of memory management. In some languages, like Python, it’s the natural way objects are tracked, and in others it’s in the standard library, such as shared_ptr in C++. Like all memory management, reference counting has some definite overhead costs. In this article I’ll show how basic reference counting works and explore the costs associated with it.

How reference counting works

When we talk about reference counting we’re talking about how memory is tracked, in particular when assigning a pointer from one variable to another:

a = b

In the above example a and b are pointers, references, or some other kind of shared variable type; the assignment results in a and b referring to the same object.

With classic C pointers this assignment is nothing more than a simple memory address copy: a basic mov operation. With reference counting, the variables hold wrappers around the “true” object, and each assignment involves incrementing the count of the new object and decrementing that of the old one. The pseudo-code for a reference counting assignment looks like this:

if (a != null) {
    if (--a.referenceCount == 0) {
        a.release()
    }
}

if (b != null) {
    b.referenceCount++
}
a = b

To allow unset pointers (nulls), we need this very defensive coding style. We can also see it’s the releasing of the previous reference, not the assignment of the new one, that involves most of the work.
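The pseudo-code above can be sketched as a minimal C++ handle type. This is illustrative only: the names `Counted` and `Ref` are mine, and a real implementation like `std::shared_ptr` also handles atomicity, move semantics, weak references, and custom deleters.

```cpp
#include <cassert>

// The object being shared, carrying its own count.
struct Counted {
    int referenceCount = 1;
    void release() { delete this; }
};

// A minimal reference-counted handle mirroring the pseudo-code.
class Ref {
    Counted* p_ = nullptr;
public:
    Ref() = default;
    explicit Ref(Counted* raw) : p_(raw) {}
    Ref(const Ref& other) : p_(other.p_) {
        if (p_) ++p_->referenceCount;
    }
    Ref& operator=(const Ref& other) {
        // Increment the new reference first so self-assignment is safe.
        if (other.p_) ++other.p_->referenceCount;
        // Release the previous reference; destroy on the last release.
        if (p_ && --p_->referenceCount == 0) p_->release();
        p_ = other.p_;
        return *this;
    }
    ~Ref() {
        if (p_ && --p_->referenceCount == 0) p_->release();
    }
    int count() const { return p_ ? p_->referenceCount : 0; }
};
```

Note the ordering in `operator=`: incrementing before decrementing sidesteps the self-assignment hazard a reader points out in the replies below.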

The costly concern

The call to release isn’t a new cost: it’s the destructor call we’d expect to happen at some point anyway. This is true even with manual management, like C’s malloc and free, or with a scanning GC; at some point we need to actually clean up the objects we’ve initialized. Having this call to release scattered throughout the code can, however, bloat its size, which can negatively impact performance.

The if statements also have some overhead. Both the code size increase and the actual comparison can have a minor impact on the code speed.

But it’s the increment and decrement that are by far the most troublesome. These aren’t just plain integer operations: they need to be synchronized operations, so that multiple threads have a consistent view of the count.

These kinds of instructions can result in cache flushing and waiting, and they also prevent several CPU optimizations (refer to CPU memory). It’s this aspect of modifying the reference count that has the highest cost.
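A small sketch of the difference, in C++ terms: a plain counter increment is an ordinary add the CPU can freely reorder, while a thread-safe reference count needs an atomic read-modify-write, which on x86 compiles to a lock-prefixed instruction that serializes memory traffic. The names here are illustrative.

```cpp
#include <atomic>

int plainCount = 0;                 // a plain integer counter
std::atomic<int> refCount{0};       // what a thread-safe refcount needs

// An ordinary add: cheap, freely reorderable by the CPU.
void plainIncrement() { ++plainCount; }

// An atomic read-modify-write: synchronized across threads.
void countedIncrement() { refCount.fetch_add(1, std::memory_order_relaxed); }
```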

How often though

If we actually had that overhead on every logical pointer assignment then reference counting might be slow. We can’t discount how good CPUs are, though: they’ve become very good at handling this kind of cross-thread synchronization without too much delay. So even with the reference counting overhead we’d still have a usable system.

But there’s a more important point: the vast majority of pointer assignments don’t actually need the reference counting overhead. Consider a simple function call: though it logically copies the pointer, it doesn’t need to modify the reference count.

defn cone = ( x : shared integer ) -> {
}

defn pine = -> {
    var y : shared integer
    cone(y)
}

When cone(y) is called the compiler knows that y must be a valid shared object (in Leaf the shared type creates a reference counted object). It doesn’t need to increment the count on y to call cone, since it knows the lifetime of y extends beyond the call. The parameter assignment ends up the same as it would be in C: just the raw pointer is copied.

Consider how this looks in C++ using shared_ptr. A function with the signature ( shared_ptr<integer> x ) requires the reference overhead on the parameter. But a function with the signature ( shared_ptr<integer> const & x ) has no reference counting overhead. Unfortunately, in C++ this form is not as flexible as the non-const shared_ptr, since shared_ptr is a library feature rather than a core language one.
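The two C++ signatures side by side (function names are mine): passing by value increments and decrements the count on every call, while passing by const reference copies nothing but a raw pointer, deferring any count change to wherever a copy is actually kept.

```cpp
#include <memory>

// Count bumped and dropped on every call.
int byValue(std::shared_ptr<int> x) { return *x; }

// No count change; only a raw pointer is passed.
int byConstRef(const std::shared_ptr<int>& x) { return *x; }
```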

Short-lived local assignments and certain kinds of transitive assignments don’t need to modify the reference counts either. Not all variables need to be reference counted at all: if an object’s lifetime is limited to a scope, it can be allocated on the stack without any count.
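In C++ this particular optimization is manual rather than compiler-driven, but the idea can be sketched as follows: when the lifetime is bounded by the enclosing scope, neither the heap allocation nor the count is needed.

```cpp
#include <memory>

// Heap-allocated and reference counted.
int viaShared() {
    auto v = std::make_shared<int>(42);
    return *v;
}

// Same value, stack-allocated: no allocation, no count.
int viaStack() {
    int v = 42;
    return v;
}
```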

In short

The primary cost of reference counting is the synchronized increment and decrement operations. These interfere with CPU pipelining and lead to delays. To reduce this cost, compilers optimize by eliminating reference counting in the many places where it’d be redundant.

8 replies

  1. … and that defined types cannot create referencing loops, like:
    class A{
    public B b;
    }

    class B{
    public A a;
    }

    The algorithm would then be really slow.

    • I’m not sure how this applies to the cost of reference counting. It is a limitation — though there are cycle detectors around, in D and Python I believe.

  2. You’ll want to addref before decref to handle self assignment.

In C++ one would normally use something like view_ptr if the function is lifetime neutral.

    • Yes, it makes sense to do the addref first. I wasn’t thinking too carefully of details I’m afraid, focusing just on the costs involved.

  3. I don’t think I’d ever accept `const std::shared_ptr&` as a parameter. If I know the function will just be doing operations on the object, it only needs to accept a `T*`. If the function plans on retaining ownership, it will accept `std::shared_ptr` by value, and I will simply move it into the right place.

• I know it’s a bulky parameter signature, but it is the correct one if you wish to imply the function “may” take a copy. Consider that the actual copy may not happen directly in that function, but somewhere further down the call chain. If each of them used a `std::shared_ptr` by value it would go through needless reference counting, whereas if a `shared_ptr const &` is used, only the place that actually takes the copy incurs the overhead.

In any case it’s bulky in C++. In Leaf, the `std::shared_ptr const &` behaviour is basically the standard way shared objects are passed (with no additional syntax).
