Multithreaded programming in a perfect environment can be frustrating. It becomes infuriating when dealing with a plethora of libraries each with their own notion of thread-safety. As concurrency has evolved so have the libraries, some more than others. Through a lot of trial, and lots of theory, we’ve arrived at a set of basic expectations of concurrent APIs.
Here we present those basic expectations. These rules should be followed by all libraries and deviations properly documented. They represent a sane basis for doing concurrent programming. Any library that does not uphold these requirements, or fails to note deviations, is simply not concurrency friendly or simply not thread-safe.
In this discussion we are going to use the term object to refer to any data created by the API. This may actually be a high level language object (a C-style struct, or a Java class) or merely an opaque handle which refers to data. There are other combined representations as well. Regardless of how it is referenced we assume the library exposes discrete data objects.
In this light the library must describe the relationship between its objects. There will be both independent and dependent objects. The requirements listed here focus primarily on dealing with independent objects; logically distinct data structures are the primary focus of concurrency. Note additionally that in some contexts dependents objects may be treated as independent, this facilitates things such as containers and object pools.
Example: A has table is a data object which contains several other data objects. At the level of the hash table itself, such as inserting and removing elements, all objects in the hash table are dependent. However, when working directly on the contained objects, they can be treated as independent of each other.
(These rules apply to both threading and multiple processes equally. They will however be described in terms of threads as that is how most people consider such concurrency.)
The library can be used from multiple threads. This is the absolute basic requirement of a threading friendly library. Such a library could yet hardly be called concurrent. The API must allow data operations on objects in different threads at the same time. This is what would be considered the actual minimum to expect from a thread friendly library. It may now be called concurrent as you can actually do data processing in parallel.
An API with only those two requirements would however be troublesome to use. On top of these we need that individual objects are usable from multiple threads. That is, if you create an object in one thread you must be able to use it in another thread as well. Without this ability you, as the user of the library, will be overly stressed trying to use the various objects.
Example: You are using a financial library and wish to do some bond calculations. You create a bond object in thread A and store it in a global table. Later you are in thread B and wish to use that bond object. This should be allowed.
As an aside, certain libraries offer ways to move objects between threads. While this is better than nothing, it is still problematic, especially if you can only push to another thread rather than pulling to your current thread. Some operating systems however have this basic limtiation when it comes to certain resources. Our definition extends to these OS modules as well: they aren’t concurrent, perhaps not even thread friendly.
Dependent Object Read-Write Safety
The standard rules for data access to dependent objects are multiple read safety and exclusive write safety. This equally applies to parallel access to an individaul object (perhaps the most common case of access).
The functions which do not modify an object (read-only functions) are expected to be completely thread-safe. You can call any number of different read-only functions on dependent objects from any number of threads at the same time.
Example: You have construced a string. Any number of threads should be able to read this string at the same time.
Functions which can modify an object require exclusive access to that object’s dependent group. The caller is expected to perform sufficient locking to ensure that none of its other threads are either reading or writing to the same object at the same time.
Example: An image is actually a collection of related data. You have the bits of the image itself, a color table, author information, print attributes, etc. All this sub-data is child data of the image itself. Therefore it is reasonable that exclusive access (for write) is required on the whole set of data (the image object) when modifying any of this related data.
Independent Object Read-Write Safety
Independent objects must be usable concurrently from multiple threads without restriction or concern for data races. This applies to both objects of distinct types and those of the same type.
Unrelated objects can be modified at the same time. For this reason it is vital the API describe it’s ownership rules and indicate which objects are related to each other. Some APIs may define a parent-child relationship and require that modifications to any children require an exclusive lock on the parent (thereby excluding all reading and writing from the parent and all of its children).
Example: Given a graphics processing library you create image A. The basic requirements state that any thread may read any properties of image A at the same time. If you wish to modify A however you are responsible for providing exclusive access. Now, if you create a second image B, it is assumed independent from image A. In this light you should be able to modify A in one thread and B in another thread at the same time.
Example: Containers usually have a special relationship with their children. The container itself is a single object, thus it requires exclusive access for modifications, including inserting and removing data. However, it is normally expected that so long as the container is not being modified, each member in the container is independent of the others.
Limited Thread Data and Cleanup
Though thread freedom allows objects to be used from multiple threads there is no prohibition on thread local data. Thread local structures can often be used to improve performance by reducing contention on shared data. A library is free to do so, on one condition however: when the thread exits the thread local data must be cleaned up. Simply put, as threads come and go in your program you should expect that the library will not be leaking resources. The space complexity of the library should be related to how many threads are using it now, and how many objects you have.
Related here is how much data, or how many resources, the thread local structures retain. A user of a library should be safe in expecting that only a sensible amount of memory is being used and that no scarce resources are cached per thread. A library which does not uphold this ideal will essentially violate the thread freedom requirement.
Example: You have an audio library for playing and mixing sounds. Some sound cards have on-card memory which can be used for sample playback. This memory is usually quite limited. If each thread in which the libary was used allocated some of this memory for its private use, you’d find it quickly exhausted — and indeed probably sitting idle in threads which aren’t currently using the sound library.
Different languages have different ways of reporting error values, in some cases these cannot be returned directly as a result of calling a function. For example, a lot of POSIX functions return a handle-like object, where a 0, or -1, indicates an error has occurred. To actually determine what error has occurred another function, or a global variable (like errno) needs to be consulted.
Such mechanisms must be thread-safe. One thread must be able to call the primary function and then consult the return value without having another thread overwrite that value. In practical terms that means this return state must be thread local.
Example: You call the fopen function to open a file. If file cannot be opened a null value is returned and errno set to indicate what happened. Now, between this call and reading errno another thread also calls a function which sets errno. It is expected that this second call does not disrupt the value of errno in the first thread. That is, each thread has its own unique copy of errno.
External State Defines Requirements
All data objects have an externally visible state tied to some internal state. The user of a library is oblivious to this internal state: they work only with the external view. This gives libraries a lot of implementation flexibility. Since the user does not see this internal model, the concurrency requirements must be described and met according to the externally visible state.
Example: If you have string A and then create a copy of it to produce string B. You should now be able to assume that these are two independent objects with respect to the API. Thus you should be able to modify both objects at the same time from different threads. While this may seem obvious, it has very significant implications for libraries which wish to use copy-on-write semantics.
So far we’ve not talked much about low-level data visibility. It is probably too deep for this article, but we must mention it briefly. The standard language rules on visibility apply to all the requirements here. The programmer is expected to create a proper series of happens-before operations and ensure visibility of his objects between threads. This is mentioned in this section since the user will only be ensuring visibility on the external view of the objects. It is vital for the libraries internal state to at least match, or exceed, the user expected visibility.
The rules about thread freedom and data safety are really only useful if the processing can actually be done concurrently. That is, writing to two independent objects can actually make progress on two distinct processors at the same time. It doesn’t help anybody if the first write simply locks the entire library until complete, at which point the second write proceeds.
This is perhaps not a fundamental requirement, but a reasonable expectation of a quality library. The other requirements should not create a lot of resource contention between threads. That is, each call to the library should be able to make progress without waiting on other calls to the library.
Certain APIs will require shared data however, and thus some contention will exist. So what exactly low contention means is a quality issue. On the ideal side we have libraries with zero contention, these tend to be the stateless libraries which actually don’t have shared data. On the really bad side we have libraries which lock on every call and are essentially single-threaded.
Extensions and Exceptions
These requirements are provided as a basic set that a sane library should be providing. If you observe that many libraries don’t meet these requirements you’d be correct. You should also note that many libraries are very difficult to use in a threaded program, some libraries are close to impossible to use correctly, and others end up needing a dedicated and exclusive thread.
A basic set of requirements is still useful when dealing with such libraries. It gives us the basic expectations from which we can note deviations. Rather than just saying the library is not thread-safe we can now indicate specific guarantees which are not met by the library, or which specific limitations are in place.
Other libraries will extend the requirements. The most common extension is creating an active self-locking object. Such objects can be modified by any number of threads at the same time as the library itself will do all the necessary locking. In some contexts this is actually the preferred behaviour, whereas in others it is a significant performance impediment. Again here however, describing a deviation from the expected requirements helps users of the library understand.