I once wrote an editorial, multi-core mania, on the conundrum of ever-increasing numbers of processor cores, but without the concurrent programming techniques to get anywhere near exploiting their performance potential. I came to the.controversial.conclusion that, while the problem loomed for all procedural languages, it was not a big issue for the vast majority of programmers. Two years later, I still think most programmers don’t concern themselves overly with this issue, but I do think that’s a bigger problem than I originally implied.
Firstly, is the performance boost from writing code that can fully exploit all available cores worth the cost of the additional programming complexity? Right now, with quad-core processors that, at best, can make our programs four times faster, the answer is still no for many applications. But what happens in a few years, as the number of cores grows to 100 or even 1000? At this point, it becomes very hard to ignore the potential gains from exploiting concurrency. Possibly, I was optimistic to assume that, by the time we have 100-core processors, and most applications really needed to exploit them, some technology would be around to allow us to do so with relative ease.
The ideal solution would be one that allows programmers to forget about the problem, in much the same way that garbage collection removed the need to worry too much about memory allocation. From all I can find on the topic, though, there is only a remote likelihood that we’ll ever have a compiler that takes a program written in a single-threaded style and “auto-magically” converts it into an efficient, correct, multi-threaded program.
At the same time, it seems clear that what is currently the most common solution, multi-threaded programming with shared memory, is unsustainable. As soon as a piece of state can be changed by a different thread of execution, the potential number of execution paths through your program grows exponentially with the number of threads. If you have two threads, each executing n instructions, then there are 2^n possible “interleavings” of those instructions. Of course, many of those interleavings will have identical behavior, but several won’t. Not only does this make understanding how a program works an order of magnitude harder, but it will also result in irreproducible, non-deterministic, bugs. And of course, the problem will be many times worse when you have a hundred or a thousand threads.
So what is the answer? All of the possible alternatives require a change in the way we write programs and, currently, seem to be plagued by performance issues. Software transactional memory (STM) applies the ideas of database transactions, and optimistic concurrency control, to memory. However, working out how to break down your program into sufficiently small transactions, so as to avoid contention issues, isn’t easy. Another approach is concurrency with actors, where instead of having threads share memory, each thread runs in complete isolation, and communicates with others by passing messages. It simplifies concurrent programs but still has performance issues, if the threads need to operate on the same large piece of data.
There are doubtless other possible solutions that I haven’t mentioned, and I would love to know to what extent you, as a developer, are considering the problem of multi-core concurrency, what solution you currently favor, and why.