<?xml version="1.0" encoding="UTF-8" ?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB"><title type="html">Simon Cooper</title><subtitle type="html" /><id>http://www.simple-talk.com/community/blogs/simonc/atom.aspx</id><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/default.aspx" /><link rel="self" type="application/atom+xml" href="http://www.simple-talk.com/community/blogs/simonc/atom.aspx" /><generator uri="http://communityserver.org" version="2.0.60217.2664">Community Server</generator><updated>2011-06-15T17:53:00Z</updated><entry><title>Inside the ConcurrentCollections: ConcurrentQueue</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/24/105658.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/24/105658.aspx</id><published>2012-01-24T22:03:00Z</published><updated>2012-01-24T22:03:00Z</updated><content type="html">&lt;p&gt;&lt;code&gt;ConcurrentQueue&lt;/code&gt; is, like &lt;code&gt;ConcurrentStack&lt;/code&gt;, a lockless collection, in that it is implemented without using any locks at all. However, the semantics required for a queue impose a quite different approach; unlike &lt;code&gt;ConcurrentStack&lt;/code&gt;, which has a single point of concurrent contention, a queue can be changed at both the head and tail. This means that at least two variables are involved, most likely more. The simple approach of atomically modifying a single variable won't work here.&lt;/p&gt;

&lt;h4&gt;What does &lt;code&gt;System.Collections.Generic.Queue&lt;/code&gt; do?&lt;/h4&gt;

&lt;p&gt;Well, let's have a look at the non-concurrent queue. This is implemented as a &lt;a href="http://en.wikipedia.org/wiki/Circular_buffer"&gt;circular buffer&lt;/a&gt;. However, this design is very hard to use in a lockless way; when a new item is enqueued and the backing array is full, the array is doubled in size to accommodate the new item.&lt;/p&gt;

&lt;p&gt;To resize the array, the entire contents of the queue has to be read sequentially without using locks and copied into a brand new array by a single thread, during which other threads can enqueue and dequeue items many times. This is very tricky to implement without being subject to race conditions and still ensure high performance.&lt;/p&gt;

&lt;p&gt;So, instead, ConcurrentQueue forgoes the circular buffer, and uses a linear array instead. More specifically, it needs to be a linear array with a (conceptually) infinite capacity - although there will only be a finite number of items stored in the queue, items can be enqueued and dequeued an unlimited number of times. Such an array can't be created as a single object.&lt;/p&gt;

&lt;p&gt;However, you don't need to keep the entire infinite array in memory, you only need the section of the array that is actually storing items. By splitting the array into finite segments you can create and discard segments as items get enqueued and dequeued to the queue. This is what &lt;code&gt;ConcurrentQueue&lt;/code&gt; does.&lt;/p&gt;

&lt;h4&gt;Queue Segments&lt;/h4&gt;

&lt;p&gt;To demonstrate, I'll be using a queue with a segment size of 4. To start off with, three items are enqueued. These are added to the leftmost available slot of the tail segment:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentQueue1.png"&gt;

&lt;p&gt;Three more items are then enqueued. A new segment is created and assigned to the tail:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentQueue2.png"&gt;

&lt;p&gt;And another three:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentQueue3.png"&gt;

&lt;p&gt;Five items are then dequeued from the head. The head segment is now empty, so it is discarded:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentQueue4.png"&gt;

&lt;p&gt;Using this method, the illusion of an infinite linear array can be created using a finite number of segments.&lt;/p&gt;

&lt;h4&gt;Achieving thread-safety&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;ConcurrentQueue&lt;/code&gt; uses a segment size of 32, and along with the backing array each segment stores the index items are to be added (&lt;code&gt;m_high&lt;/code&gt;) and where they are to be removed (&lt;code&gt;m_low&lt;/code&gt;):

&lt;/p&gt;&lt;pre&gt;private class Segment {
    volatile T[] m_array;
    volatile int m_high;
    volatile int m_low;
}&lt;/pre&gt;

&lt;p&gt;Thread-safety, in this case, depends on the operation of both enqueuing and dequeuing. So we'll look at the operations in isolation, then put them both together.&lt;/p&gt;

&lt;h4&gt;Enqueuing&lt;/h4&gt;

&lt;p&gt;To enqueue an item, you need to:
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

The key to this is step 1. If step 1 can be done atomically, returning the new incremented value, that gives each thread trying to enqueue an item at the same time separate slots. Step 2 can then be performed concurrently by each thread without having to worry about conflicts. &lt;code&gt;Interlocked.Increment&lt;/code&gt; performs this increment atomically:&lt;p&gt;&lt;/p&gt;

&lt;pre&gt;public void Enqueue(T item) {
    int insertIndex = Interlocked.Increment(ref m_high);
    m_array[insertIndex] = item;
}&lt;/pre&gt;

&lt;p&gt;That's it! Simple, really...&lt;/p&gt;

&lt;h4&gt;Dequeuing&lt;/h4&gt;
&lt;p&gt;Dequeuing acts in a similar way to enqueuing:

&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_low&lt;/code&gt; variable. The previous index is the index of the item to be dequeued.&lt;/li&gt;
&lt;li&gt;Return the item stored at the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

And, similarly, if each thread trying to dequeue an item can be atomically 'assigned' a slot by performing step 1 using &lt;code&gt;Interlocked.Increment&lt;/code&gt;, then step 2 can be performed concurrently by several threads:

&lt;pre&gt;public T Dequeue() {
    // Increment returns the incremented value of the variable
    int index = Interlocked.Increment(ref m_low);
    return m_array[index-1];
}&lt;/pre&gt;

&lt;p&gt;But hang on, if we're dequeuing items, we need to check whether the queue is empty first. So checks need to be added to determine when a queue is empty. This depends on the segment state that represents an empty segment. In the case of &lt;code&gt;ConcurrentQueue&lt;/code&gt;, this state is when &lt;code&gt;m_low &amp;gt; m_high&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The exact reasons for this specific state representing an empty segment are a gory implementation detail; if you're wondering the reasons for this, then start by working out exactly what &lt;code&gt;m_high&lt;/code&gt; and &lt;code&gt;m_low&lt;/code&gt; represent.&lt;/em&gt;&lt;/p&gt;

&lt;pre&gt;public bool TryDequeue(out T item) {
    if (m_low &amp;gt; m_high) {
        item = default(T);
        return false;
    }
    
    int index = Interlocked.Increment(ref m_low);
    item = array[index-1];
    return true;
}&lt;/pre&gt;

&lt;p&gt;All good? Well, no. We've actually introduced a subtle race condition. If we've got a queue with one item in it, and two threads trying to dequeue that one item, then if both threads perform the &lt;code&gt;if (m_low &amp;gt; m_high)&lt;/code&gt; check before &lt;code&gt;m_low&lt;/code&gt; has been incremented, one will dequeue the item and one will successfully dequeue on an empty queue.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;ConcurrentStack&lt;/code&gt;, a similar race condition was solved by using &lt;code&gt;Interlocked.CompareExchange&lt;/code&gt;; taking a snapshot, doing some work, then atomically making that work visible to other threads only if the state is still valid. By judicious use of local variables and loops, we can do the same thing here:

&lt;/p&gt;&lt;pre&gt;public bool TryDequeue(out T item) {
    int index;
    do {
        index = m_low;
        
        // check if the queue is empty
        if (index &amp;gt; m_high) {
            item = default(T);
            return false;
        }
        
        // I think m_low has the same value as index. If it is, replace it with index+1.
        // Return to me what m_low actually was.
        if (Interlocked.CompareExchange(ref m_low, index+1, index) == index) {
            // success - index is now a valid index to dequeue, specific to this thread
            item = m_array[index]
            return true;
        }
        
        // m_low has been changed in the meantime. Go back to the start and try again.
    }
    while (true);
}&lt;/pre&gt;

&lt;p&gt;In this case, &lt;code&gt;CompareExchange&lt;/code&gt; is acting as a conditional Increment.

&lt;/p&gt;&lt;h4&gt;Combining the two&lt;/h4&gt;

&lt;p&gt;This is where it gets complicated. The most obvious issue is another race condition, between enqueuing and dequeuing. To reiterate, the steps performed are:&lt;/p&gt;

&lt;strong&gt;Enqueuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;strong&gt;Dequeuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Check if the queue is empty (&lt;code&gt;m_low &amp;gt; m_high&lt;/code&gt;). If it is, return false.&lt;/li&gt;
&lt;li&gt;Increment &lt;code&gt;m_low&lt;/code&gt; (if &lt;code&gt;m_low&lt;/code&gt; hasn't changed) and return the item.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Start with an empty queue. Thread 1 executes step 1 of enqueuing; &lt;code&gt;m_high&lt;/code&gt; is incremented. Thread 2 then tries to dequeue an item; step 1 of dequeuing thinks the queue has items because &lt;code&gt;m_low &amp;lt;= m_high&lt;/code&gt;. Step 2 of dequeuing then executes, and the item dequeued is the null value, because thread 1 hasn't yet assigned the enqueued item to the backing array.&lt;/p&gt;

&lt;p&gt;Hmm, how to solve this? There's more than one variable involved here; the race condition involves the relationship between &lt;em&gt;two&lt;/em&gt; variables, &lt;code&gt;m_high&lt;/code&gt; and &lt;code&gt;m_low&lt;/code&gt;, not a single variable being changed behind the scenes, which was the problem solved previously using &lt;code&gt;CompareExchange&lt;/code&gt; and loops.&lt;/p&gt;

&lt;p&gt;Well, we can't swap round the order of operations for enqueuing, because step 1 is what atomically assigns a slot when several threads are enqueuing items at once.&lt;/p&gt;

&lt;p&gt;What we need is some sort of notification that enqueuing has finished assigned the item to the backing array. Then the dequeuing thread can wait for that notification before dequeuing the item from the queue:&lt;/p&gt;

&lt;strong&gt;Enqueuing&lt;/strong&gt;
&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Set state indicating the item at that index has been enqueued.&lt;/li&gt;
&lt;/ol&gt;

&lt;strong&gt;Dequeuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Check if the queue is empty (&lt;code&gt;m_high &amp;gt;= m_low&lt;/code&gt;). If it is, return false.&lt;/li&gt;
&lt;li&gt;Increment &lt;code&gt;m_low&lt;/code&gt; (if &lt;code&gt;m_low&lt;/code&gt; hasn't changed).&lt;/li&gt;
&lt;li&gt;Wait for the state to be set at the index to dequeue.&lt;/li&gt;
&lt;li&gt;Return the item.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is the role played by the &lt;code&gt;m_state&lt;/code&gt; array in &lt;code&gt;ConcurrentQueue.Segment&lt;/code&gt;. Each segment has an &lt;code&gt;int[] m_state&lt;/code&gt; alongside the &lt;code&gt;T[] m_array&lt;/code&gt;; a 1 in a particular slot indicates the item has been set, and a 0 (the default value) indicates it isn't set. This turns the Enqueue method into this:

&lt;/p&gt;&lt;pre&gt;public void Enqueue(T item) {
    int insertIndex = Interlocked.Increment(ref m_high);
    m_array[insertIndex] = item;
    m_state[insertIndex] = 1;
}&lt;/pre&gt;

and TryDequeue into this:

&lt;pre&gt;public bool TryDequeue(out T item) {
    int index;
    do {
        index = m_low;
        
        if (index &amp;gt; m_high) {
            item = default(T);
            return false;
        }
        
        if (Interlocked.CompareExchange(ref m_low, index+1, index) == index) {
            // busywait until m_state[index] is set
            while (m_state[index] == 0) {}
            
            item = m_array[index]
            return true;
        }
    }
    while (true);
}&lt;/pre&gt;

&lt;p&gt;By marking both &lt;code&gt;m_array&lt;/code&gt; and &lt;code&gt;m_state&lt;/code&gt; as volatile, this stops the JIT reordering accesses to those arrays, ensuring that the invariant implicit in this code (&lt;code&gt;m_state[index] == 1&lt;/code&gt; if and only if &lt;code&gt;m_array[index]&lt;/code&gt; has an item in it) doesn't get broken by the JIT or processors reordering instructions.&lt;/p&gt;

&lt;p&gt;Tweaking the control flow a bit, and sprinkling calls to &lt;code&gt;SpinWait.SpinOnce()&lt;/code&gt; around as the abstraction of a busywait, and you've got the code of the enqueue and dequeue methods of &lt;code&gt;ConcurrentQueue&lt;/code&gt;.

&lt;/p&gt;&lt;h4&gt;Final notes&lt;/h4&gt;

&lt;p&gt;There's a couple of things I need to briefly mention. Firstly, I haven't yet covered growing the queue. Creating new segments when the current segment gets full is largely handled by enqueuing. When an item is added to the last slot in a segment, the Enqueue method creates a new segment and assigns it as the next segment in the queue using the &lt;code&gt;m_next&lt;/code&gt; variable. When the dequeue method reaches the end of the current segment, it waits for the &lt;code&gt;m_next&lt;/code&gt; variable to be set before looking inside that segment to see if there's an item to dequeue.&lt;/p&gt;

&lt;p&gt;Secondly, the eagle-eyed among you might have noticed that dequeued items don't get removed from the backing array; they are still referenced from the array when items are dequeued. I can't see any specific threading reason for this, other than giving one less location for possible threading issues. However, more importantly, this guarantees that the invariant I mentioned above is never be broken, which can be important if you're reasoning about the collection.&lt;/p&gt;

&lt;p&gt;Keeping references around longer than they need isn't so big an issue; the items referenced in the array are dereferenced all together when the containing segment is dereferenced. However it is something to bear in mind if you have large objects referenced in a &lt;code&gt;ConcurrentQueue&lt;/code&gt; for a long time; they might not be garbage collected when they otherwise could be.&lt;/p&gt;

&lt;h4&gt;Concurrent principles&lt;/h4&gt;

&lt;p&gt;So what principles can we extract from this examination of &lt;code&gt;ConcurrentQueue&lt;/code&gt;?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;h4&gt;Separating threads&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;Increment&lt;/code&gt; and &lt;code&gt;CompareExchange&lt;/code&gt; are used to atomically assign each thread performing an action a separate slot. Each thread can then do work on that slot concurrently without conflicting with each other.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;Invariants&lt;/h4&gt;
&lt;p&gt;An invariant is used to eliminate a race condition between enqueuing and dequeuing, in this case, &lt;code&gt;m_state[index] == 1&lt;/code&gt; iff &lt;code&gt;m_array[index]&lt;/code&gt; has an item in it. This ensures enqueuing and dequeuing on the same slot works as it should. Note that this invariant is maintained even at the cost of keeping things in memory longer than they should.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;&lt;code&gt;volatile&lt;/code&gt; used as a memory barrier&lt;/h4&gt;
&lt;p&gt;Furthermore, &lt;code&gt;volatile&lt;/code&gt; is used to add memory barriers ensuring the JIT doesn't reorder &lt;code&gt;m_state&lt;/code&gt; and &lt;code&gt;m_array&lt;/code&gt; array accesses and inadvertantly break the invariant.&lt;/p&gt;
&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;Well, that's the core of the &lt;code&gt;ConcurrentQueue&lt;/code&gt; class. I haven't even begun to cover the supporting methods, and glossed over most of the gory details. These I leave for you to explore and figure out. It's time to move on! We finally get some locks involved, and peer under the covers at &lt;code&gt;ConcurrentDictionary&lt;/code&gt;.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=105658" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - Experimental Results</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/17/105506.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/17/105506.aspx</id><published>2012-01-17T17:16:00Z</published><updated>2012-01-17T17:16:00Z</updated><content type="html">&lt;p&gt;As a brief interlude from my Concurrent Collections series, I thought I would give an roundup of how the lean startup experiments have been progressing. As you can expect, there's been some good aspects and some bad aspects.&lt;/p&gt;

&lt;h4&gt;The experiments so far&lt;/h4&gt;
&lt;p&gt;After lots of discussions, arguments, posing and ruling out hypotheses, we came up with two 'starter' hypotheses we could test fairly easily:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Customers don't agree to send data on how they use SmartAssembly; which features they use, the versions of .NET they have installed, etc, because the consent dialog doesn't make it clear the data is all anonymous and aggregated.
&lt;p&gt;This is a prequisite for further experimentation. SmartAssembly isn't a webapp, with google analytics and web logs telling us everything. In order to conduct experiments on SmartAssembly, we need to know how customers are using it once it is installed on their machines, and they need to confirm it's ok for us to collect that data. Our current acceptance rate for usage reporting on SmartAssembly itself is quite low, so we needed to improve this for future experiments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Customers aren't using feature usage reporting and error reporting within their own applications because those options are swamped amongst all the obfuscation options.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Experiment 1&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;Hypothesis:&lt;/strong&gt; Customers don't agree to send data on how they use SmartAssembly because the dialog asking for consent doesn't make it clear the data is all anonymous&lt;/p&gt;
&lt;p&gt;The experiment for this is pretty obvious - improve the wording on the consent dialog. This change was applied to SmartAssembly 6.5. We would compare signup rates with 6.2 to see what effect, if any, this change would have.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Inconclusive&lt;/p&gt;
&lt;p&gt;We found after 6.5 had been released that we weren't collecting the right data from our download and install process to be able to accurately calculate an acceptance percentage. One of the quirks of the existing feature usage instrumentation is that the answer users give to the consent dialog is used for all future versions of the product with the same major version.&lt;/p&gt;
&lt;p&gt;Since 6.5 is a minor version upgrade from 6.2, this means we couldn't differentiate between an existing customer downloading to upgrade from &amp;lt;6.2 to 6.5 (who wouldn't be presented with the new consent dialog), and a new user downloading for the first time. The data we collected couldn't be interpreted one way or the other; there were too many other variables.&lt;/p&gt;

&lt;h4&gt;Experiment 2&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;Hypothesis:&lt;/strong&gt; Customers aren't using feature usage reporting and error reporting within their own applications because those options are swamped amongst all the obfuscation options&lt;/p&gt;
&lt;p&gt;To perform this experiment, we produced a version of SmartAssembly that only had merging, signing, feature usage and error reporting available. We only wanted to present this version to customers whom we knew were downloading SmartAssembly for the error and/or feature usage reporting. The only place we could reasonably guarantee this was a reporting-specific landing page. So, the download link on that page would be A/B tested between the reporting-only and standard version of SmartAssembly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Not enough data&lt;/p&gt;
&lt;p&gt;Our limited scope for this test - one specific landing page amongst the many pages on SmartAssembly where customers could download - meant we got very few people downloading the reporting-only version of SmartAssembly; not enough for any conclusion to be drawn one way or the other.&lt;/p&gt;
&lt;p&gt;We had also added a 'cop-out' link on the download page so people could guarantee to get the standard version, if they did happen to download the reporting-only version and wonder where all the obfuscation went. We suspect a lot of users clicked this link; unfortunately, it was untracked, so we don't really know.&lt;/p&gt;

&lt;h4&gt;It's not all bad news...&lt;/h4&gt;
&lt;p&gt;So, the first experiments we performed didn't really work. However, that doesn't mean they weren't useful. We worked through a lot of the infrastructure issues that restricted our experimentation, and, most importantly, we learnt things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Make sure you know what the experiment is for, and what data you will collect to validate or refute your hypotheses. Check the data you will be collecting will make sense.&lt;/li&gt;
&lt;li&gt;For an A/B test, ensure you will have a sensible number of users on both sides to able to draw conclusions.&lt;/li&gt;
&lt;li&gt;Don't allow users to opt-out of experimental builds, even if it's not marketed as an 'experimental' version. Don't blur the boundaries of experimental and non-experimental builds.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;So what now?&lt;/h4&gt;
&lt;p&gt;We've decided we're going down the wrong route. We shouldn't be trying to target existing users of SmartAssembly, who are looking primarily for an obfuscation tool, we need to target new users. We've decided to pivot and create a new product, a new website, and a new analytics platform. If you're interested, there are samples of the kind of data you could have on your applications at &lt;a href="http://www.applicationmetrics.com"&gt;http://www.applicationmetrics.com&lt;/a&gt;, as well as a sign up to receive more information and take part in our experiments.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=105506" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside the Concurrent Collections: ConcurrentStack</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/12/105370.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/12/105370.aspx</id><published>2012-01-12T12:35:00Z</published><updated>2012-01-12T12:35:00Z</updated><content type="html">&lt;p&gt;The first concurrent collection we'll look at is &lt;code&gt;ConcurrentStack&lt;/code&gt;. This is conceptually the same as &lt;code&gt;System.Collections.Generic.Stack&lt;/code&gt;, but is geared towards occasional concurrent modifications.&lt;/p&gt;

&lt;p&gt;Now, in these posts I won't be looking to explain what every method does; just like my other explorations of various collection types, I'll be concentrating on the core implementation and concepts used by the collections, and glossing over the more sordid implementation details and scaffolding required to support an &lt;code&gt;ICollection&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So, without further ado, let's get started:&lt;/p&gt;

&lt;h4&gt;ConcurrentStack&lt;/h4&gt;
&lt;p&gt;There are three basic operations you can perform on a stack:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;Push&lt;/code&gt;: Push a new value onto the top of the stack&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TryPeek&lt;/code&gt;: Peeking at the top value. In &lt;code&gt;ConcurrentStack&lt;/code&gt;, this is quite a simple method.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TryPop&lt;/code&gt;: Popping the top value off the stack&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;ConcurrentStack supplements these with &lt;code&gt;PushRange&lt;/code&gt; and &lt;code&gt;TryPopRange&lt;/code&gt; for pushing and popping an array of values, &lt;code&gt;ToArray&lt;/code&gt;, and the standard &lt;code&gt;ICollection&lt;/code&gt; methods.&lt;/p&gt;

&lt;p&gt;The stack is implemented as a singly-linked list, with nodes represented by the &lt;code&gt;ConcurrentStack&amp;lt;T&amp;gt;.Node&lt;/code&gt; private class:
&lt;/p&gt;&lt;pre&gt;private class Node {
    internal T m_Value;
    internal Node m_Next;
}&lt;/pre&gt;
&lt;p&gt;The top of the stack is referenced by the volatile &lt;code&gt;m_head&lt;/code&gt; variable. An empty stack is represented by a null value in &lt;code&gt;m_head&lt;/code&gt;. For example, if you push the integers 1, 2 and 3 onto the stack in that order, the nodes will look like this:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack1.png"&gt;

&lt;p&gt;If you then pop a single value off the stack, &lt;code&gt;m_head&lt;/code&gt; will point at the '2' node.&lt;/p&gt;

&lt;h4&gt;Pushing values&lt;/h4&gt;
&lt;p&gt;So, there are three operations required to push a new value onto the stack:
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Create a new &lt;code&gt;Node&lt;/code&gt; object to hold the new value&lt;/li&gt;
&lt;li&gt;Setting the node's &lt;code&gt;m_next&lt;/code&gt; variable to point to the current head node&lt;/li&gt;
&lt;li&gt;Changing &lt;code&gt;m_head&lt;/code&gt; to point to the newly-created node.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And, straight away, we've got the possibility of a race condition. Say Thread 1 is pushing the value '4' onto the stack. It's just executed step 2:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack4.png"&gt;

&lt;p&gt;Now, Thread 2 is scheduled on the CPU and is pushing the value '5'. Thread 2 executes steps 1, 2 and 3:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack5.png"&gt;

&lt;p&gt;Thread 1 then executes step 3:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack6.png"&gt;

&lt;p&gt;The node successfully pushed by Thread 2 has disappeared from the stack.&lt;/p&gt;

&lt;p&gt;So, what we want to do is perform step 3 &lt;em&gt;only if the head hasn't changed in the meantime&lt;/em&gt;. If it has been changed, then go back to step 2 to use the updated head node as the value of &lt;code&gt;m_next&lt;/code&gt;. Furthermore, this all has to be done atomically so other threads can't see the intermediate state.&lt;/p&gt;

&lt;p&gt;This is where &lt;code&gt;Interlocked.CompareExchange&lt;/code&gt; comes into play. If you recall my previous post, &lt;code&gt;CompareExchange&lt;/code&gt; performs the following as an atomic operation:
&lt;/p&gt;&lt;pre&gt;public static T CompareExchange&amp;lt;T&amp;gt;(ref T location, T value, T comparand)
  where T : class {
    T origValue = location;
    if (location == comparand)
        location = value;
    return origValue;
}&lt;/pre&gt;
Or, in words:
&lt;p&gt;I think &lt;code&gt;location&lt;/code&gt; has the same value as &lt;code&gt;comparand&lt;/code&gt;. If it is, replace it with &lt;code&gt;value&lt;/code&gt;. Return me what &lt;code&gt;location&lt;/code&gt; actually is.&lt;/p&gt;

&lt;p&gt;This allows us to do the 'check and assign a new head, else try again' operation atomically, like so:&lt;/p&gt;
&lt;pre&gt;public void Push(T item) {
    Node node = new Node(item);
    do {
        node.m_next = m_head;
    }
    while (Interlocked.CompareExchange(ref m_head, node, node.m_next) != node.m_next);
}&lt;/pre&gt;
&lt;p&gt;Or, as words:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create a new node&lt;/li&gt;
&lt;li&gt;Set the node's next field to the current head&lt;/li&gt;
&lt;li&gt;atomically{I think the current head is the value of the node's next field. If it is, replace it with the new node}. If the head was't actually the same as the next field, go back to 2.&lt;/li&gt;
&lt;/ol&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack3.png"&gt;

&lt;p&gt;This has eliminated the race condition we had between steps 2 and 3. Only if the head hasn't been changed is the new node pushed onto the stack as the new head. This loop repeats until &lt;code&gt;CompareExchange&lt;/code&gt; successfully sets the head to the new node.&lt;/p&gt;

&lt;p&gt;This can easily be generalised to a range of values in &lt;code&gt;PushRange&lt;/code&gt; - we can simply change step 1 to construct the list of nodes holding the values being pushed, then using the &lt;code&gt;m_next&lt;/code&gt; field of the tail node, and the top node as the new head, like so:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/ConcurrentStack2.png"&gt;

&lt;p&gt;So that's pushing values onto the stack. What about removing them?&lt;/p&gt;

&lt;h4&gt;Popping values&lt;/h4&gt;
&lt;p&gt;As you can expect, popping items from the stack can be done in a similar way to pushing them:
&lt;/p&gt;&lt;pre&gt;Node topNode;
do {
    topNode = m_head;
}
while (Interlocked.CompareExchange(ref m_head, m_head.m_next, topNode) != topNode);&lt;/pre&gt;

&lt;p&gt;However, it's not quite so simple as that. For one thing, the stack may be empty, so &lt;code&gt;m_head&lt;/code&gt; might be null. For another, you need to be able to pop several items off the stack as a single atomic operation, which requires navigating several &lt;code&gt;m_next&lt;/code&gt; references at once. We need to be slightly more clever about it.&lt;/p&gt;

&lt;p&gt;Let's cover checking for an empty stack first. Checking for null is easy, right? &lt;code&gt;if (m_head == null) return false;&lt;/code&gt; So we put this at the start of the pop method:
&lt;/p&gt;&lt;pre&gt;public bool TryPop(out T item) {
    if (m_head == null) {
        item = default(T);
        return false;
    }

    Node topNode;
    do {
        topNode = m_head;
    }
    while (Interlocked.CompareExchange(ref m_head, m_head.m_next, topNode) != topNode);
    
    item = topNode.m_value;
    return true;
}&lt;/pre&gt;
&lt;p&gt;However, again, we've got a race condition. The null check only happens once. If another thread were to empty the stack while the current thread is inside the while loop, this leaves us with a &lt;code&gt;NullReferenceException&lt;/code&gt; lying in wait when we try and get &lt;code&gt;m_head.m_next&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The simple way to fix this is to do the null check inside the loop, eliminating the race condition:&lt;/p&gt;
&lt;pre&gt;public bool TryPop(out T item) {
    Node topNode;
    do {
        topNode = m_head;
        
        if (topNode == null) {
            item = default(T);
            return false;
        }
    }
    while (Interlocked.CompareExchange(ref m_head, m_head.m_next, topNode) != topNode);
    
    item = topNode.m_value;
    return true;
}&lt;/pre&gt;

&lt;p&gt;Sorted. So, what about popping multiple values off the stack? Again, like the null check, this has to be performed inside the while loop using the 'snapshot' of the head we've taken in &lt;code&gt;topNode&lt;/code&gt;:

&lt;/p&gt;&lt;pre&gt;public int TryPopRange(T[] items, int count) {
    int i;
    Node topNode;
    do {
        topNode = m_head;
        Node curNode = topNode;
        i=0;

        while (i&amp;lt;count &amp;amp;&amp;amp; curNode != null) {
            items[i] = curNode.m_value;
            curNode = curNode.m_next;
            i++;
        }
    }
    while (Interlocked.CompareExchange(ref m_head, curNode, topNode) != topNode);
    
    return i;
}&lt;/pre&gt;

&lt;p&gt;In fact, this idea of taking the current head, doing some work on it, then atomically swapping it if it hasn't been changed in the meantime can be generalised:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Take an atomic snapshot of the current state&lt;/li&gt;
&lt;li&gt;Do some work on the snapshot locally to the current thread&lt;/li&gt;
&lt;li&gt;Atomically make the work visible to all other threads, if the snapshot you took is still a valid representation of the state&lt;/li&gt;
&lt;li&gt;If it isn't, go back to the start and try again&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is, at its core, how &lt;code&gt;ConcurrentStack&lt;/code&gt; works.&lt;/p&gt;

&lt;h4&gt;Performing atomic operations&lt;/h4&gt;
&lt;p&gt;So, what underlying aspects of the system let us perform these operations atomically?
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Object references are atomic&lt;/strong&gt;&lt;br&gt;
This lets us take an atomic snapshot of the current state simply by storing the value of &lt;code&gt;m_head&lt;/code&gt; in a local variable; &lt;code&gt;m_head&lt;/code&gt; will always be pointing at a valid node (or null), it won't be 'half-changed'.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;Interlocked.CompareExchange&lt;/code&gt; is atomic&lt;/strong&gt;&lt;br&gt;
This ensures the state-check-and-switch won't be afflicted by race conditions with other threads.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;m_head&lt;/code&gt; is a volatile variable&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;m_head&lt;/code&gt; is marked as &lt;code&gt;volatile&lt;/code&gt;. This means that any changes to that variable by one thread are instantly visible to all other threads on the system; there's no caching to get in the way. Any threads that are attempting to modify &lt;code&gt;m_head&lt;/code&gt; will be forced to try again, and they won't overwrite the changes already made.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Gory implementation details&lt;/h4&gt;
&lt;p&gt;Now, I said at the start that I wouldn't be covering gory implementation details of these collections. However, in this case, there are some important considerations.&lt;/p&gt;

&lt;p&gt;If you decompile &lt;code&gt;ConcurrentStack&lt;/code&gt;, you'll notice there's a &lt;code&gt;Push&lt;/code&gt; method and a &lt;code&gt;PushCore&lt;/code&gt; method (similarly for &lt;code&gt;TryPop&lt;/code&gt; and &lt;code&gt;TryPopRange&lt;/code&gt;). This is a small but significant performance detail; &lt;code&gt;ConcurrentStack&lt;/code&gt; assumes there aren't going to be any thread collisions. The public methods all optimistically attempt to perform the operation once. Only if another thread gets in the way do the corresponding Core methods get executed. It is these that contain the full loop logic, along with calls to &lt;code&gt;SpinWait.SpinOnce&lt;/code&gt; to try and get the other threads out of the way before they try again.&lt;/p&gt;

&lt;p&gt;In fact, &lt;code&gt;TryPopCore&lt;/code&gt; goes one step further, and implements a randomised &lt;a href="http://en.wikipedia.org/wiki/Exponential_backoff"&gt;exponential backoff&lt;/a&gt; to reduce the chances of several threads all waiting the same amount of time. It does mean that, although &lt;code&gt;ConcurrentStack&lt;/code&gt; is lockless, and so eliminates the risk of deadlock, the performance will suffer if there are many threads all modifying the stack at the same time, as each method will have to try several times before its operation succeeds.&lt;/p&gt;

&lt;p&gt;That's the essence of how &lt;code&gt;ConcurrentStack&lt;/code&gt; is implemented. Next time, I'll take a look at the second lockless concurrent collection, &lt;code&gt;ConcurrentQueue&lt;/code&gt;, which goes about achieving thread-safety in quite a different way.&lt;/p&gt;

&lt;h4&gt;Footnote&lt;/h4&gt;
&lt;p&gt;I'm now on twitter! If you want to get in touch, comment on the series, or suggest other things I could look at, my username is &lt;a href="https://twitter.com/#%21/simonmcooper"&gt;@simonmcooper&lt;/a&gt;&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=105370" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside the Concurrent Collections</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/05/105166.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2012/01/05/105166.aspx</id><published>2012-01-05T14:41:00Z</published><updated>2012-01-05T14:41:00Z</updated><content type="html">&lt;p&gt;The concurrent collections, located in the &lt;code&gt;System.Collections.Concurrent&lt;/code&gt; namespace, were introduced in .NET 4 as thread-safe collections that could be used without locking, and there are plenty of posts and articles out on the interwebs giving an overview of the collections and how to use them.&lt;/p&gt;
&lt;p&gt;Instead, I'll be focusing these posts on how the concurrent collections are implemented; how they achieve thread-safety and the principles behind their implementation. Not only because I find that sort of thing quite interesting, but also to give you ideas about how you might use these principles in your own code.&lt;/p&gt;
&lt;p&gt;Note however, that writing bulletproof thread-safe collections is &lt;em&gt;hard&lt;/em&gt;. &lt;em&gt;Really&lt;/em&gt; hard. Think carefully about somehow using one of the existing collections before trying to write or adapt your own.&lt;/p&gt;

&lt;h4&gt;What is a 'thread-safe' collection?&lt;/h4&gt;
&lt;p&gt;First of all, we need to understand what we mean by 'thread-safe'. Well, let's start with the repository of all human knowledge - &lt;a href="http://en.wikipedia.org/wiki/Thread_safety"&gt;wikipedia&lt;/a&gt;:
&lt;/p&gt;&lt;blockquote&gt;A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time&lt;/blockquote&gt;
OK, well, as an example, if &lt;code&gt;m_Collection&lt;/code&gt; is some sort of 'thread-safe' &lt;code&gt;ICollection&lt;/code&gt;, what will the result of these two threads running concurrently do?&lt;p&gt;&lt;/p&gt;
&lt;table&gt;
    &lt;tr&gt;
        &lt;th&gt;Thread 1&lt;/th&gt;
        &lt;th&gt;Thread 2&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;pre&gt;m_Collection.Add(m_Item);&lt;/pre&gt;&lt;/td&gt;
        &lt;td&gt;&lt;pre&gt;bool removed = m_Collection.Remove(m_Item);&lt;/pre&gt;&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;The answer depends on exactly which order Thread 1 and Thread 2 run with respect to each other. So, whether &lt;code&gt;m_Item&lt;/code&gt; is in &lt;code&gt;m_Collection&lt;/code&gt; after these two threads have run depends entirely on the whim of the operating system. Thread 2 could try and remove an item that's not in the collection, setting &lt;code&gt;removed&lt;/code&gt; to false, then Thread 1 adds the item to the collection. Or Thread 1 could run first, adding the item, which is then immediately removed by Thread 2, setting &lt;code&gt;removed&lt;/code&gt; to true. Thread 1 could then merrily carry on executing, assuming &lt;code&gt;m_Item&lt;/code&gt; is in &lt;code&gt;m_Collection&lt;/code&gt;, not knowing it's been immediately removed by Thread 2.&lt;/p&gt;
&lt;p&gt;That, however, is an implementation detail of whoever wrote the code for Thread 1 and Thread 2. The important thing is that, after these two threads have run the above code in some order, &lt;code&gt;m_Collection&lt;/code&gt; will either have the item in it, or not; it won't be in some corrupted half-state where some internal datastructures think it has the item and some don't, so that (for example) &lt;code&gt;m_Collection.Contains(m_Item)&lt;/code&gt; returns false but the enumerator still returns the item. So, I propose that this is what is meant by a thread-safe collection:
&lt;/p&gt;&lt;blockquote&gt;A thread-safe collection is one that can be modified by multiple threads at the same time without corrupting itself.&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;

&lt;h4&gt;Achieving concurrency&lt;/h4&gt;
&lt;p&gt;There is a very simple way of writing a thread-safe collection - implement &lt;code&gt;ICollection&amp;lt;T&amp;gt;&lt;/code&gt; by wrapping a &lt;code&gt;List&amp;lt;T&amp;gt;&lt;/code&gt; that locks the backing list object on every method call (if you're unclear about what locks are, I'll be covering them below). This means that only one thread can access and modify the backing list at once; thread-safety is therefore maintained. However, this isn't a very good solution; many threads could be blocked by one costly list resize operation. This solution doesn't have very good concurrency.&lt;/p&gt;
&lt;p&gt;Ideally, we want collections that don't keep threads waiting. As we'll see, two of the collections (ConcurrentStack and ConcurrentQueue) achieve thread-safety using no locks at all, and the other collections minimise the chances of two threads blocking on the same lock.

&lt;/p&gt;&lt;h4&gt;Locking and synchronization primitives&lt;/h4&gt;
&lt;p&gt;You can't just create a lockless thread-safe datastructure out of nowhere, you need some underlying support from the hardware, operating system and runtime to achieve this. Before we look at the collections themselves, we need to understand these primitives and what guarantees they provide.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Atomic types&lt;/strong&gt;
&lt;p&gt;Some types in .NET are specified as being 'atomic'. That is, if you change the value of a variable of an atomic type, it is guaranteed to appear to change value instantaneously; another thread won't be able to see the variable 'half-changed'. Object references and primitive types of 4 bytes or shorter are always atomic (ints, chars, booleans, floats etc), but 8-byte primitives are only atomic when running on a 64-bit process. The atomicity of object references is especially important to lockless collections, as we'll see in later posts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Volatile variables&lt;/strong&gt;
&lt;p&gt;I discussed volatility in a &lt;a href="/community/blogs/simonc/archive/2010/11/24/95830.aspx"&gt;previous post&lt;/a&gt;. To recap, marking a variable as volatile means that:
&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;The JIT compiler won't cache the variable in a cpu register; it will always read and write to it using the original memory location. This is important when multiple threads are changing a variable at the same time, so that any changes made to the variable in one thread are immediately picked up by other threads.&lt;/li&gt;
&lt;li&gt;A read or write to a volatile location introduces a memory barrier at that point, so that other reads and writes won't be reordered with respect to each other. This is important later on, as we'll see.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Locks&lt;/strong&gt;
&lt;p&gt;Mutual-exclusion locks are one of the more basic synchronization primitives. Each object has a lock associated with it, and by locking on an object you only allow one thread to have 'ownership' of that lock at a time. This allows only one thread to execute the section of code protected by that lock at a time.&lt;/p&gt;
&lt;p&gt;When the currently executing thread 'releases' the lock, another thread (and only one other thread) is then allowed to take control of the lock and execute the protected code. Any other threads waiting to take the lock are blocked and cannot continue executing until it is their turn to take the lock.&lt;/p&gt;
&lt;p&gt;C# has special syntax for locking on an object:&lt;/p&gt;
&lt;pre&gt;private readonly object m_LockObj = new object();

public void SynchronizedMethod() {
    lock (m_LockObj) {
        // protected code
     }
}&lt;/pre&gt;
C# actually compiles this method to something like this:
&lt;pre&gt;public void SynchronizedMethod() {
    bool lockTaken = false;
    object lockObj = m_LockObj;
    
    try {
        System.Threading.Monitor.Enter(lockObj, ref lockTaken);
        // protected code
    }
    finally {
        if (lockTaken)
            System.Threading.Monitor.Exit(lockObj);
    }
}&lt;/pre&gt;
&lt;p&gt;This uses the &lt;a href="http://msdn.microsoft.com/en-us/library/x090d6tf.aspx"&gt;&lt;code&gt;System.Threading.Monitor&lt;/code&gt;&lt;/a&gt; class to implement the lock. The call to &lt;code&gt;Monitor.Enter&lt;/code&gt; blocks the thread until it can take control of the lock, and the lock is released by &lt;code&gt;Monitor.Exit&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;System.Threading.Interlocked&lt;/code&gt;&lt;/strong&gt;
&lt;p&gt;&lt;code&gt;Interlocked&lt;/code&gt; provides various methods for performing operations that wouldn't normally be atomic in an atomic fashion. For example, &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.interlocked.read.aspx"&gt;&lt;code&gt;Interlocked.Read&lt;/code&gt;&lt;/a&gt; allows an 8-byte &lt;code&gt;long&lt;/code&gt; to be read as an atomic operation (remember, 8-byte primitives are only atomic on 64-bit processes), &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.interlocked.add.aspx"&gt;&lt;code&gt;Interlocked.Add&lt;/code&gt;&lt;/a&gt; allows you to perform &lt;code&gt;a = a + b&lt;/code&gt; (aka &lt;code&gt;a+=b&lt;/code&gt;) atomically, &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.interlocked.decrement.aspx"&gt;&lt;code&gt;Interlocked.Decrement&lt;/code&gt;&lt;/a&gt; performs &lt;code&gt;a = a - 1&lt;/code&gt; (aka &lt;code&gt;--a&lt;/code&gt;) atomically, etc.&lt;/p&gt;
&lt;p&gt;The most important of these is &lt;a href="http://msdn.microsoft.com/en-us/library/4wx2c0dx.aspx"&gt;&lt;code&gt;Interlocked.CompareExchange&lt;/code&gt;&lt;/a&gt;. This family of methods performs the following operation atomically (using the generic overload as an example):&lt;/p&gt;
&lt;pre&gt;public static T CompareExchange&amp;lt;T&amp;gt;(ref T location, T value, T comparand)
  where T : class {
    T originalValue = location;
    if (location == comparand)
        location = value;
    return originalValue;
}&lt;/pre&gt;
&lt;p&gt;This might not seem like a particularly useful atomic operation, but it is crucial to the lockless collections, as we'll see.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;System.Threading.SpinWait&lt;/code&gt;&lt;/strong&gt;
&lt;p&gt;Introduced in .NET 4, this structure encapsulates the idea of a 'spinwait':
&lt;/p&gt;&lt;pre&gt;while (!variableThatShouldBeTrueToContinue) {}&lt;/pre&gt;
&lt;p&gt;This keeps the thread continually spinning round the while loop, repeatedly checking whether another thread has set &lt;code&gt;variableThatShouldBeTrueToContinue&lt;/code&gt; to true. However, performing such a spinwait gives no guarantees that the thread that is meant to set &lt;code&gt;variableThatShouldBeTrueToContinue&lt;/code&gt; will be given the chance to execute; the operating system could, if it wishes, simply keep on executing the spinwait and not switch to other threads that actually change this variable.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;System.Threading.SpinWait&lt;/code&gt; gets round this problem by, as well as simply spinning, occasionally calling &lt;code&gt;Thread.Sleep&lt;/code&gt; and &lt;code&gt;Thread.Yield&lt;/code&gt;. This will respectively encourage and force the operating system to execute other threads, giving a chance for the spinning condition to be satisfied.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Using the concurrent collections&lt;/h4&gt;
&lt;p&gt;At this point it's also worth pointing out that using a thread-safe collection will not make the rest of your code thread-safe and free of race conditions; it is not a panacea. As in the example at the top of this post, using a thread-safe collection does not stop a race condition between adding and removing the same item from the collection. Using a thread-safe collection simply means you have one less thing to worry about when writing threaded code. You still have the rest of your code to worry about!&lt;/p&gt;

&lt;p&gt;That's it for introductions; in the &lt;a href="/community/blogs/simonc/archive/2012/01/12/105370.aspx"&gt;next post&lt;/a&gt;, we'll look at the simplest of the concurrent collections, &lt;code&gt;ConcurrentStack&lt;/code&gt;.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=105166" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Anatomy of a .NET Assembly - Type forwards</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/12/23/104984.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/12/23/104984.aspx</id><published>2011-12-23T14:02:00Z</published><updated>2011-12-23T14:02:00Z</updated><content type="html">&lt;p&gt;If you've ever had a poke around System.dll or System.Core.dll in Reflector, you may have noticed &lt;code&gt;TypeForwardedToAttributes&lt;/code&gt; applied to the assembly:

&lt;/p&gt;&lt;pre&gt;[assembly: TypeForwardedTo(typeof(Lazy&amp;lt;&amp;gt;))]
[assembly: TypeForwardedTo(typeof(LazyThreadSafetyMode))]
[assembly: TypeForwardedTo(typeof(Action))]
[assembly: TypeForwardedTo(typeof(Action&amp;lt;,&amp;gt;))]
[assembly: TypeForwardedTo(typeof(Action&amp;lt;,,&amp;gt;))]
[assembly: TypeForwardedTo(typeof(Action&amp;lt;,,,&amp;gt;))]&lt;/pre&gt;

This post has a look at what these are, and how they're implemented.

&lt;h4&gt;Type forwards&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;TypeForwardedToAttribute&lt;/code&gt; is part of a feature introduced in .NET 2 - &lt;a href="http://msdn.microsoft.com/en-us/library/ms404275.aspx"&gt;Type forwarding&lt;/a&gt;. As the documentation says, this is a feature that allows a type to be moved to a different assembly without having to recompile assemblies using that type. This is used extensively by the class libraries when types were moved from System.Core.dll to mscorlib.dll between .NET 3.5 and 4, allowing assemblies compiled against .NET 2 and 3.5 to run on the .NET 4 framework as-is, without having to be recompiled.&lt;/p&gt;

&lt;p&gt;However, if you think about it, this is much more than a simple attribute; this is a core change to the CLR type resolution mechanism. Every type that is resolved in an assembly first has to check for the existance of a type forward indicating the type has been moved somewhere else.&lt;/p&gt;
&lt;p&gt;This isn't something that can easily be represented in a simple attribute; &lt;code&gt;TypeForwardedToAttribute&lt;/code&gt; is actually an example of a &lt;a href="/community/blogs/simonc/archive/2010/11/30/95936.aspx"&gt;pseudo custom attribute&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;ExportedType&lt;/h4&gt;
&lt;p&gt;First, a bit of background. The &lt;code&gt;ExportedType&lt;/code&gt; metadata table was originally designed to be used in multi-module assemblies as part of the 'public contract' of an assembly; the manifest module for the assembly would contain an entry in &lt;code&gt;ExportedType&lt;/code&gt; for every public type defined in other modules, comprising&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;TypeName&lt;/code&gt; &amp;amp; &lt;code&gt;TypeNamespace&lt;/code&gt;: the exported type's full name&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Implementation&lt;/code&gt;: the module (or nested &lt;code&gt;ExportedType&lt;/code&gt;) the type can be found.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TypeDefId&lt;/code&gt;: the index within the &lt;code&gt;TypeDef&lt;/code&gt; table in the module the type is located.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This allows any tool referencing the multi-module assembly to only need to look in the manifest module to find every public type defined in the assembly and where it can be found; it doesn't have to scan every module comprising the assembly.&lt;/p&gt;
&lt;p&gt;When .NET 2 came along, &lt;code&gt;ExportedType&lt;/code&gt; was repurposed to store type forwards as well. If &lt;code&gt;Implementation&lt;/code&gt; is a reference to an assembly rather than a module, then that assembly is the new location of the type named by &lt;code&gt;TypeName&lt;/code&gt; and &lt;code&gt;TypeNamespace&lt;/code&gt; (type forwards don't allow you to change the type name or namespace).&lt;/p&gt;

&lt;p&gt;So, when the C# compiler sees a type forward in System.Core.dll specified by the attribute
&lt;/p&gt;&lt;pre&gt;[assembly: TypeForwardedTo(typeof(Action))]&lt;/pre&gt;
the &lt;code&gt;Action&lt;/code&gt; type is resolved using the normal C# type resolution rules to &lt;code&gt;[mscorlib]System.Action&lt;/code&gt;, and the compiler generates an entry in &lt;code&gt;ExportedType&lt;/code&gt; like so:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;TypeName&lt;/code&gt;: &lt;code&gt;Action&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TypeNamespace&lt;/code&gt;: &lt;code&gt;System&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Implementation&lt;/code&gt;: assembly reference to &lt;code&gt;mscorlib&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TypeDefId&lt;/code&gt;: 0 (type forwards don't use this field)&lt;/li&gt;
&lt;/ul&gt;
this entry is then followed by the core CLR type resolution mechanism so that any references to &lt;code&gt;[System.Core]System.Action&lt;/code&gt; are transparently redirected to &lt;code&gt;[mscorlib]System.Action&lt;/code&gt; at runtime; assemblies using the forwarded type don't have to be recompiled.&lt;p&gt;&lt;/p&gt;

&lt;h4&gt;TypeForwardedFrom&lt;/h4&gt;
&lt;p&gt;Finally, &lt;code&gt;TypeForwardedFromAttribute&lt;/code&gt; is the counterpart to &lt;code&gt;TypeForwardedToAttribute&lt;/code&gt;; it specifies the assembly a type has been forwarded from (using an assembly name string, rather than a direct metadata assembly reference). However, unlike &lt;code&gt;TypeForwardedTo&lt;/code&gt;, this is a normal attribute, has no effect on CLR type resolution, and exists primarily for bookkeeping purposes.&lt;/p&gt;

&lt;p&gt;Type forwards may seem like a niche feature aimed at library writers, but they are invaluable in the right circumstances. In the BCL, they allow programs compiled against the .NET 2 and 3.5 frameworks to run on the .NET 4 framework without needing to be recompiled.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=104984" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Subterranean IL: Explicit overrides</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/12/12/104739.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/12/12/104739.aspx</id><published>2011-12-12T13:02:00Z</published><updated>2011-12-12T13:02:00Z</updated><content type="html">&lt;p&gt;Normally, virtual method overrides in .NET are done implicitly; if a subclass has a virtual method with the same name and signature as a virtual method in a base class, then the method in the subclass overrides the method in the base class:&lt;/p&gt;
&lt;pre&gt;.class public BaseClass {
    .method public instance virtual string Method1(int32 i1, object o1) { ... }
}

.class public SubClass extends BaseClass {
    // this method implicitly overrides BaseClass::Method1
    .method public instance virtual string Method1(int32 i1, object o1) { ... }
}&lt;/pre&gt;

&lt;p&gt;This is in contrast to C# and VB, where a 'base' virtual method has to be declared as &lt;code&gt;virtual&lt;/code&gt;, and all overrides of that method in subclasses have to be declared as &lt;code&gt;override&lt;/code&gt;. But, similarly to IL, exactly which method it overrides in a base type depends on the name and signature of the methods concerned.&lt;/p&gt;

&lt;h4&gt;Explicit interface implementations&lt;/h4&gt;
&lt;p&gt;So what happens with explicit interface implementations? Well, IL and the assembly metadata allow you to specify a &lt;em&gt;MethodImpl&lt;/em&gt;, which forces a method in the class to override one with a different name (but the same signature) in a base class. C# uses this mechanism to implement explicit interface overrides; this C#:&lt;/p&gt;

&lt;pre&gt;public interface IInterface {
    string MyMethod(string s1);
}

public class MyClass : IInterface {
    public string MyMethod2(string s1) { ... }
    
    // explicit implementation of IInterface.MyMethod
    string IInterface.MyMethod(string s1) { ... }
}&lt;/pre&gt;

is compiled to this IL:
&lt;pre&gt;.class public interface abstract IInterface {
    .method public abstract virtual instance string MyMethod(string s1) {}
}

.class public MyClass implements IInterface {
    // normal MyMethod2 method
    .method public instance string MyMethod2(string s1) {}

    // explicit implementation of IInterface.MyMethod
    .method private virtual final
            instance string MyApp.IInterface.MyMethod(string s1) {
        &lt;strong&gt;.override IInterface::MyMethod&lt;/strong&gt;
        ...
    }
}&lt;/pre&gt;

&lt;p&gt;There are several interesting things to note here:
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;The &lt;code&gt;.override&lt;/code&gt; directive specifies the explicit override. Here, it forces &lt;code&gt;MyClass::MyApp.IInterface.MyMethod&lt;/code&gt; to override &lt;code&gt;IInterface::MyMethod&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The explicit override method is named &lt;code&gt;MyApp.IInterface.MyMethod&lt;/code&gt;. There is no necessity for any explicit overrides to be named anything in particular, but using this name format guarantees there won't be any name conflicts in the same class; neither with normal C# code, as that can't create method names with dots in, nor with any other explicit implementations, as the name contains the C# fully qualified name of the overridden method.&lt;/li&gt;
&lt;li&gt;The explicit override method is marked &lt;code&gt;virtual final&lt;/code&gt;. In order to take part in any virtual method lookup, a method has to be declared &lt;code&gt;virtual&lt;/code&gt;, even if the lookup information is 'forced' by an &lt;code&gt;.override&lt;/code&gt; directive.&lt;/li&gt;
&lt;li&gt;The explicit override method is &lt;code&gt;private&lt;/code&gt;. Although implicit overrides have to have at least the same accessibility as the method they're overriding (ie you can't implicitly override a &lt;code&gt;public&lt;/code&gt; method with an &lt;code&gt;internal&lt;/code&gt; one), these rules don't apply to explicit overrides. In this case, the explicit override can &lt;em&gt;only&lt;/em&gt; be called within &lt;code&gt;MyClass&lt;/code&gt; or through the &lt;code&gt;IInterface&lt;/code&gt; interface.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Abusing explicit overrides&lt;/h4&gt;
&lt;p&gt;Although &lt;code&gt;.override&lt;/code&gt; is used by the C# compiler for explicit interface implementation, it is not limited to that. In fact, the only specific rules governing the use of &lt;code&gt;.override&lt;/code&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The signatures of the methods have to match (as I explored in an &lt;a href="/community/blogs/simonc/archive/2010/07/14/93495.aspx"&gt;earlier post&lt;/a&gt;, you can't implement override variance using explicit overrides). Explicit overrides only allow you to change the name.&lt;/li&gt;
&lt;li&gt;The overridden method has to be in the type hierarchy of the containing class.&lt;/li&gt;
&lt;li&gt;Both overridden and overriding methods have to be virtual.&lt;/li&gt;
&lt;li&gt;The overridden method can't be final.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's it. In particular, notice how there aren't any restrictions on whether the overridden method is on an interface or concrete base class:&lt;/p&gt;

&lt;pre&gt;.class public ToStringExplicitOverride extends [mscorlib]System.Object {
    .method private virtual instance string ExplicitToString() {
        .override [mscorlib]System.Object::ToString
        
        ldstr "Explicit override of object.ToString!"
        ret
    }
}&lt;/pre&gt;

&lt;p&gt;or on the number of &lt;code&gt;.override&lt;/code&gt; declarations on a particular method:&lt;/p&gt;

&lt;pre&gt;.class public ClassA {
    .method public virtual instance void Method1(string s, object o) { ... }
    .method public virtual instance void Method2(string s, object o) { ... }
}

.class public ClassB extends ClassA {
    .method public virtual instance void Method3(string s, object o) {
        .override ClassA::Method1
        .override ClassA::Method2
    }
}&lt;/pre&gt;

&lt;p&gt;or that the overriding method has to be private:&lt;/p&gt;

&lt;pre&gt;.class public ClassA {
    .method public virtual instance void Method1(string s, object o) { ... }
}

.class public ClassB extends ClassA {
    .method public virtual instance void Method2(string s, object o) {
        .override ClassA::Method1
    }
}

.class public ClassC extends ClassB {
    .method public virtual instance void Method2(string s, object o) {
        // this method will be invoked whenever ClassA::Method1
        // is called on an instance of ClassC
    }
}&lt;/pre&gt;

&lt;p&gt;or that the use of explicit and implicit overrides are exclusive of each other:&lt;/p&gt;

&lt;pre&gt;.class public ClassA {
    .method public virtual instance void Method1(string s, object o) { ... }
    .method public virtual instance void Method2(string s, object o) { ... }
}

.class public ClassB extends ClassA {
    .method public virtual instance void Method1(string s, object o) {
        .override ClassA::Method2
        
        // implements ClassA::Method1 implicitly, and ClassA::Method2 explicitly
        // this method will be invoked if ClassA::Method1 or ClassA::Method2
        // is called on an instance of ClassB
    }
    
    .method public virtual instance void Method2(string s, object o) {
        // normally, this would implicitly override ClassA::Method2
        // but this method will NOT be called if ClassA::Method2 is invoked
        // as Method1 explicitly overrides ClassA::Method2 instead
    }
}&lt;/pre&gt;

&lt;p&gt;or any combination of these. The use of explicit overrides is limited only by the compiler writer's ingenuity.&lt;/p&gt;

&lt;h4&gt;Alternate explicit override syntax&lt;/h4&gt;

&lt;p&gt;As well as specifying &lt;code&gt;.override&lt;/code&gt; inside the method body, you can also specify it in the class itself:&lt;/p&gt;

&lt;pre&gt;.class public EqualsClass extends [mscorlib]System.Object {
    .method public virtual instance bool AlternateEquals(object o) { ... }
    
    .override [mscorlib]System.Object::Equals with instance bool AlternateEquals(object)
}&lt;/pre&gt;

&lt;p&gt;This leads onto the question as to whether you can use this syntax to link two disparate arms of a type's inheritance heirarchy. As an example using implicit implementations:&lt;/p&gt;

&lt;pre&gt;.class public interface IInterface {
    .method public virtual abstract instance string Method1() {}
}

.class public abstract BaseClass {    
    .method public virtual instance string Method1() {
        ldstr "foo"
        ret
    }
}

.class public SubClass extends BaseClass implements IInterface {
    // SubClass implements IInterface::Method1 using BaseClass::Method1
}&lt;/pre&gt;

but does this work for explicit implementations?

&lt;pre&gt;.class public interface IInterface {
    .method public virtual abstract instance string Method1() {}
}

.class public abstract BaseClass {    
    .method public virtual instance string Method2() {
        ldstr "foo"
        ret
    }
}

.class public SubClass extends BaseClass implements IInterface {
    .override IInterface::Method1 with instance string BaseClass::Method2()
}&lt;/pre&gt;

well, &lt;code&gt;ilasm&lt;/code&gt; compiles it successfully, but it fails verification, and the CLR fails to load the dll:

&lt;pre&gt;[MD]: Error: Class implements interface but not method
    (class:0x02000004; interface:0x06000001; method:0x0069004d).
    [token:0x09000001]
[MD]: Error: MethodImpl has body from another TypeDef
    (token=0x02000003).
    [token:0x00000001]&lt;/pre&gt;

&lt;p&gt;Although this can be represented in the assembly metadata, the CLR specification disallows explicit overrides using methods in a base class (section 10.3.2). The overriding method &lt;em&gt;must&lt;/em&gt; be in the same type as the &lt;code&gt;.override&lt;/code&gt; directive.&lt;/p&gt;
&lt;p&gt;This is a shame; this seems like a rather arbitary limitation, given that implicit implementations using base class methods work without errors. It would be interesting to hear from the CLR team as to why this limitation exists in the specification.&lt;/p&gt;

&lt;p&gt;In the meantime, you can create a stub method in the implementing class to call the correct method on a base class (which is a technique sometimes used by the C# compiler).&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=104739" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - Experimenting In Public</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/21/104407.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/21/104407.aspx</id><published>2011-11-21T14:27:00Z</published><updated>2011-11-21T14:27:00Z</updated><content type="html">&lt;p&gt;Over the next few weeks, we'll be performing experiments on SmartAssembly to confirm or refute various hypotheses we have about how people use the product, what is stopping them from using it to its full extent, and what we can change to make it more useful and easier to use. Some of these experiments can be done within the team, some within Red Gate, and some need to be done on external users.&lt;/p&gt;

&lt;h4&gt;External testing&lt;/h4&gt;
&lt;p&gt;Some external testing can be done by standard usability tests and surveys, however, there are some hypotheses that can only be tested by building a version of SmartAssembly with some things in the UI or implementation changed. We'll then be able to look at how the experimental build is used compared to the 'mainline' build, which forms our baseline or control group, and use this data to confirm or refute the relevant hypotheses.&lt;/p&gt;
&lt;p&gt;However, there are several issues we need to consider before running experiments using separate builds:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Ideally, the user wouldn't know they're running an experimental SmartAssembly. We don't want users to use the experimental build like it's an experimental build, we want them to use it like it's the real mainline build. Only then will we get valid, useful, and informative data concerning our hypotheses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There's no point running the experiments if we can't find out what happens after the download. To confirm or refute some of our hypotheses, we need to find out how the tool is used once it is installed. Fortunately, we've applied feature usage reporting to the SmartAssembly codebase itself to provide us with that information.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Of course, this then makes the experimental data conditional on the user agreeing to send that data back to us in the first place. Unfortunately, even though this does limit the amount of useful data we'll be getting back, and possibly skew the data, there's not much we can do about this; we don't collect feature usage data without the user's consent. Looks like we'll simply have to live with this.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;What if the user tries to buy the experiment? This is something that isn't really covered by the Lean Startup book; how do you support users who give you money for an experiment? If the experiment is a new feature, and the user buys a license for SmartAssembly based on that feature, then what do we do if we later decide to pivot &amp;amp; scrap that feature? We've either got to spend time and money bringing that feature up to production quality and into the mainline anyway, or we've got disgruntled customers. Either way is bad. Again, there's not really any good solution to this.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Similarly, what if we've removed some features for an experiment and a potential new user downloads the experimental build? (As I said above, there's no indication the build is an experimental build, as we want to see what users really do with it). The crucial feature they need is missing, causing a bad trial experience, a lost potential customer, and a lost chance to help the customer with their problem. Again, this is something not really covered by the Lean Startup book, and something that doesn't have a good solution.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, some tricky issues there, not all of them with nice easy answers. Turns out the practicalities of running Lean Startup experiments are more complicated than they first seem!&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=104407" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - Exercises in Leanness</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/15/104350.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/15/104350.aspx</id><published>2011-11-15T18:00:00Z</published><updated>2011-11-15T18:00:00Z</updated><content type="html">&lt;p&gt;There's a new movement rumbling around Red Gate Towers - the &lt;a href="http://www.amazon.com/Lean-Startup-Entrepreneurs-Continuous-Innovation/dp/0307887898/ref=sr_1_1?ie=UTF8&amp;amp;qid=1321027466&amp;amp;sr=8-1"&gt;Lean Startup&lt;/a&gt;. At its core is the idea that you don't have to be in a company with single-digit employees to be an entrepreneur; you simply have to (being blunt) not know what you should be doing. Specifically, you accept that you don't know everything you need to know in order to create a useful, successful &amp;amp; profitable product.&lt;/p&gt;

&lt;p&gt;This is something that Red Gate has had problems with in the past; we've created products that weren't aimed at the correct market, or didn't solve the problem the user had (although they solved the problem we thought the users had, or the problem the users thought they had). As a result, these products weren't as successful as they could have been.&lt;/p&gt;

&lt;p&gt;The ideas at the core of the Lean Startup help to combat this tendency to build large, well-engineered products that solve the wrong problem. You need to actually test your hypotheses about what the users and the market needs, rather than just running a project based on those untested assumptions. Furthermore, these tests need to be done as fast as possible (on the order of a week) so that, if necessary, you can change the direction of the project without wasting effort going down a dead end. Over time, as more tests are done and more hypotheses are confirmed or refuted, the project moves towards something that solves users' actual problems.&lt;/p&gt;

&lt;p&gt;However, re-aligning the development teams that operate within Red Gate along these lines does itself have some issues; we've got very good at doing large, monolithic releases, with a feature set decided well in advance. Currently it takes about 2 weeks to do install &amp;amp; release testing before a release; this is clearly not practicable for a team doing weekly, or even daily releases. There's also many infrastructure issues to be solved; in our source control, build system, release mechanism, support pages &amp;amp; documentation, licensing system, update system, and download pages. All these need modifications to allow the fast releases necessary for each experiment.&lt;/p&gt;

&lt;p&gt;Not only do we have to change our infrastructure, we have to change our mindset. Doing daily releases means each release won't get nearly as much testing as 'standard' releases. As a team, we have to be prepared that there will be releases that have bugs and issues with them; not only do we have to be prepared to change direction with every experiment we do, but we have to be ready to fix any bugs that are reported very quickly as well.&lt;/p&gt;

&lt;p&gt;The SmartAssembly team is spearheading this move towards leanness within the company, using Feature Usage Reporting (FUR). We think this is a cracking feature that will really help developers 
learn how people use their products, but we need to confirm this 
hypothesis. So, over the next few weeks, we'll be running a variety of experiments on SmartAssembly to either confirm or refute our hypotheses concerning how people use SmartAssembly and apply FUR to their own products.&lt;/p&gt;

&lt;p&gt;In the rest of this series, I'll be documenting how the experiments we perform get on, and our experiences with applying the Lean Startup model to a mature product like SmartAssembly.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=104350" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>The SmartAssembly Rearchitecture</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/14/104330.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/11/14/104330.aspx</id><published>2011-11-14T11:10:00Z</published><updated>2011-11-14T11:10:00Z</updated><content type="html">&lt;p&gt;You may have noticed that not a lot has happened to SmartAssembly in the past few months. However, the team has been very busy behind the scenes working on an entirely new version of SmartAssembly.&lt;/p&gt;

&lt;h4&gt;SmartAssembly 6.5&lt;/h4&gt;
&lt;p&gt;Over the past few releases of SmartAssembly, the team had come to the realisation that the current 'architecture' - grown organically, way before RedGate bought it, from a simple name obfuscator over the years into a full-featured obfuscator and assembly instrumentation tool - was simply not up to the task. Not for what we wanted to do with it at the time, and not what we have planned for the future.&lt;/p&gt;
&lt;p&gt;Not only was it not up to what we wanted it to do, but it was severely limiting our development capabilities; long-standing bugs in the root architecture that couldn't be fixed, some rather...interesting...design decisions, and convoluted logic that increased the complexity of any bugfix or new feature tenfold.&lt;/p&gt;
&lt;p&gt;So, we set out to fix this. Earlier this year, a new engine was written on which SmartAssembly would be based. Over the following few months, each feature was ported over to the new engine and extensively tested by our existing unit and integration tests. The engine was linked into the existing UI (no easy task, due to the tight coupling between the UI and old engine), and existing RedGate products were tested on the new SmartAssembly to ensure the new engine acted in the same way. The result is SmartAssembly 6.5.&lt;/p&gt;

&lt;h4&gt;The risks of a rearchitecture&lt;/h4&gt;
&lt;p&gt;Are there risks to rearchitecting a product like SmartAssembly? Of course. There was a lot of undocumented behaviour in the old engine, and as part of the rearchitecture we had to find this behaviour, define it, and document it. In the process we found some behaviour of the old engine that simply did not make sense; hence the changes in pruning &amp;amp; obfuscation behaviour in the release notes. All the special edge cases we had to find, document, and re-implement.&lt;/p&gt;

&lt;p&gt;There was a chance that these special cases would not be found until near the end of the project, when everything is functionally complete and interacting together. By that stage, it would be hard to go back and change anything without a whole lot of extra work, delaying the release by months. We always knew this was a possibility; our initial estimate of the time required was '4 months, ± 4 months'. And that was including various mitigation strategies to reduce the likelihood of these issues being found right at the end. Fortunately, this worst-case did not happen.&lt;/p&gt;

&lt;p&gt;However, the rearchitecture did produce some benefits. As well as numerous bug fixes that we could not fix any other way, we've also added logging that lets you find out exactly why a particular field or property wasn't pruned or obfuscated. There's a new command line interface, we've tested it with WP7.1 and Silverlight 5, and we've added a new option to error reporting to improve the performance of instrumented apps by ~10%, at the cost of inaccurate line numbers in reports.&lt;/p&gt;

&lt;h4&gt;So? What differences will I see?&lt;/h4&gt;
&lt;p&gt;Largely none. SmartAssembly 6.5 produces the same output as SmartAssembly 6.2. The performance of 6.5 will be much faster for some users, and generally the same as 6.2 for the remaining. If you've encountered a bug with previous versions of SmartAssembly, I encourage you to try 6.5, as it has most likely been fixed in the rearchitecture. If you encounter a bug with 6.5, please do tell us; we'll be doing another release quite soon, so we'll aim to fix any issues caused by 6.5 in that release.&lt;/p&gt;
&lt;p&gt;Most importantly, the new architecture finally allows us to implement some Big Things with SmartAssembly we've been planning for many months; these will fundamentally change how you build, release and monitor your application. Stay tuned for further updates!&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=104330" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Some non-generic collections</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/10/21/103950.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/10/21/103950.aspx</id><published>2011-10-21T10:03:00Z</published><updated>2011-10-21T10:03:00Z</updated><content type="html">&lt;p&gt;Although the collections classes introduced in .NET 2, 3.5 and 4 cover most scenarios, there are still some .NET 1 collections that don't have generic counterparts. In this post, I'll be examining what they do, why you might use them, and some things you'll need to bear in mind when doing so.&lt;/p&gt;

&lt;h4&gt;BitArray&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;System.Collections.BitArray&lt;/code&gt; is conceptually the same as a &lt;code&gt;List&amp;lt;bool&amp;gt;&lt;/code&gt;, but whereas List&amp;lt;bool&amp;gt; stores each boolean in a single byte (as that's what the backing &lt;code&gt;bool[]&lt;/code&gt; does), BitArray uses a single bit to store each value, and uses various bitmasks to access each bit individually. This means that BitArray is eight times smaller than a List&amp;lt;bool&amp;gt;. Furthermore, BitArray has some useful functions for bitmasks, like &lt;code&gt;And&lt;/code&gt;, &lt;code&gt;Xor&lt;/code&gt; and &lt;code&gt;Not&lt;/code&gt;, and it's not limited to 32 or 64 bits; a BitArray can hold as many bits as you need.&lt;/p&gt;

&lt;p&gt;However, it's not all roses and kittens. There are some fundamental limitations you have to bear in mind when using BitArray:
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;&lt;p&gt;It's a non-generic collection. The enumerator returns &lt;code&gt;object&lt;/code&gt; (a boxed boolean), rather than an unboxed bool. This means that if you do this:&lt;/p&gt;
&lt;pre&gt;foreach (bool b in bitArray) { ... }&lt;/pre&gt;
&lt;p&gt;&lt;em&gt;Every single&lt;/em&gt; boolean value will be boxed, then unboxed. And if you do this:&lt;/p&gt;
&lt;pre&gt;foreach (var b in bitArray) { ... }&lt;/pre&gt;
&lt;p&gt;you'll have to manually unbox &lt;code&gt;b&lt;/code&gt; on every iteration, as it'll come out of the enumerator an &lt;code&gt;object&lt;/code&gt;. Instead, you should manually iterate over the collection using a for loop:&lt;/p&gt;
&lt;pre&gt;for (int i=0; i&amp;lt;bitArray.Length; i++) {
    bool b = bitArray[i];
    ...
}&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;Following on from that, if you want to use BitArray in the context of an &lt;code&gt;IEnumerable&amp;lt;bool&amp;gt;&lt;/code&gt;, &lt;code&gt;ICollection&amp;lt;bool&amp;gt;&lt;/code&gt; or &lt;code&gt;IList&amp;lt;bool&amp;gt;&lt;/code&gt;, you'll need to write a wrapper class, or use the &lt;code&gt;Enumerable.Cast&amp;lt;bool&amp;gt;&lt;/code&gt; extension method (although Cast would box and unbox every value you get out of it).&lt;/li&gt;
&lt;li&gt;There is no &lt;code&gt;Add&lt;/code&gt; or &lt;code&gt;Remove&lt;/code&gt; method. You specify the number of bits you need in the constructor, and that's what you get. You can change the length yourself using the Length property setter though.&lt;/li&gt;
&lt;li&gt;It doesn't implement &lt;code&gt;IList&lt;/code&gt;. Although not really important if you're writing a generic wrapper around it, it is something to bear in mind if you're using it with pre-generic code.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;However, if you use BitArray carefully, it can provide significant gains over a List&amp;lt;bool&amp;gt; for functionality and efficiency of space.&lt;/p&gt;

&lt;h4&gt;OrderedDictionary&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;System.Collections.Specialized.OrderedDictionary&lt;/code&gt; does exactly what you would expect - it's an IDictionary that maintains items in the order they are added. It does this by storing key/value pairs in a Hashtable (to get O(1) key lookup) and an ArrayList (to maintain the order). You can access values by key or index, and insert or remove items at a particular index. The enumerator returns items in index order.&lt;/p&gt;

&lt;p&gt;However, the &lt;code&gt;Keys&lt;/code&gt; and &lt;code&gt;Values&lt;/code&gt; properties return ICollection, not IList, as you might expect; CopyTo doesn't maintain the same ordering, as it copies from the backing Hashtable, not ArrayList; and any operations that insert or remove items from the middle of the collection are O(n), just like a normal list.&lt;/p&gt;

&lt;p&gt;In short; don't use this class. If you need some sort of ordered dictionary, it would be better to write your own generic dictionary combining a &lt;code&gt;Dictionary&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt; and &lt;code&gt;List&amp;lt;KeyValuePair&amp;lt;TKey, TValue&amp;gt;&amp;gt;&lt;/code&gt; or &lt;code&gt;List&amp;lt;TKey&amp;gt;&lt;/code&gt; for your specific situation.&lt;/p&gt;

&lt;h4&gt;ListDictionary and HybridDictionary&lt;/h4&gt;
&lt;p&gt;To look at why you might want to use &lt;code&gt;ListDictionary&lt;/code&gt; or &lt;code&gt;HybridDictionary&lt;/code&gt;, we need to examine the performance of these dictionaries compared to Hashtable and Dictionary&amp;lt;object, object&amp;gt;. For this test, I added n items to each collection, then randomly accessed n/2 items:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/Dictionary%20performance.png"&gt;

&lt;p&gt;So, what's going on here? Well, ListDictionary is implemented as a linked list of key/value pairs; all operations on the dictionary require an O(n) search through the list. However, for small n, the constant factor that big-o notation doesn't measure is much lower than the hashing overhead of Hashtable or Dictionary.&lt;/p&gt;

&lt;p&gt;HybridDictionary combines a Hashtable and ListDictionary; for small n, it uses a backing ListDictionary, but switches to a Hashtable when it gets to 9 items (you can see the point it switches from a ListDictionary to Hashtable in the graph). Apart from that, it's got very similar performance to Hashtable.&lt;/p&gt;

&lt;p&gt;So why would you want to use either of these? In short, you wouldn't. Any gain in performance by using ListDictionary over Dictionary&amp;lt;TKey, TValue&amp;gt; would be offset by the generic dictionary not having to cast or box the items you store, something the graphs above don't measure.&lt;/p&gt;

&lt;p&gt;Only if the performance of the dictionary is vital, the dictionary will hold less than 30 items, and you don't need type safety, would you use ListDictionary over the generic Dictionary. And even then, there's probably more useful performance gains you can make elsewhere.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=103950" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>SortedDictionary and SortedList</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/10/05/103667.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/10/05/103667.aspx</id><published>2011-10-05T16:59:00Z</published><updated>2011-10-05T16:59:00Z</updated><content type="html">&lt;p&gt;Apart from &lt;code&gt;Dictionary&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt;, there's two other dictionaries in the BCL - &lt;code&gt;SortedDictionary&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt; and &lt;code&gt;SortedList&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt;. On the face of it, these two classes do the same thing - provide an &lt;code&gt;IDictionary&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt; interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm &amp;amp; datastructure knowledge)&lt;/p&gt;

&lt;h4&gt;SortedDictionary&lt;/h4&gt;
&lt;p&gt;We'll first cover &lt;code&gt;SortedDictionary&lt;/code&gt;. This is implemented as a special sort of binary tree called a &lt;a href="http://en.wikipedia.org/wiki/Red-black_tree"&gt;red-black tree&lt;/a&gt;. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above).&lt;/p&gt;

&lt;p&gt;What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a &lt;code&gt;SortedSet&amp;lt;KeyValuePair&amp;lt;TKey, TValue&amp;gt;&amp;gt;&lt;/code&gt;. One example tree might look like this:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/SortedDictionary1.png"&gt;

&lt;p&gt;Each node in the above tree is stored as a separate &lt;code&gt;SortedSet&amp;lt;T&amp;gt;.Node&lt;/code&gt; object (remember, in a SortedDictionary, T is instantiated to &lt;code&gt;KeyValuePair&amp;lt;TKey, TValue&amp;gt;&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;class Node {
    public bool IsRed;
    public T Item;
    public SortedSet&amp;lt;T&amp;gt;.Node Left;
    public SortedSet&amp;lt;T&amp;gt;.Node Right;
}&lt;/pre&gt;

&lt;p&gt;The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the &lt;code&gt;Left&lt;/code&gt; and &lt;code&gt;Right&lt;/code&gt; node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes.&lt;/p&gt;

&lt;p&gt;This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on.&lt;/p&gt;

&lt;p&gt;And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end.&lt;/p&gt;

&lt;p&gt;As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called &lt;code&gt;TreeSet&lt;/code&gt;). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as &lt;code&gt;SortedSet&lt;/code&gt;.

&lt;/p&gt;&lt;h4&gt;SortedList&lt;/h4&gt;
&lt;p&gt;Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values):&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/SortedDictionary2.png"&gt;

&lt;p&gt;The items in the &lt;code&gt;keys&lt;/code&gt; array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the &lt;code&gt;values&lt;/code&gt; array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'.&lt;/p&gt;

&lt;p&gt;Whenever an item is inserted or removed from the SortedList, a binary search is run on the &lt;code&gt;keys&lt;/code&gt; array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item.&lt;/p&gt;
&lt;p&gt;For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/SortedDictionary3.png"&gt;

&lt;p&gt;and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item:&lt;/p&gt;

&lt;img src="/blogbits/simon.cooper/SortedDictionary4.png"&gt;

&lt;p&gt;If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays.&lt;/p&gt;

&lt;p&gt;As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items).&lt;/p&gt;

&lt;p&gt;As a side effect of using backing arrays, SortedList offers IList &lt;code&gt;Keys&lt;/code&gt; and &lt;code&gt;Values&lt;/code&gt; views that simply use the backing &lt;code&gt;keys&lt;/code&gt; or &lt;code&gt;values&lt;/code&gt; arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer.

&lt;/p&gt;&lt;h4&gt;The Comparison&lt;/h4&gt;
&lt;p&gt;So, when should you use one and not the other? Well, here's the important differences:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;&lt;strong&gt;Memory usage&lt;/strong&gt;&lt;br&gt;
    &lt;p&gt;SortedDictionary and SortedList have got very different memory profiles. SortedDictionary...
        &lt;/p&gt;&lt;ol&gt;
            &lt;li&gt;has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the &lt;code&gt;Node&lt;/code&gt; object.&lt;/li&gt;
            &lt;li&gt;stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes.&lt;/li&gt;
        &lt;/ol&gt;&lt;p&gt;&lt;/p&gt;
        &lt;p&gt;In contrast, SortedList...
        &lt;/p&gt;&lt;ol&gt;
            &lt;li&gt;has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size.&lt;p&gt;&lt;/p&gt;
&lt;p&gt;That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the &lt;code&gt;TrimExcess&lt;/code&gt; method resizes the arrays back down to the actual size needed, or you can simply assign &lt;code&gt;list.Capacity = list.Count&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
            &lt;li&gt;stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with &lt;a href="http://msdn.microsoft.com/en-us/magazine/cc534993.aspx"&gt;Large Object Heap memory fragmentation&lt;/a&gt; as the array resizes, which SortedDictionary doesn't have.&lt;/li&gt;
        &lt;/ol&gt;&lt;p&gt;&lt;/p&gt;
    &lt;/li&gt;
    
    &lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;
    &lt;p&gt;Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection.&lt;/p&gt;
&lt;p&gt;If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you:
&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;need to access items using their index within the collection&lt;/li&gt;
&lt;li&gt;are populating the dictionary all at once from sorted data&lt;/li&gt;
&lt;li&gt;aren't adding or removing keys once it's populated&lt;/li&gt;
&lt;/ul&gt;
then use a SortedList. But if you:
&lt;ul&gt;
&lt;li&gt;don't know how many items are going to be in the dictionary&lt;/li&gt;
&lt;li&gt;are populating the dictionary from random, unsorted data&lt;/li&gt;
&lt;li&gt;are adding &amp;amp; removing items randomly&lt;/li&gt;
&lt;/ul&gt;
then use a SortedDictionary.

&lt;p&gt;The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=103667" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>The .NET Dictionary</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/09/16/103362.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/09/16/103362.aspx</id><published>2011-09-16T17:51:00Z</published><updated>2011-09-16T17:51:00Z</updated><content type="html">&lt;p&gt;To many people, &lt;code&gt;System.Collections.Generic.Dictionary&amp;lt;TKey,TValue&amp;gt;&lt;/code&gt; is just a useful collection. In this post, I'll be looking inside that collection and see how it really works.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Dictionary&lt;/code&gt; is based on a hashtable; for the rest of this post, I'll assume you know roughly how a hashtable works. The &lt;a href="http://en.wikipedia.org/wiki/Hashtable"&gt;Wikipedia article&lt;/a&gt;, as the source of all knowledge algorithmical, provides a good overview. It will also help if you've got the class open in Reflector so you can see what's going on yourself.&lt;/p&gt;

&lt;h4&gt;The basics&lt;/h4&gt;
&lt;p&gt;The core of the &lt;code&gt;Dictionary&lt;/code&gt; type is a standard hashtable, straight out of an algorithms book. However, it's interesting design comes from how it deals with hash collisions, using a variant of &lt;em&gt;chaining&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;When you add an item to a hash table, the hash code of that item determines which slot in a backing array the item is stored. In the perfect case, every item would have a separate hash code, and so every item would have a separate slot in the array. To find an item, you simply hash it and go to that slot in the backing array.&lt;/p&gt;
&lt;p&gt;However, two or more items can hash to the same hash code, and you get a hash collision. When this happens, and you're using chaining, the extra item is stored in a linked list hanging off the array slot. How does the .NET Dictionary handle this?&lt;/p&gt;

&lt;h4&gt;Storing entries&lt;/h4&gt;

&lt;p&gt;To store its data, the Dictionary uses an array of &lt;code&gt;Entry&lt;/code&gt; structs:
&lt;/p&gt;&lt;pre&gt;private struct Entry {
    public TKey key;
    public TValue value;
    public int hashCode;
    public int next;
}&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;key&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; should be self-explanatory. &lt;code&gt;hashCode&lt;/code&gt; provides a fast lookup to the original hash code for the functions that need it, and &lt;code&gt;next&lt;/code&gt; is used when dealing with hash collisions.&lt;/p&gt;

&lt;p&gt;So, the backing array is a &lt;code&gt;Entry[]&lt;/code&gt; called &lt;code&gt;entries&lt;/code&gt;. However, where the entries go in this array &lt;em&gt;isn't&lt;/em&gt; determined by the hash code. There's actually a separate array of ints called &lt;code&gt;buckets&lt;/code&gt; alongside &lt;code&gt;entries&lt;/code&gt;. How is this used?&lt;/p&gt;

&lt;h4&gt;Hashing items&lt;/h4&gt;
&lt;p&gt;When adding an item, the hash code is generated modulo the current array size, and that determines the slot the item is stored in. However, that slot is not the one in &lt;code&gt;entries&lt;/code&gt;, it is actually the one in &lt;code&gt;buckets&lt;/code&gt;. The value in &lt;code&gt;buckets&lt;/code&gt; at the hashed index is then the index of the slot in &lt;code&gt;entries&lt;/code&gt; that the data is actually stored at, and that is simply assigned to the next free slot in the array.&lt;/p&gt;

&lt;p&gt;As an example, say you've got a Dictionary with a capacity of 5:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary1.png"&gt;
&lt;p&gt;You add an item to it, that has a hash code of 3, modulo the size of the dictionary. The actual item goes in the first available slot in &lt;code&gt;entries&lt;/code&gt;, and the slot at index 3 in &lt;code&gt;buckets&lt;/code&gt; then stores the index at which the item is stored in &lt;code&gt;entries&lt;/code&gt;; in this case, 0:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary2.png"&gt;
&lt;p&gt;You then add two more items to it, with hash codes 0 and 2, respectively:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary3.png"&gt;
&lt;p&gt;Nice and simple. But, you then add a fourth item that also hashes to 3. Now, we've already added an item with a hash code of 3; we've got a hash collision, so the Dictionary will have to chain it. However, it doesn't create a separate linked list, it stores the index of the next item in the chain in the &lt;code&gt;next&lt;/code&gt; value of the &lt;code&gt;Entry&lt;/code&gt; struct like so:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary4.png"&gt;
&lt;p&gt;If we were to add another item, also with a hash code of 3, the chain would be extended:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary5.png"&gt;&lt;p&gt;&lt;/p&gt;

&lt;p&gt;This means, that to search for an item with a particular hash code, you go to the &lt;code&gt;Entry&lt;/code&gt; pointed to by the &lt;code&gt;buckets&lt;/code&gt; value at that hash code, and follow the &lt;code&gt;next&lt;/code&gt; pointers until you find the item you need, or the null pointer (signified by a &lt;code&gt;next&lt;/code&gt; of -1; see the &lt;code&gt;FindEntry&lt;/code&gt; method).&lt;/p&gt;

&lt;h4&gt;Removing items&lt;/h4&gt;
&lt;p&gt;So that's adding items. What about removing items?&lt;/p&gt;

&lt;p&gt;Any item in the dictionary can be removed at any time, which is different to adding items, where they simply get put on the end of the &lt;code&gt;entries&lt;/code&gt; array. If you've got lots of additions and removals, this can lead to a lot of wasted space as items in the middle of the array get removed, and new items get added to the end. Well, this is where a variable called &lt;code&gt;freeList&lt;/code&gt; comes into play.&lt;/p&gt;
&lt;p&gt;When an item is removed, the slot it occupied gets added to the &lt;em&gt;freelist chain&lt;/em&gt;. This is a chain of entries, indexed by a single &lt;code&gt;freeList&lt;/code&gt; variable, which marks array slots where additional items can go. This is chain is also linked by the &lt;code&gt;next&lt;/code&gt; pointer in the &lt;code&gt;Entry&lt;/code&gt; structure.&lt;/p&gt;

&lt;p&gt;So, this is what it would look like if you removed the blue item:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary6.png"&gt;
&lt;p&gt;then the orange item:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary7.png"&gt;
&lt;p&gt;Then, if another item is added, instead of adding to the end of the array (and causing a full rehash of the contents), it can simply use the slot pointed to by the &lt;code&gt;freeList&lt;/code&gt; variable:&lt;/p&gt;
&lt;img src="/blogbits/simon.cooper/Dictionary8.png"&gt;
&lt;p&gt;Using this indirect mechanism means that the internal arrays don't need to be resized, and everything re-hashed, until you actually have more items than you can store in the &lt;code&gt;entries&lt;/code&gt; array. It also doesn't cause much performance degredation when the dictionary gets full, which the non-generic &lt;code&gt;System.Collections.Hashtable&lt;/code&gt; had problems with.&lt;/p&gt;

&lt;h4&gt;Some more random observations...&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;Clear&lt;/code&gt; doesn't reset the size of the arrays; if you clear a dictionary with 100,000 items in it, the backing arrays will still be sized to store 100,000 items, potentially wasting a lot of space.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The dictionary enumerator simply iterates down the &lt;code&gt;entries&lt;/code&gt; array. This is why, when you're just adding items to a dictionary, they are returned in the order they were added. Only when you remove items and the freelist becomes involved are items returned in non-consecutive order.&lt;/p&gt;
&lt;p&gt;However, this is very much an implementation detail, and can change in future versions of .NET; do not rely on this behaviour in your own code! Always treat the the enumeration order of any hashtable as random.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=103362" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - Testers</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/07/07/102205.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/07/07/102205.aspx</id><published>2011-07-07T13:50:00Z</published><updated>2011-07-07T13:50:00Z</updated><content type="html">&lt;p&gt;Developers might write good code, but no matter how good they are the result will always have bugs in it. It's up to the testers in the team to make sure the final product is as bug-free as it can be.&lt;/p&gt;

&lt;h4&gt;Deciding what to test&lt;/h4&gt;

&lt;p&gt;Within a project there's normally no official documentation produced, no official record of what the project should accomplish. The closest thing we get to documentation is the greenlight presentation slides (I'll be covering the greenlight process in a later post). This means there's no specification to validate against, no way of confirming that the project is 'complete'. So how does a tester know what to test in the first place?&lt;/p&gt;

&lt;p&gt;Within Red Gate, the same team normally stays on the same product for several minor and major releases (as an example, the same team has worked on SQL Source Control since it's inception back in 2009). Everyone within the team has an intimite knowledge of what the tool does, what problems it solves, how customers use it, and what the main bugs and issues are. Testers are as much a part of the team as the developers or project manager.&lt;/p&gt;

&lt;p&gt;This means testers simply don't need a specification. They know how customers use the tool, what bugs they're running into, how new features should interact with existing ones, and (roughly) how they actually work. The testers are fully involved in the project from day 1. This gives the testers deep knowledge of the application and application domain, so they know where and what they should test to ensure the final product works.&lt;/p&gt;

&lt;h4&gt;How to test a product&lt;/h4&gt;

&lt;p&gt;As with other things at Red Gate, there's no officially-mandated way of testing a product, no documentation that needs to be produced. It's entirely up to the testers to decide how they want to test the product. Most of the time this will involve writing automated unit and integration tests to run on the build server with each new build. When testing the user interface, this usually means manual testing of all the different parts of the UI to make sure it all behaves sensibly whatever actions the user performs (although some testers are experimenting with automated UI testing).&lt;/p&gt;

&lt;p&gt;As a product matures, it gains more and more tests. Not just tests for new features, but tests covering specific bugs encountered by customers. SQL Compare, our oldest existing product, now has over 30,000 unit and integration tests, whereas SmartAssembly (the product I'm working on) was acquired by Red Gate a couple of years ago, and has about 1000 unit tests and counting; Jason's writing more all the time.&lt;/p&gt;

&lt;p&gt;At the end of the project, it's up to the testers to give the final go-ahead to release the product to customers. They need to be satisfy themselves that the product is as bug-free as it can be and the new features work as they should, on every configuration the product will be run on, using whatever tools and methods they see fit. Only then does the installer go up on the website, and the online documentation and product web pages updated with the new features and new documentation.&lt;/p&gt;

&lt;p&gt;Testers are an integral part of the project team, and only they decide when the product can be released. They act as Red Gate's gatekeepers and quality control.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=102205" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - Be Reasonable!</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/06/24/102060.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/06/24/102060.aspx</id><published>2011-06-24T16:14:00Z</published><updated>2011-06-24T16:14:00Z</updated><content type="html">&lt;p&gt;As I discussed in my previous posts, divisions and project teams within Red Gate are allowed a lot of autonomy to manage themselves. It's not just the teams though, there's an awful lot of freedom given to individual employees within the company as well.&lt;/p&gt;

&lt;h4&gt;Reasonableness&lt;/h4&gt;
&lt;p&gt;How Red Gate treats it's employees is embodied in the phrase 'You will be reasonable with us, and we will be reasonable with you'. As an employee, you are trusted to do your job to the best of you ability. There's no one looking over your shoulder, no one clocking you in and out each day. Everyone is working at the company because they want to, and one of the core ideas of Red Gate is that the company exists to 'let people do the best work of their lives'. Everything is geared towards that.&lt;/p&gt;

&lt;p&gt;To help you do your job, office services and the IT department are there. If you need something to help you work better (a third or fourth monitor, footrests, or a new keyboard) then ask people in Information Systems (IS) or Office Services and you will be given it, no questions asked. Everyone has administrator access to their own machines, and you can install whatever you want on it. If there's a particular bit of software you need, then ask IS and they will buy it.&lt;/p&gt;

&lt;p&gt;As an example, last year I wanted to replace my main hard drive with an SSD; I had a summer job at school working in a computer repair shop, so knew what to do. I went to IS and asked for 'an SSD, a SATA cable, and a screwdriver'. And I got it there and then, even the screwdriver. Awesome. I screwed it in myself, copied all my main drive files across, and I was good to go. Of course, if you're not happy doing that yourself, then IS will sort it all out for you, no problems.&lt;/p&gt;

&lt;p&gt;If you need something that the company doesn't have (say, a book off Amazon, or you need some specifications printing off &amp;amp; bound), then everyone has a expense limit of £100 that you can use without any sign-off needed from your managers. If you need a company credit card for whatever reason, then you can get it.&lt;/p&gt;

&lt;p&gt;This freedom extends to working hours and holiday; you're expected to be in the office 11am-3pm each day, but outside those times you can work whenever you want. If you need a half-day holiday on a days notice, or even the same day, then you'll get it, unless there's a good reason you're needed that day. If you need to work from home for a day or so for whatever reason, then you can. If it's reasonable, then it's allowed.&lt;/p&gt;

&lt;h4&gt;Trust issues?&lt;/h4&gt;
&lt;p&gt;A lot of trust, and a lot of leeway, is given to all the people in Red Gate. Everyone is expected to work hard, do their jobs to the best of their ability, and there will be a minimum of bureaucratic obstacles that stop you doing your work. What happens if you abuse this trust?&lt;/p&gt;

&lt;p&gt;Well, an example is company trip expenses. You're free to expense what you like; food, drink, transport, etc, but if you expenses are not reasonable, then you will never travel with the company again. Simple as that. Everyone knows when they're abusing the system, so simply don't do it. Along with reasonableness, another phrase used is 'Don't be a ***'. If you act like a ***, and abuse any of the trust placed in you, even if you're the best tester, salesperson, dev, or manager in the company, then you won't be a part of the company any more.&lt;/p&gt;

&lt;p&gt;From what I know about other companies, employee trust is highly variable between companies, all the way up to CCTV trained on employee's monitors. As a dev, I want to produce well-written &amp;amp; useful code that solves people's problems. Being able to get whatever I need - install whatever tools I need, get time off when I need to, obtain reference books within a day - all let me do my job, and so let Red Gate help other people do their own jobs through the tools we produce.&lt;/p&gt;

&lt;p&gt;Plus, I don't think I would like working for a company that doesn't allow admin access to your own machine and blocks Facebook!&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=102060" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry><entry><title>Inside Red Gate - The Office</title><link rel="alternate" type="text/html" href="http://www.simple-talk.com/community/blogs/simonc/archive/2011/06/15/101951.aspx" /><id>http://www.simple-talk.com/community/blogs/simonc/archive/2011/06/15/101951.aspx</id><published>2011-06-15T16:53:00Z</published><updated>2011-06-15T16:53:00Z</updated><content type="html">&lt;p&gt;The vast majority of Red Gate is on the first and second floors (the second and third floors in US parlance) of an office building in Cambridge Business Park (&lt;a href="http://maps.google.co.uk/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=cambridge+business+park&amp;amp;aq=&amp;amp;sll=52.220207,0.141727&amp;amp;sspn=0.175204,0.445976&amp;amp;ie=UTF8&amp;amp;hq=&amp;amp;hnear=Cambridge+Business+Park,+Cambridge+CB4+0WZ,+United+Kingdom&amp;amp;ll=52.228176,0.153082&amp;amp;spn=0.001369,0.003484&amp;amp;t=h&amp;amp;z=19"&gt;here we are!&lt;/a&gt;). As you can see, the building is split into three sections; the two wings, and the section between them.&lt;/p&gt;

&lt;p&gt;As well as being organisationally separate, the four divisions are also split up in the office; each division has it's own floor and wing, so everyone in the division is working together in the same area (.NET and DBA on the left, SQL Tools and New Business on the right). The non-divisional parts of the business share wings with the smaller divisions, again keeping each group together.&lt;/p&gt;

&lt;h4&gt;The canteen&lt;/h4&gt;
&lt;p&gt;One of the downsides of divisionalisation is that communication between people in different decisions is greatly reduced. This is where the canteen (aka the SQL Servery) comes in. Occupying most of the central section on the first floor, the canteen provides free cooked lunch every day, and is where everyone in the company gathers for lunch.&lt;/p&gt;

&lt;p&gt;The idea is to encourage communication between the divisions; having lunch with people in a different division you wouldn't otherwise talk to helps people keep track of what's going on elsewhere in the company. (I'm still amazed at how the canteen staff provide a wide range of superbly cooked food for over 200 people out of a kitchen in which, if you were to swing a cat, it would get severe head injuries.). There's also table tennis and table football tables that anyone can use, provided you can grab them when they're free!&lt;/p&gt;

&lt;h4&gt;Office layout&lt;/h4&gt;
&lt;p&gt;Cubicles are practically unheard of in the UK, and no one, including the CEOs, has separate offices. The entire office is open-plan, as you can see in &lt;a href="http://www.youtube.com/watch?v=POFICXrCegQ"&gt;this youtube video&lt;/a&gt; from when we first moved in (although all the empty desks are now full!). Neil &amp;amp; Simon, instead of having dedicated offices, move between the different divisions every few months to keep up to date with what's going on around the company; sitting with a division gives you a much better overall impression of how the division's doing than written status reports from the division heads.&lt;/p&gt;

&lt;p&gt;There's also the usual plethora of meeting rooms scattered around the place; when we first moved in in 2009 we had a competition to name them all. We've got Afoxalypse A &amp;amp; B, Seagulls A &amp;amp; B, Traffic Jam, Thinking Hats, Camelids A &amp;amp; B, Horses, etc. All the meeting rooms have pictures on the walls corresponding to their theme, which adds a nice bit of individuality to otherwise fairly drab meeting rooms. Generally, any meeting room can be booked by anyone at any time, although some groups have priority in certain rooms (Camelids B is used a lot for UX testing, the Interview Room is used for, well, interviews).&lt;/p&gt;

&lt;p&gt;And, as you can see from the video, each area has various pictures, post-its, notes, signs, on the walls to try and stop it being a dull office space. Yes, it's still an office, but it's designed to be as interesting and as individual as possible.&lt;/p&gt;&lt;img src="http://www.simple-talk.com/community/aggbug.aspx?PostID=101951" width="1" height="1"&gt;</content><author><name>Simon Cooper</name><uri>http://www.simple-talk.com/community/user/Profile.aspx?UserID=24200</uri></author></entry></feed>
