Click here to monitor SSC
  • Av rating:
  • Total votes: 396
  • Total comments: 41
Paul White

Understanding and Using Parallelism in SQL Server

03 March 2011

SQL Server is able to make implicit use of parallelism to speed SQL queries. Quite how it does it, and how you can be sure that it is doing so, isn't entirely obvious to most of us. Paul White begins a series that makes it all seem simple, starting at the gentle level of counting Jelly Beans

Many experienced database professionals have acquired a somewhat jaded view of parallel query execution. Sometimes, this is a consequence of bad experiences with older versions of SQL Server. Just as frequently, however, this view is the result of misconceptions, or an otherwise incomplete mastery of the techniques required to effectively design and tune queries for parallel execution.

This is the first in a series of articles that will provide the reader with the deep knowledge necessary to make advanced use of the parallel query processing features available in Microsoft SQL Server. Part one provides a step-by-step guide to the fundamentals of parallelism in SQL Server, introducing concepts such as parallel scans and seeks, workers, threads, tasks, execution contexts, and the exchange operators that coordinate parallel activity.

Future instalments will provide further insights into the inner workings of the database engine, and show how targeted parallelism can benefit many real-world environments, not just the data warehousing and decision-support systems normally associated with its use. Systems that are often thought of as having a primarily transaction-processing (OLTP) workload often contain queries and procedures that could benefit from the appropriate use of parallelism.

Perhaps inevitably, this and subsequent instalments contain quite deep technical content in places. Making the most effective use of parallelism requires a good understanding of how things like scheduling, query optimization, and the execution engine really work. Nevertheless, it is hoped that even those who are completely new to the topic will find this series informative and useful.

What is Parallelism?

You have probably heard the phrase "many hands make light work". The idea is that splitting a task among a number of people results in each person doing less. From the individual's perspective, the job seems much easier, even though a similar amount of work is being done overall. More importantly, if the extra people can perform their allocation of work at the same time, the total time required for the task is reduced.

Counting Jelly Beans

Imagine you are presented with a large glass jar full of assorted jelly beans, and asked to count how many there are. Assuming you are able to count beans at an average rate of five per second, it would take you a little over ten minutes to determine that this particular jar contains 3,027 jelly beans.

If four of your friends offer to help with the task, you could choose from a number of potential strategies, but let's consider one that closely mirrors the sort of strategy that SQL Server would adopt. You seat your friends around a table with the jar at its centre, and a single scoop to remove beans from the jar. You ask them to help themselves to a scoop of beans whenever they need more to count. Each friend is also given a pen and a piece of paper, to keep a running total of the number of beans they have counted so far.

Once a person finishes counting and finds the jar empty, they pass their individual bean count total to you. As you collect each subtotal, you add it to a grand total. When you have received a subtotal from each of your friends, the task is complete. With four people counting beans simultaneously the whole task is completed in around two and a half minutes – a four-fold improvement over counting them all yourself. Of course, four people still worked for a total of ten minutes (plus the few seconds it took you to add the last subtotal to the grand total).

This particular task is well-suited to parallel working because each person is able to work concurrently and independently. The desired result is obtained much more quickly, without doing much more work overall.

Counting Beans with SQL Server

SQL Server cannot count jelly beans directly, so we ask it to count the number of rows in a table instead. If the table is small, SQL Server will likely use an execution plan like the one shown in Figure 1.

Figure 1: Serial Counting Plan

This query plan uses a single worker – equivalent to counting all the beans yourself. The plan itself is very simple: the Stream Aggregate operator counts the rows it receives from the Index Scan operator, and returns the result once all rows have been processed. You might have chosen a similar strategy if the jelly bean jar had been almost empty, since you would be unlikely to save much time by splitting such a small number of beans among your friends, and the extra workers might even slow the process down slightly, due to the extra step of adding partial counts together at the end.

On the other hand, if the table is large enough, the SQL Server optimizer may choose to enlist additional workers, producing a query plan like the one shown in Figure 2.

Figure 2: Parallel Counting Plan

The small yellow arrow icons identify operations that involve multiple workers. Each worker is assigned a separate part of the problem, and the partial results are then combined to give a final result. As the manual bean-counting example demonstrated, the parallel plan has the potential to complete much faster than the serial plan, because multiple workers will be actively counting rows, simultaneously.

How Parallelism Works

Imagine for a moment that SQL Server has no built-in support for parallelism. You might try to improve the performance of the original row-counting query by manually splitting the query into equally-sized pieces, and running each one concurrently on a separate connection to the server.

Figure 3: Manual Parallelism

Each query in Figure 3 is written to process a separate range of rows from the table, ensuring that every row from the table is processed exactly once overall. With luck, SQL Server would run each query on a separate processing unit, and you could expect to receive the three partial results in roughly a third of the time. Naturally, you would still need to perform the extra step of adding the three values together to get a correct final result.

Parallel Execution as Multiple Serial Plans

The 'manual parallelism' example is not that far removed from the way SQL Server actually implements its parallel query facility. Recall the parallel query plan from Figure 2, and assume that SQL Server allocates three additional workers to the query at runtime. Conceptually, we can redraw the parallel plan to show SQL Server running three serial plans concurrently (this representation is not strictly accurate, but we will correct that shortly).

Figure 4: Multiple Serial Plans

Each additional worker is assigned to one of the three plan branches that feed into the Gather Streams operator. Notice that only the Gather Streams operator retains the little yellow parallelism icon; it is now the only operator that interacts with multiple workers. This general strategy suits SQL Server for two main reasons. Firstly, all the SQL Server code necessary to execute serial plans already exists, and has been optimized over many years and product releases. Secondly, this method scales extremely well: if more workers are available at runtime, SQL Server can easily add extra plan branches to split the work more ways.

The number of extra workers SQL Server assigns to each parallel plan region at runtime is known as the degree of parallelism (often abbreviated to DOP). SQL Server chooses the DOP just before the query starts executing, and it can change between executions without requiring a plan recompilation. The maximum DOP for each parallel region is determined by the number of logical processing units visible to SQL Server.

Parallel Scan and the Parallel Page Supplier

The problem with the conceptual plan shown in Figure 4 is that each Index Scan operator would count every row in the entire input set. Left uncorrected, the plan would produce incorrect results and probably take longer to execute than the serial version did. The manual parallelism example avoided that issue by using an explicit WHERE clause in each query to split the input rows into three distinct and equally-sized ranges.

SQL Server does not use quite the same approach, because distributing the work evenly makes the implicit assumption that each query will receive an equal share of the available processing resources, and that each data row will require the same amount of effort to process. In a simple example like counting rows in a table (on a server with no other activity) those assumptions may well hold, and the three queries might indeed return their partial results at about the same time.

In general, however, it is easy to think of examples where one or more of those hidden assumptions would not apply in the real world, due to any number of external or internal factors. For example, one of the queries might be scheduled on the same logical processor as a long-running bulk load, while the others run without contention. Alternatively, consider a query that includes a join operation, where the amount of effort required to process a particular row depends heavily on whether it matches the join condition or not. If some queries happen to receive more joining rows than others, the execution times are likely to vary widely, and overall performance will be limited by the speed of the slowest worker.

Instead of allocating a fixed number of rows to each worker, SQL Server uses a Storage Engine feature called the Parallel Page Supplier to distribute rows among the workers, on demand. You will not see the Parallel Page Supplier in a graphical query plan because it is not part of the Query Processor, but we can extend the illustration of Figure 4 to show where it would sit and what its connections would be:

Figure 5: The Parallel Page Supplier

The crucial point is that this is a demand-based scheme; the Parallel Page Supplier responds to requests from workers, providing a batch of rows to any worker that needs more work to do. Referring back to the bean-counting analogy, the Parallel Page Supplier is represented by the scoop used to remove beans from the jar. The single, shared scoop ensures that no two people count the same beans, but there is otherwise nothing to stop the same person collecting more beans, as required. In particular, if one person is slower than the others, that person simply takes fewer scoops from the jar, and the other workers will count more beans to compensate.

In SQL Server, a slow worker makes fewer requests to the Parallel Page Supplier, and so processes fewer rows. Other workers are unaffected, and continue to process rows at their individual maximum rates. In this way, the demand-based scheme provides some measure of resilience to variations in worker throughput. Instead of being bound by the speed of the slowest worker, the performance of the demand scheme degrades gracefully as individual worker throughput declines. Nevertheless, the fact that each worker may process a significantly different number of rows, depending on runtime conditions, can cause other problems (a topic we will return to later in this series).

Note that the use of a Parallel Page Supplier does not prevent SQL Server from using existing optimizations like read-ahead scanning (prefetching data from permanent storage). In fact, it may even be slightly more efficient for the three workers to consume rows from a single, underlying physical scan, rather than from the three separate range scans that we saw in the manual parallelism example.

The Parallel Page Supplier is also not limited to use with index scans; SQL Server uses a Parallel Page Supplier whenever multiple workers cooperatively read a data structure. That data structure may be a heap, clustered table, or an index, and the operation may be either a scan or a seek operation. If the latter point surprises you, consider that an Index Seek operation is just a partial scan i.e. it seeks to find the first qualifying row and then scans to the end of the qualifying range.

Execution Contexts

We now turn to the separate server connections, used in the manual parallelism example to allow concurrent execution. It would not be efficient for SQL Server to actually create multiple new connections for each parallel query executed, but the real mechanism is similar in many ways. Instead of creating a separate connection for each serial query, SQL Server uses a lightweight construct known as an execution context.

An execution context is derived from part of the query plan, at runtime, by filling in details that were not known at the time the plan was compiled and optimized. These details include references to objects that do not exist until runtime (a temporary table created within the same batch, for example) and the runtime values of any parameters and local variables. For more details on execution contexts, see this Microsoft White Paper.

SQL Server runs a parallel plan by deriving DOP execution contexts for each parallel region of the query plan, using a separate worker to run the serial plan portion contained in each context. To help visualise the concept, Figure 6 shows the four execution contexts created for the parallel counting plan we have been working with so far. Each colour identifies the scope of an execution context, and although it is not shown explicitly, a Parallel Page Supplier is again used to coordinate the index scans.

Figure 6: Parallel Plan Execution Contexts

The leftmost execution context of a parallel query plan (the one shown in red, in Figure 6) plays a special coordinating role and is executed by the worker provided by the connection that submitted the query. This 'first' execution context is known as execution context zero, and the associated worker is known as thread zero. We will define some of these terms more precisely in the next section, but for now assume that 'worker' and 'thread' mean roughly the same thing.

To provide a more concrete view of the abstract concepts introduced in this section, Figure 7 shows information obtained by running the parallel row-counting query, with the SQL Server Management Studio (SSMS) option, 'Include Actual Execution Plan', turned on.

Figure 7: Parallel Plan Row Counts

The callouts show the number of rows processed by each worker (thread) at two different points in the plan. The information comes from the SSMS Properties window, which can be accessed by clicking on an operator (or connecting line) and pressing F4. Alternatively, you can right-click an operator or line and choose Properties from the pop-up menu.

Reading from the right, we see row counts for each of the three workers in the parallel part of the plan; notice that two workers processed a similar number of rows (around 40,000), but the third obtained just 32,000 rows from the Parallel Page Supplier. As discussed, the demand-based nature of the process means that the precise number of rows processed by each worker depends on timing issues and processor load (among other things) and often varies between executions, even on a lightly-loaded machine.

The left-side of the diagram shows the three partial results (one from each parallel worker, executing in its own execution context) being collected together and summed to a single result by 'thread zero'. It is a quirk of the SSMS Properties window that 'thread zero' is labelled as 'Thread 0' in parallel parts of a graphical plan, and as 'All threads' in a serial region. If you look instead at the XML on which the graphical plan is based, the 'Runtime Counters Per Thread' element always refers to thread 0, never 'All threads'.

Schedulers, Workers, and Tasks

This article has so far used terms like 'thread' and 'worker' somewhat interchangeably. Now seems like a good time to define some terms a little more precisely.

Schedulers

A scheduler in SQL Server represents a logical processor, which might physically be a CPU, a processor core, or perhaps one of several hardware threads running within a core (hyperthreading). The primary purpose of a scheduler is to allow SQL Server precise control over its own thread scheduling, rather than relying on the generic algorithms used by the Windows operating systems. Each scheduler ensures that only one cooperatively-executing thread is runnable (as far as the Operating System is concerned) at any given moment, which has important benefits such as reduced context-switching, and a reduced number of calls into the Windows kernel. Part three of this series covers the internals of task scheduling and execution in much more detail.

Information about schedulers is shown in the system dynamic management view (DMV), sys.dm_os_schedulers.

Workers and Threads

A SQL Server worker is an abstraction that represents either a single operating system thread or a fiber (depending on the configuration setting 'lightweight pooling'). Very few systems run with fiber-mode scheduling enabled, so many texts (including much of the official documentation) refer to 'worker threads' - emphasising the fact that, for most practical purposes, a worker is a thread. A worker (thread) is bound to a particular scheduler for its entire lifetime. Information about workers is shown in the DMV sys.dm_os_workers.

Tasks

Books Online has this to say about tasks:

A task represents a unit of work that is scheduled by SQL Server. A batch can map to one or more tasks. For example, a parallel query will be executed by multiple tasks.

To expand on that rather terse definition, a task is a piece of work performed by a SQL Server worker. A batch that contains only serial execution plans is a single task, and will be executed (from start to finish) by the single connection-provided worker. This is the case even if execution has to pause to wait for another event to complete (such as a read from disk). A single worker is assigned one task, and cannot perform any other tasks until it is fully completed.

Execution Contexts

If a task describes the work to be done, an execution context is where that work takes place. Each task runs inside a single execution context, identified by the exec_context_id column in the sys.dm_os_tasks DMV (you can also see execution contexts using the ecid column in the backward-compatibility view sys.sysprocesses).

The Exchange Operator

To recap briefly, we have seen that SQL Server executes a parallel plan by concurrently running multiple instances of a serial plan. Each serial plan is a single task, run by a dedicated worker thread inside its own execution context. The final ingredient in a parallel plan is the exchange operator, which is the 'glue' SQL Server uses to connect together the execution contexts of a parallel plan. More generally, a complex query plan might contain any number of serial or parallel regions, connected by exchange operators.

So far, we have seen just one form of the exchange operator, namely Gather Streams, but it can appear in graphical plans in two further forms:

Figure 8: Exchange Logical Operations

All forms of the exchange operator are concerned with moving rows between one or more workers, distributing the individual rows among them as it goes. The different logical forms of the operator are used by SQL Server to introduce a new serial or parallel region, or to redistribute rows at the interface between two parallel regions.

The single physical exchange operator is even more flexible than its three logical forms might suggest. Not only can it split, combine, or redistribute rows among the workers connected to it, it can also:

  • Use one of five different strategies to determine to which output to route an input row
  • Preserve the sort order of the input rows, if required
  • Much of this flexibility stems from its internal design, so we will look at that first.

Inside an Exchange

The exchange operator has two distinct sub-components:

  • Producers, which connect to the workers on its input side
  • Consumers, which connect to workers on its output side

Figure 9 shows a magnified view of the (multi-coloured) Gather Streams operator, from Figure 6.

Figure 9: Inside a Gather Streams Exchange

Each producer collects rows from its input and packs them into one or more in-memory buffers. Once a buffer is full, the producer pushes it to the consumer side. Each producer and consumer runs on the same worker thread as the execution context to which it is connected (as indicated by the common colours). The consumer side of the exchange reads a row from an exchange buffer whenever it is asked for one by its parent operator (the red-shaded Stream Aggregate in this case).

One of the major benefits of this design is that the complexities normally associated with sharing data between multiple threads of execution can be handled by SQL Server inside one operator. The other, non-exchange operators in the plan are all running serially, and do not have to be concerned with such things.

The exchange operator uses buffers to minimize overheads and to implement a basic kind of flow control (to prevent fast producers getting too far ahead of a slow consumer, for example). The precise arrangement of buffers varies with the type of exchange, whether or not it is required to preserve order, and how it decides to which consumer a producer row should be routed.

Routing Rows

As noted, an exchange operator can decide to which consumer a producer should route a particular row. This decision depends on the Partitioning Type specified for the exchange, and there are five options.

Partitioning Type

Description

Hash

Most common. The consumer is chosen by evaluating a hash function on one or more column values in the current row.

Round Robin

Each new row is sent to the next consumer in a fixed sequence.

Broadcast

Each row is sent to all consumers.

Demand

The row is sent to the first consumer that asks for one. This is the only partitioning type where rows are pulled from the producer by the consumer inside the exchange operator.

Range

Each consumer is assigned a non-overlapping range of values. The range into which a particular input column falls determines which consumer gets the row.

The Demand and Range partitioning types are much less common than the first three, and are generally only seen in query plans that operate on partitioned tables. The Demand type is used in collocated partitioned joins to assign a partition id to the next worker thread. The Range partitioning type is used, for example, when creating partitioned indexes. The type of partitioning used, and any column values used in the process, is visible in the graphical query plan:

Figure 10: Exchange Partitioning Information

The more common partitioning types will be covered in detail later in the series.

Preserving Input Order

An exchange operator may optionally be configured to preserve sort order. This is useful in plans where rows entering the exchange are already sorted (following an earlier sort, or as a consequence of an ordered read from an index) in a way that is useful to a later operator. If the exchange did not preserve order, the optimizer would have to introduce an extra Sort operator after the exchange to re-establish the required ordering. Common operators that require sorted inputs include Stream Aggregate, Segment, and Merge Join. Figure 11 shows an order-preserving Repartition Streams exchange in action:

Figure 11: An Order-Preserving Repartition Streams Exchange

Rows arriving on the three inputs to the exchange are in sorted order (sorted, that is, from the point of view of the individual workers). An order-preserving exchange, known as a merging exchange, ensures that the worker(s) on its output side receive rows in the same sort order (though the distribution will usually be different, of course).

A Gather Streams exchange can also preserve sort order, if required (and Distribute Streams exchange has no other option, if you think about it). In any case, if the exchange is a merging exchange, the exchange operator has an 'Order By' attribute, as shown in Figure 12:

Figure 12: The 'Order By' Attribute of a Merging Exchange

Note that a merging exchange does not perform any sorting itself; it is limited to preserving the sort order of rows arriving on its inputs. Merging exchanges can be much less efficient than the non-order-preserving variety, and are associated with certain performance problems. This is another topic we will cover in more detail later on in the series.

Summary

This introduction to parallelism used a simple query, and a related real-world example, to explore the model used by SQL Server to allow queries to automatically benefit from the extra processing power provided by modern multi-core servers, without requiring the developer to consider any of the complexities normally associated with multi-threaded designs.

We saw that a parallel query plan may contain any number of parallel and serial regions, bounded by exchange operators. The parallel zones expand into multiple serial queries, each of which uses a single worker thread to process a task within an execution context. The exchange operators are used to route rows between workers, and are the only operators in a parallel plan that interact directly with more than one worker. Finally, we saw that SQL Server provides a Parallel Page Supplier, which allows multiple workers to cooperatively scan a table or index, while guaranteeing correct results.

The next part in this series builds on the foundations provided in this introduction, and looks at how execution contexts, worker threads, and exchange operators are applied to queries using parallel hash and merge joins. We will also look at exchange partitioning types in more detail, and examine a query optimization that is only possible in parallel plans; one which can result in a parallel query using less processor time than the equivalent serial query, while also returning results more quickly.

Paul White

Author profile:

Paul White is an independent SQL Server technical consultant, specializing in performance tuning. Paul lives in Paraparaumu, New Zealand, and has worked with SQL Server since version 6.5. Paul writes a highly-regarded SQL Server technical blog at http://sqlblog.com/blogs/paul_white/ and is a regular contributor to http://www.sqlservercentral.com/. You can also follow him on Twitter at @SQL_Kiwi.

Search for other articles by Paul White

Rate this article:   Avg rating: from a total of 396 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: simply amazing
Posted by: robert matthew cook (view profile)
Posted on: Thursday, March 03, 2011 at 11:10 PM
Message: hope this one earns you a well deserved book deal and more just one mvp vote ;)

Subject: well done
Posted by: mpawlikowski (not signed in)
Posted on: Friday, March 04, 2011 at 7:23 AM
Message: can't wait for next parts :)
great job.

Subject: Great work
Posted by: Adam Machanic (view profile)
Posted on: Friday, March 04, 2011 at 11:46 AM
Message: A fantastic read on a favorite topic of mine.

Subject: A well written artilce
Posted by: Suresh Penumatsa (not signed in)
Posted on: Friday, March 04, 2011 at 12:08 PM
Message: I have been waiting for a good article on Parallelism and just found it. Looking forward for the remaning pieces.

Subject: Great work
Posted by: Sankar Reddy (not signed in)
Posted on: Friday, March 04, 2011 at 12:09 PM
Message: Paul,

I am simply amazed with the quality of your work and when are you writing your book so that I can pre-order on amazon?

Subject: awsome
Posted by: Dant (not signed in)
Posted on: Friday, March 04, 2011 at 12:39 PM
Message: awsome, really good article. looking forward for the next ones

Subject: General Principles?
Posted by: BuggyFunBunny (view profile)
Posted on: Friday, March 04, 2011 at 1:08 PM
Message: I checked the IBM and Oracle standard web sites, and, among other things, this piece is a much airier read.

Beyond that two questions/observations:

- it appears that any parallel database engine would have the semantics described here (jargon not so much), since this is essentially a math problem

- those other two appear to require more manual fiddling to work; anyone with direct experience care to comment

Subject: Nice article
Posted by: Dugi (view profile)
Posted on: Friday, March 04, 2011 at 3:01 PM
Message: Paul this is nice article thanks for sharing!

Subject: what causes workers to be slow?
Posted by: AlexK (view profile)
Posted on: Friday, March 04, 2011 at 9:13 PM
Message: Paul,

I enjoyed reading this, thanks for writing this up. You are stating that "In SQL Server, a slow worker makes fewer requests". What causes some workers to be slower than others?

BuggyFunBunny,

I disagree that "any parallel database engine would have the semantics described here". The engines that implement "shared nothing" approach work differently - each worker has its own jar of beans and cannot access other jars. AFAIK DB2 EEE and Teradata share nothing between nodes.


Subject: Great article
Posted by: Robert L Davis (not signed in)
Posted on: Friday, March 04, 2011 at 9:40 PM
Message: Great article Paul!!

Subject: Comments so far
Posted by: Paul White NZ (view profile)
Posted on: Friday, March 04, 2011 at 11:59 PM
Message: Thanks everyone!

Alex, by 'being slower' I mean that a worker gets less time scheduled on a logical processor. If a scheduler has more than one worker in its runnable list, they will share CPU time.

Paul White

Subject: Excellent Post
Posted by: Meher (not signed in)
Posted on: Sunday, March 06, 2011 at 8:39 PM
Message: Great Post Paul.

Thanks
Meher

Subject: Very nice article!
Posted by: Pawel Potasinski (not signed in)
Posted on: Monday, March 07, 2011 at 1:46 AM
Message: Paul,

Congratulation on another brilliant article. I'm really impressed with how easy it comes to you to write really interesting piece of text on SQL Server internals. Also, I'm one of many readers of your fantastic blog. Keep your current style - it's really cool.

Cheers,
Pawel

Subject: Brilliant!
Posted by: Adam Aspin (not signed in)
Posted on: Monday, March 07, 2011 at 2:07 AM
Message: Thank you for the superb explanation

Subject: Outstanding
Posted by: Tom Thomson (not signed in)
Posted on: Monday, March 07, 2011 at 2:12 AM
Message: Thanks for this. Looking forward to next part.

Subject: Nice article, a few questions..
Posted by: Rick (view profile)
Posted on: Monday, March 07, 2011 at 3:12 AM
Message: I am facing a problem currently whereby I am experiencing Exchange Event deadlocks over multiple spids, a bit the same as at http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/ea1a28eb-1ff3-41ca-af22-7110347627bf, but harder to find the exact issue as this seems to have been happening a while all over the system I am now working on.

Our isolation level is serializable at this point, but everywhere I have looked has suggested the problem is due to parallelism. I have set maxdop 1 on the main procedure giving these errors. What else would you suggest in regards to getting these types of deadlocks (have looked at indexing and it doesn't seem to be the issue)?

Thanks

Subject: Excellent article, though the final diagram made my brain ache
Posted by: SteveP (view profile)
Posted on: Monday, March 07, 2011 at 3:12 AM
Message: I may not be understanding it properly but I think the 101 and 112 green boxes on the left hand side are the wrong way round on the final diagram.

Can't wait for the following parts.

Subject: Exchange Spills
Posted by: Paul White NZ (view profile)
Posted on: Monday, March 07, 2011 at 3:29 AM
Message: Hi Rick,

This is a topic I'll cover later on, but these are almost always caused by a sequence of exchanges operation in order-preserving mode, separated by one or more operators that require sorted input e.g. Merge Join, Stream Aggregate, Segment...

In SQL Server 2008 there is a Profiler event to show when this type of exchange 'deadlock' is resolved by spilling exchange packets to disk.

The primary workaround is to avoid these plan shapes (especially parallel Merge Join at high DOP, in my experience). It is sometimes enough to force a hash or loop join instead using a hint, but this is advanced tuning and really requires detailed knowledge of the data concerned.

Paul

Subject: Repartition Streams Diagram
Posted by: Paul White NZ (view profile)
Posted on: Monday, March 07, 2011 at 3:34 AM
Message: Hi Steve P,

The repartition streams shows three input threads with data arriving in ascending order per thread. The same is true for the three threads on the output, but the individual values have been moved between threads by the repartitioning.

Try ignoring the colours on the left hand side, and just looking at the three rows: you should see that the output sort order is correct:

Thread 1: 097,101,127
Thread 2: 100,116,122
Thread 3: 112,120,123

Does that help at all?

Paul

Subject: Nice article, a few questions..
Posted by: Rick (view profile)
Posted on: Monday, March 07, 2011 at 6:11 AM
Message: I am facing a problem currently whereby I am experiencing Exchange Event deadlocks over multiple spids, a bit the same as at http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/ea1a28eb-1ff3-41ca-af22-7110347627bf, but harder to find the exact issue as this seems to have been happening a while all over the system I am now working on.

Our isolation level is serializable at this point, but everywhere I have looked has suggested the problem is due to parallelism. I have set maxdop 1 on the main procedure giving these errors. What else would you suggest in regards to getting these types of deadlocks (have looked at indexing and it doesn't seem to be the issue)?

Thanks

Subject: Exchange Spills
Posted by: Rick (view profile)
Posted on: Monday, March 07, 2011 at 6:14 AM
Message: Thanks for the answer Paul. Data on this one a bit of a nightmare, but slowly winning the battle. :)

Subject: Exchange Spills
Posted by: Paul White NZ (view profile)
Posted on: Monday, March 07, 2011 at 7:32 AM
Message: No worries, Rick. Good luck!

Subject: Excellent Job
Posted by: Mark Parkes (not signed in)
Posted on: Monday, March 07, 2011 at 8:29 AM
Message: Loved it, keep it coming.

Thanks

Subject: Great Job
Posted by: sgray128 (view profile)
Posted on: Monday, March 07, 2011 at 8:50 AM
Message: Paul:
Great job. I'm looking forward to the next one

Steve Gray
www.devshed.us

Subject: Excellent !!!
Posted by: Sagar (not signed in)
Posted on: Monday, March 07, 2011 at 2:10 PM
Message: Thank You Paul.

Subject: Another fantastic article
Posted by: Chris H (not signed in)
Posted on: Tuesday, March 08, 2011 at 2:20 AM
Message: Great stuff Paul. Yout always manage to explain complex technical subjects in a way that even I can understand. I always look out for your articles. Your time spent creating these pieces is greatly appreciated and if you ever write a book, I'll definitely purchase a copy. Thanks a million.

Subject: Thanks Paul
Posted by: SteveP (view profile)
Posted on: Tuesday, March 08, 2011 at 3:15 AM
Message: I think I get it, it preserves order within each subsequent parallel thread but not across them. Which makes sense now my very non parallel brain has had time to look at it again!

many thanks

Steve

Subject: Excellent article
Posted by: Fabricio Catae (not signed in)
Posted on: Tuesday, March 08, 2011 at 7:44 AM
Message: Very interesting topic, and you did a nice job of explaining it. Thanks for sharing the knowledge.

Subject: Great!
Posted by: Aditya Kota (not signed in)
Posted on: Tuesday, March 08, 2011 at 8:17 AM
Message: Excellent Article

Subject: Excellent....
Posted by: TheSmilingDBA (not signed in)
Posted on: Tuesday, March 08, 2011 at 8:43 AM
Message: Some of the information toward the end was very technical, but an excellent example with the jelly beans.

Thanks SQL_Kiwi,
Thomas
http:\\TheSmilingDBA.Blogspot.com

Subject: Excellent article
Posted by: Contractor Dan (not signed in)
Posted on: Tuesday, March 08, 2011 at 12:03 PM
Message: This article put this subject in easy-to-understand terms, and I loved learning in depth about this subject for the first time. I can't wait for the next article. Thanks Paul.

Subject: Just brilliant
Posted by: SqlTheSequel (not signed in)
Posted on: Wednesday, March 09, 2011 at 2:15 AM
Message: Well done, awesome article!

Subject: Comments
Posted by: Paul White NZ (view profile)
Posted on: Wednesday, March 09, 2011 at 9:08 AM
Message: Just thanks to everyone really for the words of encouragement - I hope you all find the remaining instalments as useful.

Paul

Subject: Article
Posted by: AK - Vancouver (not signed in)
Posted on: Thursday, March 10, 2011 at 12:46 AM
Message: I already feel smarter knowing this - thanks, most excellent.

Subject: Excellent
Posted by: mcflyamorim (view profile)
Posted on: Thursday, March 10, 2011 at 5:14 AM
Message: Perfect Paul, I really enjoy read your articles, I'm your fan since your first blog post.

Thanks for that.
[]s
Regards

Subject: wow...very nice article
Posted by: Nilton Pinheiro (not signed in)
Posted on: Thursday, March 10, 2011 at 11:05 AM
Message: Congrats Paul for your nice article! The content is nice to read and very interesting..

Regards

Subject: Great article
Posted by: Anonymous (not signed in)
Posted on: Friday, March 11, 2011 at 3:45 AM
Message: Thanks for the article, can't wait for the next one. :)

Subject: Parallelization for data modification statements (UPDATE, INSERT, DELETE)
Posted by: Michael Brönnimann (view profile)
Posted on: Saturday, May 07, 2011 at 2:45 AM
Message:
Parallelization for queries (SELECT) is covered quite well by the SQL engine itself (thank you for the excellent article!), but when it comes to large volume data modifications (UPDATE, INSERT, DELETE), the standard engine does parallelize towards best use of all available resources (disk, multiple cpu-cores, etc.).

Therefore you may have a look into the approach of SQL Parallel Boost at
http://www.ibax.ch/-IBX-/plain.simpleimagetitletextlinklist.en.solutions.products.parallelboost/default.aspx

A purely SQL engine related parallelisation solution takes advantage of minimized complexity and has no 'external' components like SSIS involved, Furthermore it's the best performing solution regarding task splitting and synchronization, as it hasn't potential connection and communication overhead. The overall performance gain thru parallelisation with SQL Parallel Boost is up to 10 !

In case you don't wan't to rebuild your own solution, SQL Parallel Boost provides a self-contained pure T-SQL based solution, which can be easily embedded in existing applications and ETL process tasks.

Subject: Parallelization of data modifications with SQL Parallel Boost
Posted by: Michael Brönnimann (view profile)
Posted on: Tuesday, August 23, 2011 at 3:58 AM
Message: A free Community Edition of SQL Parallel Boost can be downloaded at http://sqlparallelboost.codeplex.com

Subject: really well written
Posted by: Casey (view profile)
Posted on: Monday, July 01, 2013 at 1:02 PM
Message: and informative. Thank you for sharing.

Subject: excellent
Posted by: sateesh_machineni (view profile)
Posted on: Wednesday, December 25, 2013 at 11:05 AM
Message: Best explanation ever

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

Searching for Strings in SQL Server Databases
 Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Highway to Database Recovery
 Discover the best backup and recovery articles on Simple-Talk, all in one place. Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.