Click here to monitor SSC
  • Av rating:
  • Total votes: 163
  • Total comments: 10
Robert Sheldon

The EXCEPT and INTERSECT Operators in SQL Server

28 February 2011

The UNION,  EXCEPT and INTERSECT operators of SQL enable you to combine more than one SELECT statement  to form a single result set. The UNION operator returns all rows. The INTERSECT operator returns all rows that are in both result sets. The EXCEPT operator returns the rows that are only in the first result set but not in the second. Simple? Rob Sheldon explains all, with plenty of examples.

Quite often, you’re faced with the task of comparing two or more tables, or query results, to determine which information is the same and which isn't. One of the most common approaches to doing such a comparison is to use the UNION or UNION ALL operator to combine the relevant columns of the results that you want to compare. As long as you adhere to the restrictions placed on either of those operators, you can combine data sets whether they come from different databases or even different servers. With the UNION operator, you end up with a result containing every distinct row from the two results combined. However, it becomes more difficult to use UNION to return only the common data that is held in both results, or the different data that exists in one table but not the other(s). To get the results you need, you must use UNION ALL with a GROUP BY clause, though the logic isn’t always obvious on how to do this. And it isn’t any easier to use a JOIN operator to get the result you want. .

Enter the INTERSECT and EXCEPT operators. Beginning with SQL Server 2005, you can use these operators to combine queries and get the results you need. For instance, you can use the INTERSECT operator to return only values that match within both data sets, as shown in the following illustration .

The illustration shows how the INTERSECT operator returns data that is common to both results set; the common data is represented by the area where the two circles intersect. The illustration also shows how the EXCEPT operator works; only data that exists in one of the data sets outside the intersecting area is returned. For instance, if Set A is specified to the left of the EXCEPT operator, only values that are not in Set B are returned. In the illustration above, that would be the data in the left circle, outside the section where the two data sets intersect. The following bullets sum up which operator to use to return different combinations of data:

  • To return the data in Set A that doesn’t overlap with B, use A EXCEPT B.
  • To return only the data that overlaps in the two sets, use A INTERSECT B.
  • To return the data in Set B that doesn’t overlap with A, use B EXCEPT A.
  • To return the data in all three areas without duplicates, use A UNION B.
  • To return the data in all three areas, including duplicates, use A UNION ALL B.
  • To return the data in the non-overlapping areas of both sets, use (A UNION B) except (A INTERSECT B), or perhaps (A EXCEPT B) UNION (B EXCEPT A)

The differences between the INTERSECT and EXCEPT operators and how to use each of them will become clearer as we work through the examples in the article. Just to give you a basic idea of how they work, we’ll start with a rather unrealistic example. To demonstrate those, however, we must first create two test views (using SQL Server 2005–compatible syntax). The first view contains a single column that describes what you might have had for lunch:

CREATE VIEW Lunch

AS

  SELECT 'Beer' AS item

  UNION SELECT 'Olives'

  UNION SELECT 'Bread'

  UNION SELECT 'Salami'

  UNION SELECT 'Calamari'

  UNION SELECT 'Coffee';

GO

The second view also contains a single column and describes what you might have had for dinner:

CREATE VIEW Dinner

AS

  SELECT 'Wine' AS item

  UNION SELECT 'Olives'

  UNION SELECT 'Bread'

  UNION SELECT 'Steak'

  UNION SELECT 'Aubergines'

  UNION SELECT 'Salad'

  UNION SELECT 'Coffee'

  UNION SELECT 'Apple';

GO

Now we can use these two views to demonstrate how to use the UNION, INTERSECT, and EXCEPT operators. I’ve also included a couple examples that use the JOIN operator to demonstrate the differences.

The first example uses the UNION operator to join the two views in order to return everything you’ve eaten today:

SELECT item FROM Lunch

UNION

SELECT item FROM Dinner;

Now we return the same data by using a full outer join:

SELECT DISTINCT COALESCE(Lunch.item, Dinner.item) AS item

FROM Lunch

  FULL OUTER JOIN Dinner

  ON Dinner.item = Lunch.item

Notice that the join requires more complex syntax; however, both statements return the same results, as shown in the following table:

item

Apple

Aubergines

Beer

Bread

Calamari

Coffee

Olives

Salad

Salami

Steak

Wine

 

Now let’s look at how you would return only the food you ate (or drank) for lunch, but did not have for dinner:

SELECT item FROM Lunch

EXCEPT

SELECT item FROM Dinner;

In this case, I used the EXCEPT operator to return the lunch-only items. I could have achieved the same results using the following left outer join:

SELECT Lunch.item

FROM Lunch

  LEFT OUTER JOIN Dinner

  ON Dinner.item = Lunch.item

WHERE dinner.item IS NULL;

Once again, you can see that the join is more complex, though the results are the same, as shown in the following table:

Item

Beer

Calamari

Salami

 

If you wanted to return those items you had for dinner but not lunch, you can again use the EXCEPT operator, but you must reverse the order of the queries, as shown in the following example:

SELECT item FROM Dinner

EXCEPT

SELECT item FROM Lunch;

Notice that I first retrieve the data from the Dinner view. To use the left outer join, you would again have to reverse the order of the tables:

SELECT dinner.item

FROM dinner

  LEFT OUTER JOIN Lunch

  ON Dinner.item = Lunch.item

WHERE Lunch.item IS NULL;

As expected, the results are the same for both SELECT statements:

item

Apple

Aubergines

Salad

Steak

Wine

 

In the next example, I use the INTERSECT operator to return only the food that was eaten at both meals:

SELECT item FROM Dinner

INTERSECT

SELECT item FROM Lunch;

As you can see, I simply connect the two queries with the INTERSECT operator, as I did with the EXCEPT operator. You can achieve the same results by using an inner join:

SELECT Dinner.item

FROM Dinner

  INNER JOIN Lunch

  ON Dinner.item = Lunch.item;

As the following results show, the only items you had at both meals were olives, bread, and coffee:

item

Bread

Coffee

Olives

 

Now let’s look at how you would return a list of food that you ate at one of the meals, but not both meals, in other words, the food you ate other than bread, olives, and coffee. In the following statement, I use a UNION operator to join two SELECT statements:

SELECT item

FROM

  (

    SELECT item FROM Lunch

    EXCEPT SELECT item FROM Dinner

  ) Only_Lunch

UNION

SELECT item

FROM

  (

    SELECT item FROM Dinner

    EXCEPT SELECT item FROM Lunch

  ) Only_Dinner; --Items you only ate once in the day.

Notice that first statement retrieves only the food you ate for lunch, and the second statement retrieves only the food ate for dinner. I achieve this in the same way I did in previous examples—by using the EXCEPT operator. I then used the UNION operator to join the two result sets. You can achieve the same results by using a full outer join:

SELECT COALESCE(Dinner.item, Lunch.item) AS item

FROM Dinner 

  FULL OUTER JOIN Lunch

  ON Dinner.item = Lunch.item

WHERE Dinner.item IS NULL OR Lunch.item IS NULL;

In both examples, the statements return the following results:

item

Apple

Aubergines

Beer

Calamari

Salad

Salami

Steak

Wine

From this point on, I developed the examples on a local instance of SQL Server 2008 and the AdventureWorks2008 database. Each example uses either the INTERSECT or EXCEPT operator to compare data between the Employee and JobCandidate tables, both part of the HumanResources schema. The comparison is based on the BusinessEntityID column in each table. The BusinessEntityID column in the Employee table is the primary key. In the JobCandidate table, the BusinessEntityID column is a foreign key that references the BusinessEntityID column in the Employee table. The column in the JobCandidate table also permits null values.

NOTE:
You can run these examples against the AdventureWorks database on an instance of SQL Server 2005. However, you must first change the BusinessEntityID column name to EmployeeID, and you must change the JobTitle column name to Title.

Working with the INTERSECT Operator

As I mentioned above, when you use the INTERSECT operator to combine queries, the data returned is only that which matches between the two data sets. That means, with regard to the Employee and JobCandidate tables, the result set will include only rows in which the BusinessEntityID value in the Employee table matches the BusinessEntityID value in the JobCandidate table.

In the following example, I create two queries that retrieve data from the Employee and JobCandidate tables and use the INTERSECT operator to combine those queries:

SELECT BusinessEntityID

FROM HumanResources.Employee

INTERSECT

SELECT BusinessEntityID

FROM HumanResources.JobCandidate;

The first SELECT statement, as you can see, retrieves the BusinessEntityID column from the Employee table, and the second SELECT statement retrieves the BusinessEntityID column from the JobCandidate table. The INTERSECT operator combines the two queries.

When you use the INTERSECT operator to combine queries (or EXCEPT, for that matter), the number of columns must be the same in both queries and the columns must be in the same order. In addition, the corresponding columns between the queries must be configured with compatible data types. The example above meets these conditions because each query returns only one column of the same data type (INT). When the INTERSECT operator is used to combine these the two queries, the statement returns the following results:

BusinessEntityID

212

274

As it turns out, the Employee table and JobCandidate table have only two BusinessEntityID values in common. If you were to examine the data in the JobCandidate table, you would find that the query results here are consistent with the data in that table. The table contains only 13 rows, and the BusinessEntityID column is NULL for all but two of the rows. The BusinessEntityID values in those rows are 212 and 274. And, as you would expect, the Employee table also contains a row with a BusinessEntityID value of 212 and a row with a value of 274.

Certainly, as the above example indicates, using the INTERSECT operator to combine the results of two queries together is a straightforward process. The key, as I’ve stated, is to make sure the SELECT lists in the two queries are in sync with each other.

However, that also points out one of the limitations of using the INTERSECT operator to combine queries—and that is, you cannot include columns in one query that are not included of the second query. If you do include multiple matching columns in each query, all the column values must match in order for a row to be returned. For example, suppose you’re retrieving data from two tables that each include columns for employee IDs, first names, and last names. If you want to match the two tables based on those three columns, the three values in the first table must match the three values in the second table for a row to be returned. (At this point, you might be asking yourself what you’re doing with all that redundant data in your database.)

Instead of taking this approach, you may decide to compare the IDs in both tables, but pull the first and last name from only one of the tables. Or you might decide that you want to pull information from one table that is not stored in the other table. However, because columns must correspond between the two queries when using an INTERSECT operator to combine them, you have to find a way to work around this limitation. One of the easiest ways to do that is to put your INTERSECT construction into a common table expression (CTE) and then join the expression to one of the tables to pull the additional data. For instance, the following example includes a CTE that contains the same INTERSECT construction you saw in the example above:

WITH

  cteCandidates (BusinessEntityID)

  AS

  (

    SELECT BusinessEntityID

    FROM HumanResources.Employee

    INTERSECT

    SELECT BusinessEntityID

    FROM HumanResources.JobCandidate

  )

SELECT

  c.BusinessEntityID,

  e.LoginID,

  e.JobTitle

FROM

  HumanResources.Employee AS e

  INNER JOIN cteCandidates AS c

    ON e.BusinessEntityID = c.BusinessEntityID

ORDER BY

  c.BusinessEntityID;

Notice that I’ve created a CTE named cteCandidates. As you would expect, the CTE returns the BusinessEntityID values that are contained in both the Employee and JobCandidate tables. In the primary SELECT statement, I then join the Employee table to the CTE in order to also retrieve the LoginID and JobTitle values from the Employee table. Because I put the INTERSECT join in the CTE, the statement can now return the following results:

BusinessEntityID

LoginID

JobTitle

212

adventure-works\peng0

Quality Assurance Supervisor

274

adventure-works\stephen0

North American Sales Manager

As you can see, I’ve gotten around the limitations of the INTERSECT operator and am now returning additional information from one of the tables. I could have also joined the CTE to a different table in order to include additional information. For example, I might have joined what I have here to the Person table to retrieve the employee’s first and last names. The point is, the CTE let’s you be quite flexible when working with the INTERSECT operator; you can still determine which rows match but also return all the data you need, regardless of the source table.

Working with the EXCEPT Operator

Once you’ve mastered the INTERSECT operator, you have, for the most part, mastered the EXCEPT operator. As I stated earlier, the EXCEPT operator returns only those rows that do not have a match in the joined table. The important point to remember when using the EXCEPT operator is that it makes a difference which side of the operator you place the query. Only data from the query to the left of the operator (before the operator) is returned. Let’s look at an example to demonstrate how this works.

In the following statement, I again combine two queries, one that retrieves data from the Employee table and one that retrieves data from the JobCandidate table:

SELECT BusinessEntityID

FROM HumanResources.Employee

EXCEPT

SELECT BusinessEntityID

FROM HumanResources.JobCandidate;

This statement is nearly identical to the INTERSECT construction you saw in the first two examples, except, of course, for the use of the EXCEPT operator. However, because the query to the left of the operator is retrieving data from the Employee table, the final result set will include data only from that table, and not the JobCandidate table.

The Employee table, as it turns out, contains 290 rows. As you’ll recall from the previous examples, the two rows in the table with the BusinessEntityID values of 212 and 274 match the two rows in the JobCandidate table that also have BusinessEntityID values of 212 and 274. That means, these two rows should be excluded from the result set of the query above, which is exactly what happens. The query returns 288 rows that have BusinessEntityID values of 1 through 290. However, IDs 212 and 274 are not included in those results.

Now let’s look at what happens when you reverse the order of the queries, as I’ve done in the following example:

SELECT BusinessEntityID

FROM HumanResources.JobCandidate

EXCEPT

SELECT BusinessEntityID

FROM HumanResources.Employee;

Notice that the query that retrieves data from the JobCandidate table now comes first, that is, sits to the left of the EXCEPT operator. The results from this statement, as you would expect, are quite different from the previous example. All that is returned is a single NULL value. In other words, according to the results, the JobCandidate table contains no BusinessEntityID values that are not contained in the Employee table. This is, of course, exactly the case.

As with the CTE example above, which uses the INTERSECT operator, you can also use CTEs with EXCEPT operators. But as the last example points out, if your CTE returns no data, your main query will also return no data (at least if you’re using an inner join), but that’s true with either operator. Overall, in fact, you’ll find there’s little difference between the INTERSECT and EXCEPT operators, in terms of how you use them. The difference, of course, is in the results. INTERSECT returns rows common to both queries, and EXCEPT returns only rows in the left query. Both operators, however, are useful additions to the types of joins that the UNION and UNION ALL operators provide. You can find more details about the INTERSECT and EXCEPT operators by referring to the topic “EXCEPT and INTERSECT (Transact-SQL)” in SQL Server Books Online. There you’ll find additional information about each operator and additional examples of how to use them.

Robert Sheldon

Author profile:

After being dropped 35 feet from a helicopter and spending the next year recovering, Robert Sheldon left the Colorado Rockies and emergency rescue work to pursue safer and less painful interests—thus his entry into the world of technology. He is now a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. He has also written news stories, feature articles, restaurant reviews, legal summaries, and the novel 'Dancing the River Lightly'. You can find more information at http://www.rhsheldon.com.

Search for other articles by Robert Sheldon

Rate this article:   Avg rating: from a total of 163 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: You used the word 'easier'...
Posted by: Anonymous (not signed in)
Posted on: Monday, March 07, 2011 at 5:05 AM
Message: You appear to be ignoring basic SQL. Instead of that tortured CTE, how about just:

SELECT e.BusinessEntityID, e.LoginID, e.JobTitle
FROM HumanResources.Employee AS e
JOIN HumanResources.JobCandidate AS c
ON e.BusinessEntityID = c.BusinessEntityID
ORDER BY e.BusinessEntityID;

Or:

SELECT BusinessEntityID, LoginID, JobTitle
FROM HumanResources.Employee AS e
WHERE EXISTS
(SELECT *
FROM HumanResources.JobCandidate
WHERE BusinessEntityID = e.BusinessEntityID)
ORDER BY BusinessEntityID;

The latter performs better with AdventureWorks, which raises an important issue you don't address. Using EXISTS and, especially, NOT EXISTS will often provide the best performance, and doesn't have the fundamental limitation of only returning the values from the column in common.

Subject: Performance
Posted by: Anonymous (not signed in)
Posted on: Monday, March 07, 2011 at 5:56 AM
Message: How does performance compare between EXCEPT/INTERSECT and INNER/OUTER joins? I would have thought that the optimiser would find joins easier to deal with.

Subject: Great job, thanks
Posted by: sgray128 (view profile)
Posted on: Monday, March 07, 2011 at 9:02 AM
Message: Robert:
Great job, thanks

Subject: re: performance
Posted by: Phil Factor (view profile)
Posted on: Monday, March 07, 2011 at 2:49 PM
Message: I would be surprised to find any huge difference in performance between the two approaches, but from my own experience, I've found a slight gain from these 'combine' operators on serious data. I would always use Except/Intersect/Union where I was doing that sort of set-based operation because it gives the query optimizer more options from which to choose the best strategy. You'll also find that comparing the two in performance terms will vary with the data. I'd try it with your data and database to see if it works well for you.

Subject: Great job
Posted by: Kealeboga Palai (not signed in)
Posted on: Thursday, March 10, 2011 at 12:06 AM
Message: Hope to learn much from you.very gud job

Subject: Great explanation!
Posted by: rplata (view profile)
Posted on: Thursday, March 10, 2011 at 9:19 AM
Message: Thanks for a clear and concise explanation of this feature.

Subject: I liked it!
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 10, 2011 at 11:12 AM
Message: Loved the pictures! Very good explanation! Learned something today.

Subject: hope to get more from your
Posted by: nghiavt (view profile)
Posted on: Saturday, March 12, 2011 at 6:40 AM
Message: I'm getting started with simple talk since 6 months ago and i'm always very please with Robert Sheldon posts. thanks again.

Subject: Thank you!
Posted by: wedward (view profile)
Posted on: Sunday, March 13, 2011 at 12:09 PM
Message: I wish I had known about this years ago! All those outer joins with NULLs drives me nuts. EXCEPT will be my best friend now - since I seem to always be comparing data sets in my SQL work (migration/development design).

Thanks again!

Subject: Great Article !
Posted by: dattatreysindol (view profile)
Posted on: Tuesday, March 29, 2011 at 3:47 AM
Message: Great Article ! Very clear and crisp explanation.

I wish SQL Server had a better way of achieving "(A UNION B) except (A INTERSECT B)"/"(A EXCEPT B) UNION (B EXCEPT A)" :)
May be a single construct which can do this, would be a great solution !

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

SQL Server XML Questions You Were Too Shy To Ask
 Sometimes, XML seems a bewildering convention that offers solutions to problems that the average... Read more...

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.