The best SQL puzzles come from real experiences in the workplace. Here, Alex Kozak describes how he took on a task that looked simple for a while. Later, he realized that he's stumbled over an excellent puzzle for Simple-Talk.
Not long ago, one of our managers ask me for help. He wanted me to provide a report for the latest seven days when each cash register in each store was online, processing sales transactions.
At first glance, the task looked very simple. However, later it turned into quite interesting puzzle.
Sample Table
For demonstration purposes, I created the sample table ‘registersOnline’. That table has only three columns – store, register, onlineDays and about 17000 rows.
In order to try all following examples, you will need to download, unzip and execute the scripts from the zipped files in Create_Load_registersOnline.zip which you can get from the speech-bubble in the title-bar
First Solutions
Since the task looked to me like a ranking problem, my first intention was to use a ranking function, as shown in the Listing 1:
However, it has turned out that the customer has SQL Server 2000 and, because of that, I decided to use a correlated sub-query, as demonstrated in the Listing2:
Results: -only a sample shown
store register onlineDays
----- -------- -------------------
203 1 2008-02-25 00:00:00
203 1 2008-02-22 00:00:00
203 1 2008-02-19 00:00:00
203 1 2008-02-15 00:00:00
203 1 2008-02-14 00:00:00
203 1 2008-02-13 00:00:00
203 1 2008-02-12 00:00:00
. . . . . . . . . . . . . . . . . . .
233 3 2008-02-21 00:00:00
233 3 2008-02-20 00:00:00
233 3 2008-02-16 00:00:00
233 3 2008-02-15 00:00:00
233 3 2008-02-14 00:00:00
233 3 2008-02-13 00:00:00
233 3 2008-02-12 00:00:00
233 4 2008-01-05 00:00:00
358 1 2008-02-26 00:00:00
358 1 2008-02-25 00:00:00
358 1 2008-02-22 00:00:00
358 1 2008-02-21 00:00:00
358 1 2008-02-20 00:00:00
358 1 2008-02-19 00:00:00
358 1 2008-02-18 00:00:00
. . . . . . . . . . . . . . . . . . .
10130 4 2008-02-26 00:00:00
10130 4 2008-02-25 00:00:00
10130 4 2008-02-24 00:00:00
10130 4 2008-02-23 00:00:00
10130 4 2008-02-22 00:00:00
10130 4 2008-02-21 00:00:00
10130 4 2008-02-20 00:00:00
(3080 row(s) affected)
Since the manager needed the answer quickly and because I was busy with a few more tasks, I sent the result as it is.
I promised that as soon as I had finished the other tasks, I would create a query that would show the result in a cross-tab format.
However, an hour later, I got a new request.
Incomplete Groups
You probably noticed that in the result of the query from Listing2, some store – register groups are incomplete. That is, there are some registers within the store that were online less than seven times. For example, register 4 from store 233 was online only on 2008-01-05 00:00:00.
A manager found that fact confusing, and asked me to fill all incomplete groups with some arbitrary date, for instance ‘Jan 01, 1900’. For the group with store 233 and register 4 it would mean that one would have to have one row with the real date 2008-01-05 00:00:00 and six rows with an arbitrary date 1900-01-01 00:00:00.
There are a few possible ways to fill incomplete groups.
First is a pure Transact-SQL solution with a loop statement (see Listing3):
Another solution would employ an auxiliary table (see Listing4):
However, there are also the solutions that do not require an auxiliary table.
The Puzzles
The first puzzle
Try to find such a solution. Consider it as a puzzle with the following requirements:
Having table ‘registersOnline’, find the seven most recent days, when each register in each store was online.
If some pair register-store appears online less than seven times, fill that group up to seven rows, where added rows should have an arbitrary date ‘Jan 01, 1900’.
You need to find a single query solution without an auxiliary table, identity columns, ranking functions or constructions like:
SELECT (a1.id + a0.id) id
FROM (
SELECT 0 id UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9
) a0
CROSS JOIN (
SELECT 0 id UNION ALL
SELECT 10 UNION ALL
SELECT 20 UNION ALL
SELECT 30 UNION ALL
SELECT 40 UNION ALL
SELECT 50 UNION ALL
SELECT 60 UNION ALL
SELECT 70 UNION ALL
SELECT 80 UNION ALL
SELECT 90
) a1
ORDER BY 1
All you can use is a table ‘registersOnline’.
Your solution should work in any RDBMS, including two latest versions of Microsoft SQL Server – 2005 and 2000.
The Second Puzzle
In addition, if you are asked to find the most recent 100 days, when the registers were online - your query should work without changes in its structure
Once you have done that, Try to fill the missing rows(dates) in each group not with a sample date ‘Jan 01, 1900’, but with a date that equal to max (or to min) date in that group.
The Third Puzzle
Finally, try to find the solution that shows the result for the last 7 days in a cross-tab format.
Once again, it should be a single query that works in SQL Server 2000 and in SQL Server 2005. You are allowed to use table ‘registersOnline’ only.
As always, the best solution will be awarded with a $50 amazon voucher.
I'll post my solutions as comments after a week or so.