Click here to monitor SSC

Author Profile

Joe Celko

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

Database Normalization Basics

The task of Database Normalization doesn't have to be painful, especially if you follow Old Mother Celko's Normalization Heuristics. Read more...

On Handling Dates in SQL

The calendar is inherently complex by the very nature of the astronomy that underlies the year, and the conflicting historical conventions. The handling of dates in TSQL is even more complex because, when SQL Server was Sybase, it was forced by the lack of prevailing standards in SQL to create its own ways of processing and formatting dates and times. Joe Celko looks forward to a future when it is possible to write standard SQL date-processing code with SQL Server. Read more...

SQL Server SEQUENCE Basics

The SEQUENCE statement introduced in SQL Server 2012 brings the ANSI SQL 2003 standard method of generating IDs. This is a great relief to database professionals as it solves some problems what are awkward to solve with the IDENTITY property. Joe Celko explains the basics of using a SEQUENCE Read more...

Nesting Levels in SQL

The 'Structured' part of SQL denotes the fact that queries can be nested inside each other in such a way that, wherever you can use a table, you can use a table expression. Such derived tables can provide powerful magic, to which is added CTEs and Lateral Tables. Joe Celko explains. Read more...

Window Functions in SQL

SQL's windowing functions are surprisingly versatile, and allow us to cut out all those self-joins and explicit cursors. Joe Celko explains how they are used, and shows a few tricks such as calculating deltas in a time series, and filling in gaps. Read more...

Databases and Dominoes

A Dominoes game of Texas 42 inspires Joe to explore unusual uses for check constraints and views. Sometimes, the best way of discovering useful SQL techniques is to tackle the more unusual problems. Read more...

The SQL of Parts Explosions

Parts explosions present a classic IT problem. How can one calculate such things as weight or cost of assemblies in SQL? Joe shows how it can be done using nested sets, with not an IDENTITY or GUID in sight.. Read more...

UNIQUE Constraints in SQL

Here is an in-depth look at an underused constraint, UNIQUE, that can increase the performance of queries and protect data integrity. Read more...

Row Sorting in SQL

It should be easy to model a game of poker in SQL. The problem is, however, that you need to model a permutation from a set of elements. Joe Celko argues that using a group of columns to do this isn't necessarily a violation of 1NF, since a permutation is atomic. Then comes the second problem: how would you sort such a column-base permutation in order? Sorting columns in SQL? Read more...

Book Review: SQL and Relational Theory by C. J. Date

SQL and Relational Theory, by Chris Date, isn't likely to be a book that SQL's greatest defender, Joe Celko, would agree with. However, following the debates between Date and Celko on the relational purity of SQL has all the fascination of watching Godzilla wrestling King Kong. Read more...

Matrix Math in SQL

Relational Databases have tables as data structures, not arrays. This makes it tricky and slow to do matrix operations, but it doesn't mean it is impossible to do. Joe gives the Celko Slant on how to go about doing Matrix Math in SQL. Read more...

SQL View: Beyond the Basics

Following up from his popular article, SQL View Basics, Joe delves into the main uses of views, explains how the WITH CHECK OPTION works, and demonstrates how the INSTEAD OF trigger can be used in those cases where views cannot be updatable. Read more...

SQL VIEW Basics

SQL Views are essential for the database developer. However, it is common to see them misued, or neglected. Joe Celko tackles an introduction to the subject, but there is something about the topic that makes it likely that even the experienced developer will find out something new from reading it. Read more...

Bin Packing Problems: The SQL

The 'bin packing' problem isn't just a fascination for computer scientists, but comes up in a whole range of real-world applications. It isn't that easy to come up with a practical, set oriented solution in SQL that gives a near-optimal result. Read more...

Voting Paradoxes: a SQL Stumper

Voting systems can become very complex, and some of them are easy to manipulate by tactical voting. Joe takes a couple of voting systems and wonders how you would implement them in SQL. He's even more curious as to how you, the reader, would do so. Read more...

Mimicking Network Databases in SQL

Unlike the hierarchical database model, which created a tree structure in which to store data, the network model formed a generalized 'graph' structure that describes the relationships between the nodes. Nowadays, the relational model is used to solve the problems for which the network model was created, but the old 'network' solutions are still being implemented by programmers, even when they are less effective. Read more...

Mimicking Magnetic Tape in SQL

The sequential nature of early data storage devices such as punched card and magnetic tape once forced programmers to devise algorithms that made the best of sequential access. These ways of doing data-processing have become so entrenched that they are still used in modern relational database systems. There is now a better way, as Joe explains. Read more...

Arrays in SQL that Avoid Repeated Groups

It is certainly possible to fake an Array in SQL, but there are only a few occasions when it would be the best design. Most often, the wish for an array in SQL is a sign of a forlorn struggle against poorly-normalised data. One of the worst sins against Codd is the repeating group, as Joe explains. Read more...

A Tale of Identifiers

Identifiers aren't locators, and they aren't pointers or links either. They are a logical concept in a relational database, and, unlike the more traditional methods of accessing data, don't derive from the way that data gets stored. Identifiers uniquely identify members of the set, and it should be possible to validate and verify them. Celko somehow involves watches and taxi cabs to illustrate the point. Read more...

TIME Gentlemen please! The SQL Server temporal datatypes

If you are still using the old Sybase DateTime datatype, it is a good idea to move your code to the more standard datatypes that were introduced in SQL Server 2008. Joe Celko explains why, and walks through some of the history of the TSQL way of storing and manipulating dates and times. Read more...

Check your Digits

The most persistent struggle in data processing has been to ensure clean data. There are many ways that data can be incorrect and a database must check, as best it can, that the data is correct. The CHECK constraint is ideally suited for this sort of work, and the checking routine can become quite complex when dealing with check digits in data. Read more...

Improving Comparison Operators and Window Functions

It is dangerous to assume that your data is sound. SQL already has intrinsic ways to cope with missing, or unknown data in its comparison predicate operators, or Theta operators. Can SQL be more effective in the way it deals with data quality? Joe Celko describes how the SQL Standard could soon evolve to deal with data in ways that allow aggregation and windowing in cases where the data quality is less than perfect Read more...

SQL Server CASE Law

SQLs CASE expressions can be powerful magic, but can trap the unwary who are used to the more familiar CASE statements of procedural languages. Read more...

Look-up Tables in SQL

Lookup tables can be a force for good in a relational database. Whereas the 'One True Lookup Table' remains a classic of bad database design, an auxiliary table that holds static data, and is used to lookup values, still has powerful magic. Joe Celko explains.... Read more...

BIT of a Problem

The BIT data type is an awkward fit for a SQL database. It doesn't have just two values, and it can do unexpected things in expressions. What is worse, it is a flag rather than a predicate, and so its overuse, along with bit masks, is a prime candidate for being listed as a 'SQL Code Smell'. Joe Celko makes the case. Read more...

Contiguous Time Periods

It is always better, and more efficient, to maintain referential integrity by using constraints rather than triggers. Sometimes it is not at all obvious how to do this, and the history table, and other temporal data tables, presented problems for checking data that were difficult to solve with constraints. Suddenly, Alex Kuznetsov came up with a good solution, and so now history tables can benefit from more effective integrity checking. Joe explains... Read more...

VALUES() and Long Parameter Lists - Part II

The use of the comma-separated list of parameters to a SQL routine, that Phil Factor calls the 'comedy-limited list, is a device that makes seasoned SQL Database developers wince. The best choice of design for passing variable numbers of parameters or tuples to SQL Routines varies according to the importance to you of SQL Standards. Joe Celko discusses the pros and cons of the front-runners Read more...

State Transition Constraints

Data Validation in a database is a lot more complex than seeing if a string parameter really is an integer. A commercial world is full of complex rules for sequences of procedures, of fixed or variable lifespans, Warranties, commercial offers and bids. All this requires considerable subtlety to prevent bad data getting in, and if it does, locating and fixing the problem. Joe Celko shows how useful a State transition graph can be, and how essential it can become with the time aspect added. Read more...

The DIS-Information Principle, Part II

Database design simply involves populating a schema with tables that model sets of entities and relationships. A table will contain Columns that model an entity's attributes and contain scalar values. What could go wrong? Plenty, unfortunately, when these simple principles are misunderstood or flouted, and Joe continues to itemise bad design practices that can cause subsequent grief for the application developers. Read more...

The DIS-Information Principle: A Splitting Headache

You can easily re-factor bad DML code, but if a database design is wrong, you can do little to rescue the problem, even with expert queries. So what constitutes 'wrong RDBMS design? What are these errors that continually crop up? How can you recognise them and fix them? Joe embarks on a new series of articles by identifying a series of bad practices based on the habit of 'splitting' that which shouldn't be split. Read more...

VALUES() and Long Parameter Lists

To make progress as a relational Database programmer, you have to think in terms of sets, rather than lists, arrays or sequential data. Until that point, you'll feel the need to pass lists, arrays and the like to functions and procedures. Joe suggests some unusual ways around the difficulty and broods on the power of the VALUES constructor. Read more...

Binary Trees in SQL

A number of hierarchies and networks are most convenently modelled as binary trees. So what is the best way of representing them in SQL? Joe discards the Nested Set solution in favour of surprisingly efficient solution based on the Binary Heap. Read more...

Book Review: Defensive Database Programming With SQL Server

It distils a great deal of practical experience; the writing of it was a considerable task; It packs in a great deal of information. Alex's book shows how to write robust database applications, and we can all learn from it. We took the book to a critic who never minces his words, and were relieved to find that Joe Celko liked it. Read more...

Celko's SQL Stumper: Eggs in one Basket

Joe Celko reveals the winner of his Easter Stumper: the puzzle of designing an apparently simple database to deal with the process of packing eggs into cartons. It wasn't quite as easy as it looked. Read more...

Procedural, Semi-Procedural and Declarative Programing Part II

SQL Server accommodates a whole range of programming styles and will even allow you to create code that is wholly procedural. Is a declarative approach inevitably better? Can it be difficult to maintain? Can you avoid the performance problems of procedural code by using triggers? Joe adds some thoughts. Read more...

Procedural, Semi-Procedural and Declarative Programming in SQL

A lot of the time, the key to making SQL databases perform well is to take a break from the keyboard and rethink the way of approaching the problem; and rethinking in terms of a set-based declarative approach. Joe takes a simple discussion abut a problem with a UDF to illustrate the point that ingrained procedural reflexes can often prevent us from seeing simpler set-based techniques. Read more...

Celko's SQL Stumper: The Class Scheduling Problem

What can we use in SQL instead of E. F. Codd's T theta operators for best-fit? Joe Celko returns with another puzzle that isn't new, in fact it already features “Swedish”, “Croatian” and “Colombian” solutions in chapter 17 of Joe's 'SQL for Smarties' book. These were all written before CTEs or the new WINDOW functions. Is there now a better solution? Was there one even then? We leave it to the readers to provide the answer! Read more...

Celko's SQL Stumper: The Data Warehouse Problem

Joe Celko comes back with a puzzle that isn't new, but one where the answer he originally gave now seems archaic: It is a deceptively simple problem, but is it true that the new features of SQL have simplified the solution? We leave it to the readers to provide the answer! Read more...

Causation, Correlation and Crackpots

Joe Celko explores the dangers of muddling correlation and causation, emphasises the importance of determining how likely it is that a correlation has occurred by chance, and gets stuck into calculating correlation coefficients in SQL. Along the way, Joe illustrates the consequences of leaping to the wrong conclusion from correlations with tales of Pop Dread. Read more...

Getting rid of SQL Code

Joe becomes intrigued by the way that experts make errors in any area of technology, and suggests that the problem is more that of mindsets than lack of knowledge. He illustrates the point with SQL Development by means of the "Britney Spears, Automobiles and Squids" table, and the tangled Stored procedure, and shows ways of getting rid of both procedural and non-procedural code by adopting a different programming mindset. Read more...

Celko's Summer SQL Stumpers: Prime Numbers

Joe Celko kicks off our series of Summer SQL Stumpers with a challenge to improve on his solution to calculating the prime numbers between 1 and 10000. Once the various solutions have been contributed and judged, the winner will be announced. The competition will be run on Simple-Talk and SQL Server Central together. Read more...

Avoiding the EAV of Destruction

A forum posting, from someone who wanted a better solution to the common problem of handling global settings in a database, leads Joe Celko into a fascinating discussion of the ramifications of the various solutions. Read more...

SQL and the Snare of Three-Valued Logic

The whole subject of the Three-Valued (also known as ternary, trivalent or 3VL) Logic of SQL tends to trip people up. This is hardly surprising in view of the fact that it involves an esoteric Polish mathematician and because it behaves differently in the DDL (Data Declaration Language) and the DML (Data Manipulation Language). In response to requests, Joe Celko comes to the rescue and makes it all seem simple. Read more...

Median Workbench

SQL Server database engine doesn't have a MEDIAN() aggregate function. This is probably because there are several types of median, such as statistical, financial or vector medians. Calculating Medians are essentially a row-positioning task, since medians are the middle value of an ordered result. Easy to do in SQL? Nope. Joe Celko explains why Read more...

Divided We Stand: The SQL of Relational Division

Businesses often require reports that require more than the classic set operators. Surprisingly, a business requirement can often be expressed neatly in terms of the DIVISION relationship operator: How can this be done with SQL Server? Joe Celko opens up the 'Manga Guide to Databases', meets the Database Fairy, and is inspired to explain DIVISION. Read more...

Temporal Data Techniques in SQL

In the first part of this series on Temporal Data, Joe explained how it is that the Common Era calendar is irregular and mentioned that, although there are ANSI/ISO Standards for temporal operations in SQL, every vendor has something different. Now, he discusses other factors to take into account when using temporal data such as Holidays, and discusses a few techniques using Calendar, Report Usage and History tables Read more...

Temporal Datatypes in SQL Server

In the first of a series of articles on the tricks of tackling temporal data in SQL, Joe Celko discusses SQL's temporal data types and agonizes over the fact that, although there are ANSI/ISO Standards for temporal operations in SQL, every vendor has something different. He explains the mysteries of such things as time-zones, lawful time, UTC, CUT, GMT, CE, DST, and EST. Read more...

Unique Experiences!

You'd have thought that a unique constraint was an easy concept - Not a bit of it; it can cause a lot of subtle problems in database designs. Joe Celko goes over the ground of unique keys, primary Keys, foreign keys and constraints. Read more...

Constraint Yourself!

In his first article for Simple-Talk, Joe Celko demystifies the use of Constraints, and points out that they are an intrinsic part of SQL and are a great way of ensuring that a business rule is done one way, one place, one time. Almost all database programmers will find something new and useful in this article. Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.