Doing Fuzzy Searches in SQL Server

A series of arguments with developers who insist that fuzzy searches or spell-checking be done within the application rather then a relational database inspired Phil Factor to show how it is done. When the database must find relevant material from search terms entered by users, the database must learn to expect, and deal with, both expected and unexpected … Read more

String Comparisons in SQL: The Metaphone Algorithm

When exploring the use of the Metaphone algorithm for fuzzy search, Phil couldn't find a SQL version of the algorithm so he wrote one. The Metaphone algorithm is built in to PHP, and is widely used for string searches where you aren't always likely to get exact matches, such as ancestral research and historical documents. It is particularly useful when comparing strings word-by-word. With a SQL version, it is easy to experiment on large quantities of data!… Read more

Handling Graphs in SQL

Many practical database problems can be tackled more simply and intuitively by graphs or networks, which in this sense are graphs in which attributes can be associated with the nodes and edges. It is a natural way to study relationships within the data. SQL databases aren't the easiest way of doing it, but it makes sense where the scale permits it. Because of the range of graphs and techniques, some Graph theory is unavoidable before you get stuck into the code, and who better to introduce graph databases than Joe Celko? … Read more

Lists With, or Without, Ranges in both T-SQL and PowerShell

Whether you are working in a procedural language like PowerShell or in T-SQL, there is something slightly bothersome about having to deal with parameters that are lists, or worse with ranges amongst the values. In fact, once you have a way of dealing with them, they can be convenient, especially when bridging the gulf between application and the database. Phil Factor shows how to deal with them.… Read more

Triggers: Threat or Menace?

Triggers are generally over-used in SQL Server. They are only rarely necessary, can cause performance issues, and are tricky to maintain If you use them, it is best to keep them simple, and have only one operation per trigger. Joe Celko describes a feature of SQL that 'gets complicated fast'.… Read more

Who the Devil Wrote This SQL Code?

The way that you format T-SQL code can affect the productivity of the people who have to subsequently maintain your work. It is never a good experience to see SQL Code, cry out “Who the devil wrote this code?”, and then realise that it was you. Grant gives some examples of bad formatting and explains why you should never check-in badly-formatted SQL code.… Read more

DELETE Operation in SQL Server HEAPs

You should stick to using tables in SQL Server, rather than heaps that have no clustered index, unless you have well-considered reasons to choose heaps. However, there are uses for heaps in special circumstances, and it is useful to know what these uses are, and when you should avoid heaps . Uwe Ricken explains, and demonstrates why you'd be unwise to use heaps rather than tables when the data is liable to change.… Read more

The SQL of Textonyms

The task of finding textonyms in SQL involves importing a list of common words and doing transformations on every word to convert it into what you'd need to type into the numeric keypad of your mobile phone to get that word. It's not that hard to do, but what is the quickest and most efficient way of doing it? Phil Factor investigates.… Read more

Predicates With Subqueries

The ALL, SOME and ANY predicates aren't much used in SQL Server, but they are there. You can use the Exists() predicate instead but the logic is more contorted and difficult to read at a glance. Set-oriented predicates can greatly simplify the answering of many real-life business questions, so it is worth getting familiar with them. Joe Celko explains.… Read more

Formatting SQL Code – Part the Second

When you're formatting SQL Code, your objective is to make the code as easy to read with understanding as is possible, in a way that errors stand out. The extra time it takes to write code in an accessible way is far less than the time saved by the poor soul in the future, possibly yourself, when maintaining or enhancing the code. There isn't a single 'best practice, but the general principles, such as being consistent, are well-established. Joe Celko gives his take on a controversial topic.… Read more

SQL Server Metadata Functions: The Basics

To be able to make full use of the system catalog to find out more about a database, you need to be familiar with the metadata functions. They save a great deal of time and typing when querying the metadata. Once you get the hang of these functions, the system catalog suddenly seems simple to use, as Robert Sheldon demonstrates in this article.… Read more


When your SQL Server database is set to have its statistics automatically updated, you will probably conclude that, whenever the distribution statistics are out-of-date, they will be updated before the next query is executed against that index or table. Curiously, this isn't always the case. What actually happens is that the statistics only gets updated if needed by the query optimiser to determine an effective query plan.… Read more

The Luhn Algorithm in SQL

The Luhn test is used by most credit card companies to check the basic validity of a credit card number. It is not an anti-fraud measure but a quick check on data corruption. It still allows any digits that are odd or even to be switched in the sequence. Most credit cards are compatible with Luhn algorithm. It is often applied to SSNs, company organization numbers, and OCR numbers for internet payments. How do you handle them in SQL?… Read more


It sometimes pays to go back and look at what you think you already know about SQL. Joe Celko gives a quick revision of the GROUP BY and HAVING clauses in SQL that are the bedrock of any sort of analysis of data, and comes up with some nuggets that may not be entirely obvious… Read more