12 May 2011

PATINDEX Workbench

The PATINDEX function of SQL Server packs powerful magic, but it is easy to get it wrong. Phil Factor returns to the Workbench format to give a tutorial of examples, samples and cookbook ideas to demonstrate the ways that this underrated function can be of practical use. It is intended to be pasted into SSMS and used as a basis for experiment

The PATINDEX function is a treasure, and it is the obvious way in SQL Server to locate text in strings. It uses the same wildcard pattern as the LIKE operator, works as fast, but gives back more information. It is closest in functionality to CHARINDEX. People worry that it does not allow a full RegEx, but that is missing the point. It will do useful string searching, as I’ll show you, and it is optimized for performance in a database. A RegEx search performs a different service and is slower at doing those routine jobs that PATINDEX is intended for.

It is great for quick checks on your data. Here, we do an elementary check on a stored IP address.

So here, the % ‘wildcard’ character has a special meaning, which is ‘any number of any character’. BOL puts this as ‘Any string of zero or more characters.’ For some reason, most developers seem to think that there is a rule that you can only use them at the start and/or end of a pattern. Not true. You can use them anywhere, as many as you like, but not adjacent as all but the first are ignored. The meaning ‘One of any character’ is denoted by the underscore wildcard character, ‘_’. So what does the IP check we have shown you comprise? Nothing more than counting that there are three dots in the string. We can do more but it gets rather more complicated.

Note that, because we are just testing for existence, and aren’t going to extract it, we could also use LIKE

Here are a few other more practical examples of using the ‘%’ wildcard.

See what we’ve done? We’ve added a default at the end so that we don’t have to cope with passing back a 0 from the PATINDEX when it hits a string without the correctly formatted number in it. We look for the transition between a character that isn’t numeric, to one that is. Then we look for three valid consecutive numbers followed by a character that isn’t a number.

The same technique can be used where you want to trim off whitespace before or after a string. You might think that RTRIM and LTRIM do this but they are slightly broken, in that they only trim off the space character. What about linebreaks or tabs?

All we’re doing here is defining the range of valid character within a number, anything from 0 to 9, or a dot in this instance, and looking for the transitions between ‘number and ‘not-number’.

Let’s try something a bit trickier, and closer to a real chore. Let’s find a UK postcode. (apologies to all other nations who are reading this)

The validation rules are that the length must be between 6 and 8 characters of which one is a space. This divides the three-character local code to the right of the space from the sorting-office code to the left of the space. The local characters are always a numeric character followed by two alphabetic characters. The Sorting Office code the left of the gap, can be between 2 and 4 characters and the first character must be alpha.

Before you get too excited, I must point out the the postcode validation is more complex. We can’t use it because PATINDEX uses only wildcards and hasn’t the OR expression or the iterators. We can do quite well though..

What if you wanted to do the more common chore of extracting the postcode from an address-line and putting it in its own field? This is where you have to stop using LIKE as it won’t cut the mustard. If you are of a nervous disposition in your SQL-writing please turn away now.

…and then we’ll populate it with 100,000 rows via SQL Data Generator (we’ll use a RegEx to fill the address column in).


Hopefully, I’ll remember to put it in the downloads at the bottom of the article for the other SQLDG freaks. Then we are going to pull out the postcode information, place the modified address without the postcode in a second column, and put the extracted postcode into its own column so we can subsequently do lightning searches based on postcode. This whole messy process runs in five seconds on my test machine. If you did a neat cursor-based process, it would take minutes.


So here’s a puzzle to end off with. You have a field with an email address somewhere in it, and you need to extract it. Here’s one way of pulling it out. It looks a bit complicated, but it is fast.

What are we doing here? The principle is simple. We look for the embedded ‘@’ sign, and then run a check forwards to get the end of the string containing the ‘@’ character. Then we reverse the start of the string and look for the beginning. When we have these, it is just a simple matter of assembling the email address. The SQL looks laborious, but looks can deceive, so it always pays to test it out. Let’s test it out by recreating our addresses table, and stocking the address column with an additional email record.

So there we have it. In summary, when using PATINDEX:

  • Specify the collation if you use character ranges
  • If a problem seems tricky, see if you can detect transitions between character types
  • Use the angle brackets to ‘escape’ wildcard characters.
  • Think of the % wildcard as meaning ‘any number of any character’, or ‘Any string of zero or more characters.’
  • Remember that you can specify the wildcard parameter as a column as well as a literal or variable.
  • You can use cross joins to do multiple searches to simulate the OR condition of RegExes.
  • Experiment wildly when you get a spare minute. Occasionally, you’ll be surprised.

For more articles like this, sign up to the fortnightly Simple-Talk newsletter.


Tags: , , ,


  • Rate
    [Total: 38    Average: 4.6/5]

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career. See also :

Follow on

View all articles by Phil Factor

  • afonso freitas – afofreitas@yahoo.com.br

    did not work
    I tried this example but it did not work, this theory does not work in practice

  • afonso freitas – afofreitas@yahoo.com.br


  • Anonymous

    /*Select all objects (not tables!) in the current database which have the following three words FROM WHERE and ORDER in that order with gaps in between*/

    SELECT name FROM sys.Objects

    WHERE PATINDEX (‘%FROM%WHERE%AND%’,object_definition(object_ID))>0;

    Will also find the string ‘FROMWHEREAND’. Great article.

  • needa_flats

    about the article
    did not work to me either…


  • mjbmjbmjbmjb

    Expected 3, got a 4
    For me, this (below) is returning 3 and I was expecting, probably wrongly, 4…

    SELECT PATINDEX (‘%[0-9].%[0-9].%[0-9].%’,’′);–returns with the index of the first dot or 0 if there aren’t three in the string

  • Phil Factor

    Re: Expecting 3, got a 4
    It is telling you the index where it got the match. This will have been the number before the dot rather than the dot. I’ve amended the article as I rashly copied the previous comment!

  • Phil Factor

    Re: ‘did not work for me either’
    I contacted the first person who said it didn’t work. He then managed to get it working with some help. If you can tell me what the problem is I’ll do my best to fix it. Please remember that PatIndex doesn’t work with TEXT datatype. I should have mentioned that!

  • Phil Factor

    Re: FromWhereAnd
    You’re right. I’ve update the comments!

  • Lyndon

    I’ve just arranged a meeting for tomorrow at CB4 0WE. It must be very close.

  • mctillett

    That’s better, I had been signed out when I posted anonymously earlier.

    As an alternative to changing the comments Phil, how about using the pattern ‘%FROM_%WHERE_%AND%’ 😀

    • Phil Factor

      Re: FromWhereAnd
      or you could specify that there must be a whitespace character after each word too!.

  • Hugh Yeman

    One question…
    Thank you for this. It’s helpful to me as I learn T-SQL. One thing I haven’t been able to puzzle out: what is the purpose of concatenating a pipe symbol onto a string before calling the LEN function on it? You do things like the following several times.


    There must be an obvious reason, but I haven’t been able to suss it. @string is already declared as a VARCHAR so it can’t be to guarantee type.

  • caruncles

    selecting text between two letters or numbers
    I guess I’m a simpleton but didn’t see the answer to my question on your page. Is it possible to select just the text between two letters or numbers? for example, everthing between ‘{‘ and ‘}’, without selecting the braces.

  • sqlyoginyc

    To get the TRIM function to work add a SELECT
    SELECT dbo.Trim(‘ 678ABC ‘)
    SELECT dbo.Trim(‘ This has leading and trailing spaces ‘)
    SELECT dbo.Trim(‘ left-Trim This’)
    SELECT dbo.Trim(‘Right-Trim This ‘)