15 April 2008

SQL String User Function Workbench: part 1

Robyn and Phil go back to basics and hammer out some basic String-handling User Functions in TSQL, based on Python examples. Plenty of sample code, and TSQL programming tricks.

In this workbench, we’ll show you some fairly simple string User-Functions. Rather than invent the ones we like, we’re going to take a different angle and implement the Python string functions, as much as we can. Plenty of examples, and some programming tricks too!

Contents

  1. Capitalize
  2. Center
  3. Count Substring in String
  4. EndsWith
  5. ExpandTabs
  6. IsAlnum
  7. IsAlpha
  8. IsDigit
  9. IsLower
  10. IsTitle
  11. IsSpace
  12. LJust
  13. LStrip
  14. RFind
  15. RJust
  16. RStrip
  17. Strip
  18. SwapCase
  19. Title
  20. Zfill

Sometimes, when developing a SQL Server application, you start to want to do some string manipulation. For the beginner, the SQL String functions seem pretty unpreposessing – there seems little there in the same league as what’s available in Python.

The difference is more one of style. The basic TSQL functions have great power but it isn’t always obvious, from looking at the code, what they are doing. Nobody would attempt to argue that the famous STUFF function is intuitive! (Robyn documented the basic string functions in her Robyn Page’s SQL Server String Manipulation Workbench.)

When setting out a database project, it is always best to start out with a basic toolkit of elementary string user-functions that make your code readable. It is only when you hit a particular performance problem that you’d need to use the built-in functions rather than your own. In the heat of a team development, things generally seem to go better if the basic string user-functions are there, and ready to use.

For no particularly good reason, we like to use the PHP and Python string functions, adapted for SQL Server use. We’ve already described some of the routines we borrowed from PHP in The TSQL String Array Workbench.

The Python ones we use are…

Capitalize string Function

Return a copy of the string with only its first character capitalized.

Centre string Function

Returns a copy of @String centered in a string of length @width, surrounded by the appropriate number of @fillChar characters

Count substring in string Function

Returns the number of occurrences of substring sub in string s. Allows you to specifying the start and end position of the search.

EndsWith string Function

Return non-zero if the string ends with the specified suffix, otherwise return False. Suffix can also be a list of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.

Expand Tabs in a string

Returns a copy of @String where all tab characters are expanded using spaces.

IsAlNum string Function

Returns Non-Zero if all characters in @String are alphanumeric, 0 otherwise.

IsAlpha string Function

Returns non-zero if all characters in @String are alphabetic, 0 otherwise.

IsDigit string Function

Returns non-zero if all characters in @string are digit (numeric) characters, 0 otherwise.

IsLower string Function

Returns non-zero if all characters in s are lowercase characters, 0 otherwise.

IsTitle string Function

Return true if the string is a titlecased string and there is at least one character, for example uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return false otherwise.

IsSpace string Function

Returns non-zero if all characters in s are whitespace characters, 0 otherwise.

LJust -Left justify string Function

Returns a copy of @String left justified in a string of length width. Padding is done using the specified fillchar string (default is a space). The original string is returned if width is less than len(s).

LStrip– remove leading characters from a string

Return a copy of the string with leading characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped:

rfind– Find highest index of Substring

Return the highest index in the string where substring sub is found, such that sub is contained within s[start,end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.

RJust -Right justify string Function

Returns a copy of @String right justified in a string of length width. Padding is done using the specified fillchar string (default is a space). The original string is returned if width is less than len(s).

RStrip: remove trailing characters from a string

Return a copy of the string with trailing characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a suffix; rather, all combinations of its values are stripped:

Strip: remove trailing or leading characters from a string

Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped:

SwapCase string Function

Return a copy of the string with uppercase characters converted to lowercase and vice versa.

Title string Function

Returns a titlecased copy of @String, i.e. words start with uppercase characters, all remaining cased characters are lowercase.

zfill: left-fill the numeric string with zeros

Return the numeric string left-filled with zeros in a string of length width. The original string is returned if width is less than len(s).

All finished? Well, no, actually. We’ve now reached the point where we can tie in the String Array work we did in the TSQL String Array Workbench and implement the handful of Python functions that use tuples and lists. However, that’s certainly enough for one sitting, and the more complicated functions will take some explaining, so we’ll meet again for Episode 2 of the String User Function Workbench.

Keep up to date with Simple-Talk

For more articles like this delivered fortnightly, sign up to the Simple-Talk newsletter

Downloads

This post has been viewed 65969 times – thanks for reading.

Tags: , , , , ,

  • Rate
    [Total: 62    Average: 4.4/5]
  • Share

Robyn Page and Phil Factor

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 20 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

View all articles by Robyn Page and Phil Factor

  • GSquared

    Interesting Workbench
    I’ve implemented a few similar functions, such as Name Case functionality, but nowhere near so many as this.

  • Phil Factor

    Re: Interesting Workbench
    Strangely, there are a few of these I don’t like, but Robyn drove me onwards through the list. To do Title Case, I rather like my own one…
    (The Python one is silly because it uppercases the first letter of Of, The And etc.)

  • Ryan Randall

    _123 TSQL functions
    This is very old but still useful – and since it’s related, I thought I’d post it.

    http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=502
    “a collection of 123 TSQL functions”

  • Phil Factor

    Re: 123 TSQL functions
    You’re right. It looks good. At a quick glance, I don’t think there are that many duplicates!
    http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=502&lngWId=5

  • puzsol

    Capitalise function
    Is there a reason why America seems to like to put a ‘z’ instead of an ‘s’ in so many words (Capitalize being one of them), rather than sticking to the Queen’s English?

    Now it seems that you have also decided to replace ‘x’ with ‘z’ (see the lower-case alphabet in the PATINDEX line in both the Title and Capitalize functions)

    lol (-8

  • Josh

    SwapCase function
    The code looks goofy.

  • Josh

    Lstrip function
    I just noticed and iif in Lstrip. Even with the bugs, this is my favorite simple-talk column.

  • Anonymous

    Capitalize function useless
    It doesn’t matter what character the strings starts with, just make it Upper! If it’s digit it won’t have any effect, so you don’t need all that fuss that will kill the performance, simply use SQL internal function to do that.
    ;o)

  • Robyn Page

    Re: mistake in Capitalize and phantom IIF
    OOps! I blame Phil! We decided to fix the zxz typo in the article and source file, as we didn’t want that typo to propagate!. The IIF was a burp in Phil’s prettifier.(the source was OK!) We’re not sure what to do about the goofiness of the SwapCase function. Neither of I could think of a real use for it (we thought we ought to put it in just for the sake of completeness), and so we couldn’t resist using a bitwise exclusive or in a function just so we could say we’d actually found a purpose for using it in SQL (Phil once managed to create bitmaps in an IMAGE, but he is now fully recovered.)

  • Andy

    Alternative zFill function
    IF OBJECT_ID(N’zfill’) IS NOT NULL
    DROP FUNCTION zfill
    GO
    CREATE FUNCTION dbo.zfill
    (
    @String VARCHAR(8000),
    @Width VARCHAR(255) = ‘ ‘
    )
    RETURNS VARCHAR(8000)
    AS BEGIN
    RETURN Right(Replicate(‘0’,@Width)+@String,@Width)
    END
    GO
    SELECT dbo.zFill(‘789’, 10)

  • Andy

    Centre
    That Centre function took me right back to when I was programming on the Zx Spectrum.

  • Phil Factor

    Re: Center
    Yeah. Me too. Funnily, I had to use one for an email report the other day that was text based. The user wanted the title to be centered. Shame I threw away the speccie. (a cheap Z80-based british computer of the 1980s that was reputed by its creator to be able to control a power station.)
    Presumably, now one should nowadays just do
    <div style=”text-align:center”><h1>MyTitle</h1></div>

  • Josh

    By Goofy I mean..
    There is an extra “co” in the middle, right after the between.

  • Phil Factor

    Re: Goofy
    it should read …
    IF @ThisChar BETWEEN ‘a’ AND ‘Z’ COLLATE Latin1_General_CS_AI
    … It was OK in the script but not in the article. We thought we ought to fix it so it isn’t there any more now.

    Thanks a lot for spotting that, Josh. As you could imagine, this article was a bit complex to put together! The Prettifier seems to be having the day off!

  • ALZDBA

    Nice overview
    Nice overview 😉

    apparently the link for the script download does not work.

  • Andrew Clarke

    re: nice overview
    ..does now! Mea Culpa

  • neda

    tnx
    tanks,for your atention.
    and tanks because of sending this good article for me.

  • Anny mouse

    Numeric vs string
    What defines alfanumeric ? Is ’01’ a number or string ?
    Is ’01’ the same as 1 ?
    In this case yes.
    In this case +1 is not a digit ? Doh.
    What about other characters : #@%*[}…..

  • Anonymous

    Great articale
    Found the article really good, nice to see how other people do there capitalize and title case functions.

    Thank you.

    To puzsol you might what to check the queens english out http://www.askoxford.com/asktheexperts/faq/aboutspelling/ize?view=uk

  • Derek

    New alphabet? (PATINDEX string again)
    Insisting that ‘q’ is always followed by ‘u’, even in an alphabetic list of lower case letters is a bit much, although having 2 ‘u’s obviously doesn’t affect anything.

  • A Fan

    Great work
    Hello. You are smart and smoking hot. A very rare combination.

    Thanks for all of your articles.

  • Anonymous

    Lost it
    I lost it somewhere after god and majesty in the script.

  • Robyn Page

    re: Lost it
    I suppose you think Irony is something that an axey is made from.

  • puzsol

    Corrections
    Thankyou to the anonymous person who sorted out the ize argument… though from an international point of view I still find it annoying….

    Now how about the Aluminum / Aluminium mixup?

  • Matjaz Justin

    Nice overview, a lot of useful things on one place
    Function dbo.Count needs a little modification.

    SELECT dbo.COUNT(‘yyy’, ‘y’, null, null)
    — 2 ??

  • Craig Hathaway

    Function dbo.Count needs a little more modification.
    SELECT dbo.Count(‘if something”s worth doing, it is worth doing badly’, ‘worth doing’, 17, 46)
    — 1?