Click here to monitor SSC

An argument with SQL Server

Last post 04-22-2009, 9:24 AM by kisifi. 13 replies.
Sort Posts: Previous Next
  •  08-10-2006, 11:11 AM Post number 1595

    • Nige is not online. Last active: 11-07-2006, 12:59 PM Nige
    • Top 150 Contributor
    • Joined on 07-14-2006
    • Level 1: Deep thought

    An argument with SQL Server

    Can anyone explain why this happens? It doesn't seem rational to me, and has caused havoc in a complicated function that looks at web-logs

    Select len ('This is a long string with blanks on the end                        ')
    --44
    Select len ('This is a long string without blanks on the end')
    --47

    Basically, SQL Server thinks the first string is shorter than the second one. whereas I dont think so at all. C'mon guys, back me up!

  •  08-10-2006, 12:09 PM Post number 1598 in reply to post number 1595

    Re: An argument with SQL Server

    Read the documentation:

    http://msdn2.microsoft.com/en-us/library/ms190329.aspx

    "Returns the number of characters of the specified string expression, excluding trailing blanks."

    Looking for an advanced SQL Server 2005 book?

    Expert SQL Server 2005 Development
  •  08-10-2006, 1:25 PM Post number 1600 in reply to post number 1598

    • Nige is not online. Last active: 11-07-2006, 12:59 PM Nige
    • Top 150 Contributor
    • Joined on 07-14-2006
    • Level 1: Deep thought

    Re: An argument with SQL Server

    Why should I need to 'read the documentation' when I know from Visual Basic, VBA, and VB scripting edition, Microsoft's core languages, how LEN works? They should have called it CHAR_LENGTH maybe, after the Sybase function that showed the same behaviour, and did a LEN that works the same way as other Microsoft products
  •  08-10-2006, 3:45 PM Post number 1606 in reply to post number 1600

    Re: An argument with SQL Server

    SQL is not Visual Basic.



    Looking for an advanced SQL Server 2005 book?

    Expert SQL Server 2005 Development
  •  08-11-2006, 3:11 AM Post number 1608 in reply to post number 1595

    Re: An argument with SQL Server

    To go back to Nige's original request for an explanation......

    This was an attempt by the SQL Server team to keep the behaviour of tests for string equality, and string comparisons, consistent. The people who then implemented unicode then muddied the water horribly, as you'll see in these tests. Before unicode, the '=', LEN and like all reated trailing spaces in a string as if they didn't exist, simply because that was the way that they treated the older CHAR, where all strings were padded with blanks. The introduction if unicode has meant that  LIKE works in a different way with unicode strings, leaving the behavious of LEN inconsistent. Still awake? Look at this and run it. You'll see the issues

    DECLARE @testData TABLE (MyID INT IDENTITY(1,1), TestString VARCHAR(100))
    

    INSERT INTO @testdata (TestString
           
    VALUES ('This is a string                          ')
    INSERT INTO @testdata (TestString
           
    VALUES ('This is a string                  ')
    INSERT INTO @testdata (TestString
           
    VALUES ('This is a string          ')
    INSERT INTO @testdata (TestString
           
    VALUES ('This is a string')

    PRINT 'check the way the string equality operator works'        
    SELECT FROM @TestData WHERE teststring 'This is a string'

    PRINT 'check the way the ''Like'' operator works'       
    SELECT FROM @TestData WHERE teststring LIKE 'This is a string'

    PRINT 'check the way the ''Like'' operator works with unicode SQL-92 standard'  
    SELECT FROM @TestData 
           
    WHERE CONVERT(NVARCHAR(100),teststring) LIKE 'This is a string'


    PRINT 'check the way the ''Like'' operator works with wildcards'        
    SELECT FROM @TestData WHERE teststring LIKE '%This is a string          %'

    PRINT 'does the equality operator work consistently?'   
    SELECT CASE WHEN teststring 'This is a string'
           
    THEN ' matched' ELSE ' didn''t match' END
                   FROM 
    @TestData

    PRINT 'Show the original problem with LEN'
    SELECT LEN(TestStringFROM @TestData

    PRINT 'Show the original problem with LEN, using unicode'
    SELECT LEN(CONVERT(NVARCHAR(100),teststring)) FROM @TestData

    To quote from BOL....
    'LIKE supports ASCII pattern matching and Unicode pattern matching. When all arguments (match_expression, pattern, and escape_character, if present) are ASCII character data types, ASCII pattern matching is performed. If any of the arguments are of Unicode data type, all arguments are converted to Unicode and Unicode pattern matching is performed. When you use Unicode data (nchar or nvarchar data types) with LIKE, trailing blanks are significant; however, for non-Unicode data, trailing blanks are not significant. Unicode LIKE is compatible with the SQL-92 standard. ASCII LIKE is compatible with earlier versions of SQL Server.'

    And the work-round for LEN?

    PRINT 'and the fix if you want the real length of the string'
    
    SELECT LEN(REPLACE(TestString,' ','|')) FROM @TestData
  •  08-11-2006, 9:55 AM Post number 1619 in reply to post number 1608

    Re: An argument with SQL Server

    Why not just use DATALENGTH (or DATALENGTH/2 for Unicode)?

    Personally, I like the trimming, in most cases.  The only time I've ever had a problem with it was when working on TSQLMacro, because as I chunk up routines there might be some trailing spaces that are needed to keep formatting intact.  But that's pretty much an edge case, IMO...

     


    Looking for an advanced SQL Server 2005 book?

    Expert SQL Server 2005 Development
  •  08-11-2006, 10:07 AM Post number 1621 in reply to post number 1619

    Re: An argument with SQL Server

    With DATALENGTH, you have to remember whether your string is unicode or not, and divide by two accordingly, whereas the LEN 'Replace' trick works consistently for all VARCHARs. Mind you, there is still the problem of CHARs where LEN does the right thing without the 'Replace' trick. Ah well, whatever you do you need to keep your wits about you, and RTFM.

  •  08-11-2006, 10:12 AM Post number 1623 in reply to post number 1621

    Re: An argument with SQL Server

    Yes, you do need to remember to do the right thing in the right place... But it's not like variables or columns in T-SQL are dynamically typed, so it's pretty easy to go find out what the type is and do the right thing.

    And, yes, RTFM (which I was too nice to say at first ;))

     


    Looking for an advanced SQL Server 2005 book?

    Expert SQL Server 2005 Development
  •  08-11-2006, 11:35 AM Post number 1624 in reply to post number 1608

    • Nige is not online. Last active: 11-07-2006, 12:59 PM Nige
    • Top 150 Contributor
    • Joined on 07-14-2006
    • Level 1: Deep thought

    Re: An argument with SQL Server

    Thanks,

    Now you've explained how it all came about, it all makes perfect sense, up to the point that LIKE works differently for Unicode data. Talk about turning a bug into a feature! The trouble about the FM is that it is too F large for the average brain, otherwise I'd have R it. Actually, Phil's explanation isn't in it as far as I can find, so cheers!

  •  08-15-2006, 8:19 PM Post number 1671 in reply to post number 1624

    • drsql is not online. Last active: 2012-05-22, 9:12 AM drsql
    • Top 50 Contributor
    • Joined on 06-28-2006
    • Blue Gene

    Re: An argument with SQL Server

    I cannot disagree with you about the manual being too darn big.  I remember back in the early days reading the entire books online.  No way would I do that now.  I would also warn that:

    "Why should I need to 'read the documentation' when I know from Visual Basic, VBA, and VB scripting edition, Microsoft's core languages, how LEN works?"

    This is a horrible thing to think.  SQL Server has grown up on a different path, and there is little similarity in the functions. I personally hit the key many times a day looking up the syntax for commands :)

    "They should have called it CHAR_LENGTH maybe, after the Sybase function that showed the same behaviour, and did a LEN that works the same way as other Microsoft products "

    Actually it would have been nicer if they have followed all of the standards for SQL based products.

    Another "trick" that I would use would be to add two values to the string and use len:

    declare @charCol nvarchar(100)
    set @charCol = '       '
    select len('*' + @charCol + '*') - 2

    I

  •  09-29-2006, 2:35 PM Post number 2264 in reply to post number 1671

    Re: An argument with SQL Server

    Of course, you can just use the SQL-92 standard character string function "CHAR_LENGTH" or "CHARACTER_LENGTH", which is defined to "return the total length of your character string argument including any trailing (or leading) spaces". 

    OOPS! I forgot... MS SQL Server does not support this standard function. (Why not?)

    Implementing CHAR_LENGTH like Sybase (presumably dropping trailing spaces) would be an even worse implementation, since it would be using a SQL-92 standard function name with an incompatible implementation.

    It's true we should all RTFM, but it would be nice if DBMS developers also RTFM'd (the SQL-92 standard).

     

  •  09-30-2006, 3:08 AM Post number 2268 in reply to post number 2264

    • Nige is not online. Last active: 11-07-2006, 12:59 PM Nige
    • Top 150 Contributor
    • Joined on 07-14-2006
    • Level 1: Deep thought

    Re: An argument with SQL Server

    Thanks a lot for that.
    I suspect I'm not the only one who's confused. My point about the silliness of LEN having a different behaviour in SQL Server to the rest of the Microsoft product range is strengthened by carofers point that it would then even conform to the SQL-92 standard for CHAR_LENGTH. Why call it LEN? Once they decided that the new length function had to conform with the existing SYBASE behavior of ignoring trailing spaces, why didn't they call it something else?
  •  11-05-2008, 11:49 AM Post number 70357 in reply to post number 1595

    Re: An argument with SQL Server

    Try this:

    Select DATALENGTH('This is a long string with blanks on the end                        ')

    Select DATALENGTH('This is a long string without blanks on the end')


  •  04-22-2009, 9:24 AM Post number 73224 in reply to post number 1595

    Re: An argument with SQL Server

    Nige:


    Select len ('This is a long string with blanks on the end                        ')
    --44
    Select len ('This is a long string without blanks on the end')
    --47

    Basically, SQL Server thinks the first string is shorter than the second one.


    Amazing. It depends on the language. In french  the second string is shorter :
    -- 50
    Select len ('Ceci est une long chaine avec des espaces à la fin                       ')

    Select len ('Ceci est une long chaine sans espaces à la fin')
    --46


View as RSS news feed in XML