Forums (RSS 2.0)" href="http://www.simple-talk.com/community/forums/rss.aspx?ForumID=-1&Mode=0" />
Click here to monitor SSC

Writing Efficient SQL: Set-Based Speed Phreakery

Last post 02-03-2011, 10:36 PM by hellobenchen. 47 replies.
Page 2 of 4 (48 items)   < Previous 1 2 3 4 Next >
Sort Posts: Previous Next
  •  02-08-2010, 7:24 AM Post number 89708 in reply to post number 89496

    download is not case sensitive compatible

    Why are scripts never written for case sensitive databases? Commence find and replace on the download.....
  •  02-08-2010, 8:39 AM Post number 89713 in reply to post number 89496

    case sensitive scripts

    this is probably not done because a) I would venture that a very small fraction of installed databases are case sensitive and b) even if you have a server that is, can't you build a database with a case insensitive collation for testing?
  •  02-08-2010, 9:05 AM Post number 89714 in reply to post number 89496

    Quirky

    Tony and I debated on whether we should use the term "quirky" or not but went with it in the end. Once you get the data aggregated down to just a handfull of rows, it doesn't make as much difference how you do the last step. The real beauty of the winning entry is the "one pass" pre-aggregation.
  •  02-08-2010, 2:14 PM Post number 89734 in reply to post number 89496

    • MrDee is not online. Last active: 26 Jul 2010, 5:27 PM MrDee
    • Top 500 Contributor
    • Joined on 01-15-2009
    • Auckland, NZ
    • Level 1: Deep thought

    Brilliant

    Excellent solution explained extremely well.
    I will be recommending this article to my fellow workers.
  •  02-08-2010, 2:15 PM Post number 89735 in reply to post number 89496

    • Peso is not online. Last active: 2011-09-29, 8:46 AM Peso
    • Top 500 Contributor
    • Joined on 09-19-2009
    • Level 1: Deep thought

    Breaking issues

    I am happy you mentioned the discussion here http://www.sqlservercentral.com/Forums/Topic802558-203-3.aspx, TheSQLGuru. As written by Hugo, the ordered update can only work if a) or b) is satisfied. If you look at the execution plan for the second query of my solution, you can see that a) is satisfied.
    Also, according to http://www.sqlservercentral.com/Forums/Topic802558-203-11.aspx, my solution has no WHERE clause.
  •  02-08-2010, 2:45 PM Post number 89737 in reply to post number 89496

    Breaking Issues

    Still a bad idea:

    1) You cannot guarantee that you (nor some other machine with different settings, configuration, data, etc) always get the type of plan you currently get

    2) There are other things than what Hugo posted that can break it

    3) Microsoft states the order isn't guaranteed

    4) If you address and/or caveat all the known points that currently can break it that still leaves someone finding yet another scenario where it breaks and people not getting the new rule.

    Besides, people will take this type of logic and use it where it is GUARANTEED to give them the WRONG ANSWERS but they won't know it.

    Signing off from this one since we appear to have to agree to disagree. Sorry, but I must state without reservation that it is inappropriate and I believe irresponsible to promulgate this type of processing at this time.
  •  02-08-2010, 3:36 PM Post number 89742 in reply to post number 89496

    Took only 16 Ms

    select * from Registrations

    declare @starTime datetime
    declare @minDate datetime
    declare @runTotal int
    select @runTotal = 0
    select @starTime = GETDATE()

    select @minDate = MIN(DateJoined)from Registrations

    SELECT
    DATEDIFF(MM,@minDate,Datejoined) as s1,
    COUNT(*) as counts,
    @runTotal as runningTotal
    INTO #1
    FROM
    Registrations
    GROUP BY DATEDIFF(MM,@minDate,Datejoined)
    order by DATEDIFF(MM,@minDate,Datejoined) asc

    select
    DATEDIFF(MM,@minDate,DateLeft) as s1,
    COUNT(*) as counts,
    @runTotal as runningTotal
    INTO #2
    FROM Registrations
    where DateLeft is not null
    GROUP BY DATEDIFF(MM,@minDate,DateLeft)
    order by DATEDIFF(MM,@minDate,DateLeft) asc



    update #1
    set @runTotal= runningTotal = @runTotal + c.counts - ISNULL(#2.counts,0)
    from #1 as c
    left outer join #2 on c.s1 = #2.s1



    SELECT
    DATEADD(MM,#1.s1,@minDate) as Monthly ,
    #1.counts as NewJoins ,
    #2.counts as LeftUs,
    #1.runningTotal
    FROM
    #1
    LEFT JOIN #2 on #1.s1 = #2.s1

    order by runningTotal Asc

    select DATEDIFF(mcs,@starTime,getdate())
  •  02-08-2010, 5:06 PM Post number 89746 in reply to post number 89496

    what if using a CTE for #stage?

    Thanks for the article, definitely worth reading.

    I would appreciate if somebody explain what happens if we use a CTE instead of the #Stage table in the solution "Peso 4e"? Will it be possible at all and how performance will be affected?

    Just curicity from my side, and general perseption that keeping everything in memory (avoiding writing to the tempdb) is beneficial.

    Appreciate your thoughts.
  •  02-08-2010, 5:53 PM Post number 89754 in reply to post number 89496

    • puzsol is not online. Last active: 05 Mar 2012, 6:20 PM puzsol
    • Top 75 Contributor
    • Joined on 11-01-2007
    • Melbourne Victoria
    • Level 1: Deep thought

    Conditions for Quirky Update to fail

    It's not just that the CTE has too few rows to use a parallel operation... in my experience (at least on SQL 2005) table variables (and by extension CTEs?) will never use parallel operations. Spent a bit of time converting all table variables to temporary tables for just that reason!

    If my assertion is still correct (and I suspect it is, and will remain so), then all three criteria are met by using a CTE/table variable:
    - it is ordered
    - it will not parellelise itself (no matter how big)
    - it will not partition.

    So it won't break in this use.... just because someone could misuse a technique doesn't mean you shouldn't use it. Hey I've seen insert/update triggers written as if they were for only one row... so should we stop using triggers altogether or is that a different discussion?
  •  02-08-2010, 5:57 PM Post number 89755 in reply to post number 89496

    what if using a CTE for #stage?

    Thanks for the article, definitely worth reading.

    I would appreciate if somebody explain what happens if we use a CTE instead of the #Stage table in the solution "Peso 4e"? Will it be possible at all and how performance will be affected?

    Just curicity from my side, and general perseption that keeping everything in memory (avoiding writing to the tempdb) is beneficial.

    Appreciate your thoughts.
  •  02-08-2010, 6:13 PM Post number 89758 in reply to post number 89496

    • puzsol is not online. Last active: 05 Mar 2012, 6:20 PM puzsol
    • Top 75 Contributor
    • Joined on 11-01-2007
    • Melbourne Victoria
    • Level 1: Deep thought

    Conditions for Quirky Update to fail

    It's not just that the CTE has too few rows to use a parallel operation... in my experience (at least on SQL 2005) table variables (and by extension CTEs?) will never use parallel operations. Spent a bit of time converting all table variables to temporary tables for just that reason!

    If my assertion is still correct (and I suspect it is, and will remain so), then all three criteria are met by using a CTE/table variable:
    - it is ordered
    - it will not parellelise itself (no matter how big)
    - it will not partition.

    So it won't break in this use.... just because someone could misuse a technique doesn't mean you shouldn't use it. Hey I've seen insert/update triggers written as if they were for only one row... so should we stop using triggers altogether or is that a different discussion?
  •  02-08-2010, 7:21 PM Post number 89767 in reply to post number 89496

    It didn't break... it was forced to not work.

    >>>There are other things than what Hugo posted that can break it

    Hugo posted things that wouldn't allow it to work to begin with. The rest of it, I posted to show that my poor explanation of why it worked and when it won't needed to be changed.

    So far as "could break" when a future CU or SP comes out, so what? That could happen with anything and if you don't regression test before such installations, you're begging for trouble. An example of a change that was not on the deprecation list that broke a lot of code was when they changed the very heavily documented sp_MakeWebTask to require "SA" privs in SQL Server 2000 SP4. Although Microsoft is pretty good about following "the plan" in the form of deprecation lists, Microsoft can, has, and will continue to change anything they need to without warning... documented or not.
  •  02-08-2010, 8:52 PM Post number 89769 in reply to post number 89496

    what if using a CTE for #stage?

    Thanks for the article, definitely worth reading.

    I would appreciate if somebody explain what happens if we use a CTE instead of the #Stage table in the solution "Peso 4e"? Will it be possible at all and how performance will be affected?

    Just curiosity from my side, and general perception that keeping everything in memory (avoiding writing to the tempdb) is beneficial.

    Appreciate your thoughts.
  •  02-09-2010, 2:21 PM Post number 89805 in reply to post number 89496

    CTE instead of #Stage

    "I would appreciate if somebody explain what happens if we use a CTE instead of the #Stage table in the solution "Peso 4e"? Will it be possible at all and how performance will be affected?"

    You can try it, but I don't think you can nest a CTE inside another CTE which is what you would have to do if I understand your proposal.
  •  02-09-2010, 6:58 PM Post number 89813 in reply to post number 89496

    Temp Tables can live in memory, too...

    -----------------------------------------------
    >>> "Just curicity from my side, and general perseption that keeping everything in memory (avoiding writing to the tempdb) is beneficial."
    -----------------------------------------------
    Using a Temp table doesn't mean it's not in memory. The first place a Temp table will try to live is the same place a Table variable or the result set of a CTE may try to live... memory. When a Table Variable or Temp Table get to big for memory, they both do the same thing... start using disk space in TempDB.
Page 2 of 4 (48 items)   < Previous 1 2 3 4 Next >
View as RSS news feed in XML