Remi Gregoire

RBAR: 'Row By Agonizing Row'

26 July 2007

RBAR is an acronym for 'Row By Agonising Row'. It was first coined by Jeff Moden, a very regular poster on the SQLServerCentral forums.

If, like me, you have not had the valuable experience of working in a Very Large Database (VLDB) environment, you may have done some 'row by agonising row' (RBAR) programming without seeing too many bad results. But as time goes by, and as the databases grow larger, the consequences of this style of coding will be increasingly felt: you will start getting help calls about the applications because they are getting slower by the minute, or even, apparently, not working at all.

RBAR is a consequence of coding in a strictly procedural way, rather than in a set-based way. It is different from poor coding; it is the result of adopting a mindset that one always has to tell the computer, step by step, how to do something rather than, in a set-based approach, merely specifying the result one is aiming for. With relational databases, one tells the database what one wants, not how to do it, row by agonising row.

RBAR in Front-end code

When I first joined a company, my very first job was to do programming work on projects that used Access as the front-end and SQL Server as the back-end. I found several examples of RBAR processing in the front-end code. There was almost no back-end code on the server, except for views and stored procedures that did very little.


The team of programmers were all beginners, and they believed that they had programmed a system that was working correctly. It did the job that they wanted it to do, in a timely manner, and it gave the output they needed. That, however, was a few years, and 500 000 inserts, before my arrival. The basic logic in every function was something like this:

Open all the tables that we might need to read from or write to, download all columns and all rows, then loop through all of that to do the job.

The code was certainly working when there were 1000 rows in the database tables. Now, with nearly 1 million rows, the loops inside the loops inside the loops... nested 19 times, started to take more and more time, both on the server and the application, and on the nerves of the people who used it. The application was being forced to process the data in a particular way that was fine for a small data set but ridiculous for a large one.

My solution back then (4-5 years ago) was to reduce to a minimum the number of rows that were downloaded from tables on the server. Since I did not want to mess too much with this critical part of the application, I stopped my optimisation there. This minor alteration made the application run almost 10 times faster. I thought that this was not bad for a junior and, since everyone else was happy, I didn't push the envelope further.
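The sort of change involved can be sketched like this. The real code used Access recordsets, but the principle is the same in plain T-SQL; the table and column names here are invented purely for the illustration:

```sql
-- RBAR style: download an entire table and let the client
-- loop over every row looking for the ones it needs
SELECT * FROM dbo.BillOfWork

-- Minimal fix: let the server do the filtering and send back
-- only the rows and columns the function actually uses
SELECT BillID, CustomerID, Status
    FROM dbo.BillOfWork
    WHERE BillID = @BillID
```

The client-side loops stayed in place; only the amount of data they had to chew through changed, and that alone accounted for the ten-fold improvement.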

Now fast-forward three to four more years and 400 000 more inserts in the various tables. The performance problem is as bad as, if not worse than, it was when I first joined.

A simple example may give you an idea. We have a 'Bill of Work' system where four buttons are present on a form: 'Bill hours', 'Bill the rest', 'Print', and 'Cancel'. The 'Cancel' button just changes a few columns in the base table and deletes a few rows in a handful of other tables. Simple. However, because of the way the system was designed, it took eleven seconds to run this process on one of the larger Bills of Work, and around six seconds for the rest. This was a part of the application I had never touched at all, but I clearly remember that, when I first joined, this process took no more than a second. I would now consider even this to be extremely slow, but nobody at the time saw it as a problem.

I got back into this code last week to see what I could do with this button. The first thing that I noticed was that six recordsets were being opened at the very start of the function. Those recordsets then downloaded four of the biggest tables in the database. This part, by itself, took around five seconds.

There was a two-way logic in this function, depending on which flags were set in the Bill: depending on the flags, three and only three of the tables were needed when running this function. You would never require all six. I therefore moved each table download into its respective IF branch. This shaved a few seconds off the time. I then started to analyse the code. Four of the six tables were opened merely to delete rows. The code looped through all the rows, one row at a time, checking whether the ID matched the ID of the current Bill of Work. When it did, it deleted the row. Seeing this, I deleted about 100 lines of code and inserted four lines of my own, which read something like this:

Connection.Execute "DELETE FROM dbo.TableName WHERE idCol = " & Me.IdCol

(The logic was more complex than that, but you get the idea.)
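In T-SQL terms, the replacement for all four download-and-loop deletions looks something like this. The table names are hypothetical, since I can't reproduce the real schema here:

```sql
-- One targeted DELETE per table replaces the whole
-- download-everything-and-loop dance. The server finds the
-- matching rows itself; no rows ever travel to the client.
DELETE FROM dbo.BillHours   WHERE idCol = @IdCol
DELETE FROM dbo.BillParts   WHERE idCol = @IdCol
DELETE FROM dbo.BillNotes   WHERE idCol = @IdCol
DELETE FROM dbo.BillHistory WHERE idCol = @IdCol
```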

After I deleted the code that downloaded those 4 tables, I made another test run and was pleased to see that the code now ran in less than 2 seconds.
"That's great, a five-fold increase in speed. Now let's go for ten-fold."
I changed the rest of the code so that the updates were now done in a single batch rather than row by row. This shaved about 1.5 seconds off the remaining time.
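The shape of that change, sketched in hypothetical T-SQL (the real code was front-end ADO, and these column names are invented):

```sql
-- RBAR: the old code issued one UPDATE round trip per row,
-- roughly equivalent to repeating this for every matching PK:
--     UPDATE dbo.BillOfWork SET Status = 'Cancelled' WHERE PK = @PK

-- Set-based: one statement updates every affected row at once
UPDATE dbo.BillOfWork
    SET Status = 'Cancelled'
    WHERE BillID = @BillID
```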

Having done that, I moved all that logic into a stored procedure on the server, and changed the front-end code to call that stored procedure. After a quick check of the execution plans and, as a result, adding two more indexes, this procedure now runs in less than 0.10 seconds (and this includes auditing and a re-query of the Bill of Work to display the changes). My final version of the code takes about a quarter of the lines to do exactly the same thing. And, now that the code is server-side, a change to this procedure will not require a redeployment of the application.
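I can't reproduce the real procedure here, but a minimal sketch of its shape, with invented names and without the auditing, would be:

```sql
-- Hypothetical sketch of the final server-side procedure.
-- One call from the front end replaces all the old loops.
CREATE PROCEDURE dbo.CancelBillOfWork
    @BillID INT
AS
BEGIN
    BEGIN TRANSACTION

    -- Flag the bill itself as cancelled
    UPDATE dbo.BillOfWork
        SET Status = 'Cancelled'
        WHERE BillID = @BillID

    -- Remove the dependent rows, one set-based DELETE per table
    DELETE FROM dbo.BillHours WHERE BillID = @BillID
    DELETE FROM dbo.BillParts WHERE BillID = @BillID

    COMMIT TRANSACTION

    -- Re-query so the front end can refresh its display
    SELECT * FROM dbo.BillOfWork WHERE BillID = @BillID
END
```

Because the whole operation is now one round trip, the front end does nothing but call the procedure and rebind its form to the result.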

The conclusion is that SQL Server was designed to work with sets of data. This means that it is more efficient to delete all rows where FK = 10 in a single statement than to delete them one at a time (first the row where PK = 1, then the one where PK = 2, and so on). The set-based approach will be faster than RBAR over 99.99% of the time. I know of only a couple of exceptions where RBAR is, at worst, as fast as set-based and, at best, outruns the hell out of it.
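To make the contrast concrete, with a hypothetical table:

```sql
-- RBAR: one statement (and one round trip) per row
DELETE FROM dbo.SomeTable WHERE PK = 1
DELETE FROM dbo.SomeTable WHERE PK = 2
DELETE FROM dbo.SomeTable WHERE PK = 3

-- Set-based: one statement, however many rows match
DELETE FROM dbo.SomeTable WHERE FK = 10
```

The set-based version lets the optimiser pick one plan, take its locks once and log the deletes in one pass, instead of paying all of that overhead per row.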

RBAR in a trigger

So what is RBAR in a trigger? Let's say that you want to set a flag to true on all newly inserted rows. The RBAR code could look something like this:

--Set up the tables
USE SSC --Test Database
GO
IF EXISTS (SELECT *
        FROM dbo.SysObjects
        WHERE Name = 'Employees'
            AND XType = 'U'
            AND USER_NAME(uid) = 'dbo')
    DROP TABLE dbo.Employees
GO
CREATE TABLE dbo.Employees
                (
                EmpID INT NOT NULL IDENTITY(1, 1)
                        CONSTRAINT PK_Employees PRIMARY KEY
                , FName VARCHAR(50) NOT NULL
                , LName VARCHAR(50) NOT NULL
                , HourlyRate DECIMAL(5,2) NOT NULL
                , SomeFlag BIT NOT NULL CONSTRAINT DF_Employees DEFAULT (0)
                )
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        VALUES ('Ninja', 'Rus', 50)
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        VALUES ('Some', 'Consultant', 100)
GO
SELECT *
    FROM dbo.Employees
    --All rows show a flag at off (0).
    --Now let's say that a new business requirement needs you to flag
    --all new employees because they need to do something with those
    --at a later time.
    --You could create a trigger that would look something like this :
GO
IF EXISTS (SELECT *
        FROM dbo.SysObjects
        WHERE Name = 'TR_Employees_A_I_FlagNewEmployees'
            AND XType = 'TR'
            AND USER_NAME(uid) = 'dbo')
    DROP TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees
GO
CREATE TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees ON dbo.Employees
AFTER INSERT
AS
    DECLARE @EmpID AS INT
    --Get employee id
    SELECT @EmpID = EmpID
        FROM Inserted
    --Update the base table
    UPDATE dbo.Employees SET SomeFlag = 1
        WHERE EmpID = @EmpID
GO
--Now let's see if this works :
DECLARE @NewEmpID AS INT
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        VALUES ('Test', 'SomeFlagUpdate', 100)
SET @NewEmpID = SCOPE_IDENTITY()
SELECT *
    FROM dbo.Employees
    WHERE EmpID = @NewEmpID
/*
EmpID       FName                      LName                      HourlyRate SomeFlag
----------- -------------------------- -------------------------- ---------- --------
3           Test                       SomeFlagUpdate             100.00     1
(1 row(s) affected)
*/


So, happy in the knowledge that it seems to have worked, you go home and have a good night. The next morning you come in to find the boss waiting in your car-parking space. He's fuming because the code you put into production is not working. So, after arguing that you tested the code and it was working yesterday, you ask him to show you what query he ran. He shows you this:

INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT 'Multiple', 'rows', 50
        UNION ALL
        SELECT 'Trigger update', 'Fails', 100
GO
SELECT TOP 2 *
    FROM dbo.Employees
    ORDER BY EmpID DESC
/*
EmpID       FName                      LName                      HourlyRate SomeFlag
----------- -------------------------- -------------------------- ---------- --------
5           Trigger update             Fails                      100.00     1
4           Multiple                   rows                       50.00      0
(2 row(s) affected)
*/

To your dismay, you see that only one row, the last, was updated by the trigger. No matter how many rows are put in that INSERT statement, only one, the last, will be updated by the trigger. You go back, check your work and realise that, since you only set the variable once and do one update, you can't possibly update more than one row. So you now come up with this solution:

--I will disable the first trigger here so that it doesn't fire with the new one :
ALTER TABLE dbo.Employees DISABLE TRIGGER TR_Employees_A_I_FlagNewEmployees
GO
IF EXISTS (SELECT *
                FROM dbo.SysObjects
                WHERE Name = 'TR_Employees_A_I_FlagNewEmployees_MultiRows'
                    AND XType = 'TR'
                    AND USER_NAME(uid) = 'dbo')
        DROP TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees_MultiRows
GO
CREATE TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees_MultiRows ON dbo.Employees
AFTER INSERT
AS
        DECLARE @EmpID AS INT
        --Get the first employee id
        SELECT TOP 1 @EmpID = EmpID FROM Inserted ORDER BY EmpID
        WHILE @EmpID IS NOT NULL
        BEGIN
                --Update the base table
                UPDATE dbo.Employees SET SomeFlag = 1 WHERE EmpID = @EmpID
                --Fetch the next EmpID
                SET @EmpID = (SELECT TOP 1 EmpID
                        FROM Inserted WHERE EmpID > @EmpID ORDER BY EmpID)
        END
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT 'Multiple 2', 'Rows 2', 50
        UNION ALL
        SELECT 'RBAR', 'Version', 100
GO
SELECT TOP 2 *
    FROM dbo.Employees
    ORDER BY EmpID DESC
/*
EmpID       FName                      LName                      HourlyRate SomeFlag
----------- -------------------------- -------------------------- ---------- --------
7           RBAR                       Version                    100.00     1
6           Multiple 2                 Rows 2                     50.00      1
(2 row(s) affected)
*/


This works perfectly, you think, and so you push it into production. Now, six months later, you again find your boss waiting in your parking spot. Your company has recently merged with a huge corporation with over 100 000 employees. The new company decides to merge their data into your system and to use your systems from now on. However, when they run a statement to insert all of their employees' information, they find that the server hangs for several minutes without any response. They kill the query and try again and again, with the same results.

You retest your trigger with a few rows and it seems to work fine. However, you realise that, in order to update a single row, the server first has to do a SELECT from Inserted (which is actually a read from the logs, a costly operation), then an update. This works fine for a few rows but, with an insert of 100 000 rows, the server actually has to run over 200 000 queries to produce the correct results and finish the insert. I will not talk about all the other administrative pitfalls in this article (such as file growth or tempdb), but there are quite a few to avoid.

This insert operation is not small by any standards, but it should not take several minutes, or even seconds, to run. You do some research and find a way to do a single UPDATE statement that should speed this thing up a lot:

--I will disable the second trigger here so that it doesn't fire with the new one :
ALTER TABLE dbo.Employees DISABLE TRIGGER TR_Employees_A_I_FlagNewEmployees_MultiRows
GO
IF EXISTS (SELECT *
                FROM dbo.SysObjects
                WHERE Name = 'TR_Employees_A_I_FlagNewEmployees_MultiRows_SetBased'
                    AND XType = 'TR'
                    AND USER_NAME(uid) = 'dbo')
        DROP TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees_MultiRows_SetBased
GO
CREATE TRIGGER dbo.TR_Employees_A_I_FlagNewEmployees_MultiRows_SetBased
    ON dbo.Employees
AFTER INSERT
AS
        --One joined UPDATE handles every inserted row at once
        UPDATE EMP
            SET EMP.SomeFlag = 1
            FROM Inserted I
            INNER JOIN dbo.Employees EMP
                ON I.EmpID = EMP.EmpID
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT 'Multiple 3', 'Rows 3', 50
        UNION ALL
        SELECT 'SET BASED', 'Version', 100
GO
SELECT TOP 2 *
    FROM dbo.Employees
    ORDER BY EmpID DESC
/*
EmpID       FName                      LName                      HourlyRate SomeFlag
----------- -------------------------- -------------------------- ---------- --------
9           SET BASED                  Version                    100.00     1
8           Multiple 3                 Rows 3                     50.00      1
(2 row(s) affected)
*/

Again, you see that the results are correct with this version of the trigger. But what about the speed? I will enable the second trigger and run a 10-row insert, and I'll let you judge the difference between the execution plans of the two triggers:

ALTER TABLE dbo.Employees ENABLE TRIGGER TR_Employees_A_I_FlagNewEmployees_MultiRows
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT 'Speed ', 'Compare 1', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 2', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 3', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 4', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 5', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 6', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 7', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 8', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 9', 50
        UNION ALL
        SELECT 'Speed ', 'Compare 10', 50
GO
--The results :   
--The insert takes 1.37% of the total time of all the execution plans  
--The final set based trigger takes 8.50% of the time of all the execution plans  
--So that leaves over 90% for the 2nd version of the trigger.  That's over 10 times 
-- slower over 10 rows.  I will now show you what happens with 10 000 rows :   
--Disable RBAR trigger
ALTER TABLE dbo.Employees DISABLE TRIGGER TR_Employees_A_I_FlagNewEmployees_MultiRows
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT TOP 10000
                 'SPEED TEST' AS FName
                 , '10K rows' AS LName
                 , 55 AS HourlyRate
        FROM Master.dbo.SysColumns C1
        CROSS JOIN Master.dbo.SysColumns C2
GO
--This runs in about 1 second with the last trigger
--Now let's see with the RBAR version
--ENABLE RBAR trigger
ALTER TABLE dbo.Employees
       ENABLE TRIGGER TR_Employees_A_I_FlagNewEmployees_MultiRows
--DISABLE Set based trigger
ALTER TABLE dbo.Employees
       DISABLE TRIGGER TR_Employees_A_I_FlagNewEmployees_MultiRows_SetBased
GO
INSERT INTO dbo.Employees (FName, LName, HourlyRate)
        SELECT TOP 10000
            'SPEED TEST' AS FName
            , '10K rows' AS LName
            , 55 AS HourlyRate
        FROM Master.dbo.SysColumns C1
        CROSS JOIN Master.dbo.SysColumns C2
GO
--Final Run time : 44 Min 24 Sec
--Clean up
IF EXISTS (SELECT *
                FROM dbo.SysObjects
                WHERE Name = 'Employees' AND XType = 'U' AND USER_NAME(uid) = 'dbo')
        DROP TABLE dbo.Employees

Unfortunately, even with a modest insert of 10 000 rows, the final run time is just unacceptable. In a VLDB environment, this trigger would have been a bottleneck for the application right from the start because, even for one row, the trigger still has to do two operations instead of one (if you don't count declaring and destroying the variable as an operation). Now, if you have the courage, or a very powerful machine, I will let you try the RBAR trigger at 1 million rows. I won't attempt this on my machine, because it would take a few days to run (assuming I don't run out of RAM first).

RBAR Bad, Set-based good!

To conclude, I have given you a little taste here of what RBAR processing can do to a server. I used an archaic P III 1.3 GHz with 512 MB of RAM for these tests. Even though it's an old machine, it's still powerful enough to run fairly complex statements at a very good speed, and I'm sure most of you will get much better results than I did, even on your personal notebooks. I often see questions about this subject on the forums, and I just can't seem to get my point across that any type of RBAR is usually a very bad idea if you want any kind of performance out of your application. The RBAR concept applies to any looping technique, including cursors, and to code wherever it sits, whether on the server or on a client machine, as was the case for me at my first job (here I'm only referring to client code that interacts with the server from within the loop). I'm not saying, by any means, that looping should be banned from programming, but rather that programming a database generally requires you to specify accurately the result you want, rather than telling the database how to go about getting it, row by agonising row.

Remi Gregoire

Author profile:

Remi operates from Sherbrooke, Quebec, Canada. My formal training is as a web programmer specialising in database systems. I've been working with SQL Server 2000 for three years now, and I've been helping out on the forums for about the same amount of time. I'm currently rebuilding a company's system, moving from Access ADPs + SQL Server 2000 to C# + SQL Server 2005.


Subject: When I'm interviewing...
Posted by: Eric Russell (view profile)
Posted on: Monday, July 30, 2007 at 4:24 PM
Message: When I'm interviewing someone for a database developer position, I will ask:
"How much experience have you had developing T-SQL cursors?"
How they answer this question can be very revealing.

Subject: When I was interviewed...
Posted by: Mal Daughtree (view profile)
Posted on: Wednesday, August 8, 2007 at 6:48 PM
Message: Yup, There is a better way. My mission in life. Educate people to the set based way.
Remi, Excellent article, should be compulsory reading for those who believe otherwise.

Subject: typo
Posted by: Jesse (not signed in)
Posted on: Wednesday, August 8, 2007 at 8:41 PM
Message: "Jeff Modem" should be "Jeff Moden" (oops -now fixed Ed:)

I don't care what you say about me, as long as you say something about me, and as long as you spell my name right. --George Cohan

Subject: RBAR
Posted by: GSquared (view profile)
Posted on: Wednesday, August 8, 2007 at 10:08 PM
Message: The only case I know of where RBAR is faster than set-based, is running totals. I've seen several tests on this, and several articles on several pages, and I've tried it out myself. But other than that kind of case, where data depends on the data in the rows preceding it in some specific order, set-based is definitely the way to go.

Subject: RBAR: 'Row By Agonizing Row'
Posted by: Kilo Bravo (not signed in)
Posted on: Thursday, August 9, 2007 at 8:13 AM
Message: How many times have you seen Management throw hardware at the problem. They do so, of course, because it is usually cheaper to do that than to rewrite RBAR code into set-based code.

Subject: Re: RBAR
Posted by: Ken3 (view profile)
Posted on: Thursday, August 9, 2007 at 1:29 PM
Message: Not even Running totals is better in RBAR. This can be done quicker and Set-based with the 'Quirky Update trick' described by Robyn Page in her Cursor Workbench

Subject: Running totals
Posted by: RgR'us (view profile)
Posted on: Friday, August 10, 2007 at 11:09 AM
Message: There are 2 things about running totals :

1 - The quick update method is very fast and will at worst, be only a tad faster than the cursor (from memory).

2 - Do you really need to do running total on the server? I know that answer will be yes quite often, especially in the case where there is no application to present the data. But I've seen more than my share of users who think they need to do this stuff on the server when the application could do it more easily than the server and save the server some cpu cycles.

Subject: Excellent Article - Should be a required read for all Procedural Devlopers
Posted by: Bubba01 (not signed in)
Posted on: Friday, August 10, 2007 at 12:02 PM
Message: This should be a required read for all developers/programmers who used/worked with any non-SQL based language before having worked with SQL. I recently worked for a company that in its wisdom decided DBAs and other SQL-savvy developers were a waste of money because the VB guys with the company knew SQL and ADO. This arrogance/ignorance has led to performance problems and very bad design which clients are now paying the price for. The company officials seem to have finally realized the error in their thinking; however, sadly enough, it's now too late. The DB design is so convoluted in many areas, and the program so extensive, that a major overhaul is the only way to get the DB design as it should be, and doing that would assuredly break the program and require a re-write of the front end as well as the back end.

Procedural developers, those who code in OOP languages like C, VB and the '.Net' languages should never be allowed to make SQL/DB related decisions until they have received proper training from a certified SQL/DBA. So many of the mistakes in the current system could have been easily avoided with the most basic of training/education in SQL/Set based development. So sad, so very sad.

Subject: Running Totals
Posted by: Joe Celko (not signed in)
Posted on: Friday, August 10, 2007 at 12:16 PM
Message: You might want to look at the SUM() OVER() function in the SQL-99 standards. A running total is easy to write, but you should test on your product.

Subject: RBAR
Posted by: Jeff Moden (view profile)
Posted on: Thursday, August 16, 2007 at 1:37 AM
Message: So THAT's what you look like!

Very nice article on RBAR, Remi. And for those that don't know... RBAR is pronounced "ree-bar"... kinda like the metal rods they stick in cement... an appropriate analogy (stuck in cement), too, I think ;)

Subject: RBAR
Posted by: Randy In Marin (view profile)
Posted on: Thursday, August 16, 2007 at 7:03 PM
Message: It still "depends". If the set-based query will hold too many locks and adversely affect existing processes (e.g., blocking or deadlocks), breaking up the query into smaller transactions with a cursor might be a valid option. The new set-based query might be perfection, but it still has to contend with existing code - which can be a bit less than perfect.

Subject: OOP Languages
Posted by: Chris (view profile)
Posted on: Friday, August 17, 2007 at 7:44 AM
Message: Good point but VB is not an OOP Language. VB.Net is but not any version prior to that.

"Procedural developers, those who code in OOP languages like C, VB and the '.Net' languages should never be allowed to make SQL/DB related decisions until they have received proper training from a certified SQL/DBA."

Subject: No, it doesn't "depend"
Posted by: Jeff Moden (view profile)
Posted on: Sunday, August 19, 2007 at 12:23 AM
Message: Breaking up large transactions to smaller transactions doesn't mean you have to resort to either a cursor or any form of RBAR. Yes, you can use a loop of sorts to limit the number of rows for a transaction, but the number of rows never needs to be just one and you certainly don't need a cursor to limit the number of rows.

Subject: RBAR....
Posted by: Rich (view profile)
Posted on: Friday, August 31, 2007 at 9:29 AM
Message: I went back to IT to help out on SqlSvr databases.
They had two guys who knew everything. They were refreshing some JDE Sales orders and general ledger tables to a Business Objects database (100,000 rows by 216 columns). During the four-hour refreshes they experienced many locking issues; customers' complaints fell on deaf ears. When I suggested four hours was a bit much for a refresh job, the answer was always "we don't have time for that", and they squashed my suggestions.
After going to the IT Director with my ideas, he gave me carte blanche to do whatever it takes. By adding one index, and copying ONLY changed rows, the job now takes less than one minute. The former database manager is no longer here and works as a consultant!
