Do Alpacas Dream of farming DBAs?

When IT starts to stand for 'Implement This', even the keenest DBA begins to dream of farming Alpacas as a career-change.
...and we ask our readers how they would have solved the DBA's dilemma.



ost days I truly love being a SQL DBA. Then there are the days when I seriously consider making the jump to Alpaca farming.

You see, my company recently purchased an application that the Human Resources department uses in order to monitor the Internet usage of employees.

Like most turnkey applications it was not vetted properly by anyone in the Technology department. Unfortunately, I.T., to some, does not stand for Information Technology, but rather Implement This. So, contracts were signed and we were handed the specification sheet from the vendor.

According to this specification, we should be able to keep a running 90 days of data to report against, data older than that is archived and purged from the database. Under this plan, the database should consume no more than 50 GB of space.

We set things up according to this specification and after the initial eight months we were topping out at 150 GB of consumed disk space for the database; even whilst purging the data that was more than three months old. Currently, ten months after implementation, we hit crisis point when the database is 270 GB in size, there is a mere 1 GB free on the disk devoted to the data file, and we are purging all data older than 30 days rather than the planned 90 days. Clearly, something was very wrong

A series of calls to the vendor establishes the fact that the purge process is not working correctly. There are two tables that still hold orphaned records from archived data. These records are overlooked in the purge process controlled via the application because of the lack of referential integrity between these tables and any others in the database which would have fired an error, or which could have been modified for cascading deletes.

The vendor’s level-two support provided our analyst with two scripts, which they rather optimistically believed would successfully run to clear out unnecessary data from each table. It looks something like this:

DELETE FROM tblOrphans 
    WHERE FieldX NOT IN (SELECT FieldX FROM tblParent1)
    AND FieldX NOT IN (SELECT FieldX FROM tblParent2)

This is one of those situations that every DBA encounters occasionally. We all know from experience that it takes time and space to make space. Balancing log and tempdb growth against reclaimed space, i/o, locking, and many other variables are not taken into consideration by most application vendors when they provide you with scripts like this one. Understanding the root cause of a problem and formulating the correct plan for its resolution is arguably one of the most important traits of a good DBA.

In order that you should share our feelings of dread at this point, I’ll sketch in a few details. The schema of tblOrphans is thus:

CREATE TABLE dbo.tblOrphans 
    (FieldX INT NOT NULL, FieldY VARCHAR(750) NOT NULL) 

There are two indexes on tblOrphans:

   (FieldX ASC, FieldY ASC)


   (FieldY ASC, FieldX ASC)

Other items of note:

  • None of the three tables in question have foreign key constraints
  • Neither parent table has a relevant index
  • tblOrphans consists of 205 million records, consuming 50GB of space
  • 191 million records would be deleted as a result of the delete query
  • Customers are accommodating of downtime. 24×7 access is not required
  • The server consists of two logical drives
    • C: 136 GB RAID 5 with 50 GB available space
      • Application files
      • all user database log files
    • D: 273 GB RAID 5 with 1 GB available space
      • system databases’ data and log files
      • user database data files
      • browser logs

The problem is this. We can see that the script as it was given us is most unlikely to run successfully. We need to release enough space to the O/S to allow the nightly 2GB worth of browsing logs to be input into the database until the next monthly purge; retaining enough data file space allowing for growth from the nightly input from these logs. The bright side is that we will be replacing this product instead of upgrading to a new version of their software. This means that modifying schemas, adding/removing indexes, and implementing referential integrity is acceptable so long as reporting and data entry is not adversely affected.

I know the steps (and failures) I encountered along the way to resolving this matter and I’ll outline the solution I ended up utilizing in a future article. In the meantime, I’d be very curious to know what you would have done with the information that has been presented.

Simple-Talk offer an iPod shuffle to the best proposed solution, submitted by 1st September 2007 as a comment to this article, and there will be three runners-up prizes of a Simple-Talk gift bag, which includes the highly desirable Red-Gate USB memory drive, and higly-collectable Simple-Talk pen..

I must be going, the Alpacas are getting restless!

(as you see below in the comments, the Simple Talk readers did not spare their energies in giving Timothy their sage advice. We continue the story, and Timothy’s award of the prizes, in the sequel…. Greetings from the Alpaca Mill)


Tags: ,


  • Rate
    [Total: 0    Average: 0/5]
  • imassi

    TRUNCATE TABLE dbo.tblOrphans
    Then blame it on the Alpaca that you let loose in the server room. You could always blame the script provided by the vendor. The business obviously isn’t impressed with the application if they are replacing it, so you could probably get away with it. The data will be gone when that happens anyway. I wouldn’t do something like this with important data, but if it’s so unimportant it’s being truncated after a month as it is.

    I suppose the solution that would fulfill the requirements would involve partioning the table, moving the partition onto its own file and moving the file onto the drive with the free space. But that seems like more work than the vendor put into their solution, so the hell with them.

  • KatanaCS

    1st question – Version of SQL Server & DB
    What version of SQL Server is this on? And what is the compatibility mode for the specified database?

  • Rodney

    Good to See
    Good to see this article. I am nursing a fractured right hand so writing has been slow-going. I deal with the same issues daily. I will post my solution for this (don’t need the goodies). Deletes are out of the question for this many records. The best thing to do is BCP a million recors out at a time of unwanted data, then move the remaining “wanted” data to a staging table and then truncate the original, move the “wanted” data back and load the historical data off to archive (if needed). Then some clean up for indexes. Do not run the vendor scripts. In fact…burn them. And further…Stop those users from surfing so much at work.
    Anyway…good to see you here and I look forward to more articles in the future. This is the good stuff.
    Rodney Landrum

  • WBrewer

    My solution.
    If Rodney wins a goodie bag, I want it if he doesn’t. The pens really work.

    My favourite solution to this would be a quiet one. I’d put the recovery mode for the database into ‘simple’, and shrink the log. I’d get enough working space by moving stuff around. Then I would run a routine on the agent that took a chunk out of the two tables every minute. (How much of a chunk would have to be established by experiment) I reckon that there are sufficient indexes on the table to allow this. This chunk would have to be more than the rate of insertions per minute. (I’ve popped in the figure 2000 as a guess as to what would work without causing problems)

    DELETE FROM tblOrphans
    WHERE fieldx IN (
    SELECT TOP 2000 fieldx
    FROM tblOrphans
    LEFT OUTER JOIN tblParent1
    ON tblOrphans.FieldX = tblParent1.FieldX
    LEFT OUTER JOIN tblParent2
    ON tblOrphans.FieldX = tblParent2.FieldX
    WHERE tblParent2.fieldx IS NULL
    AND tblParent1.fieldx IS NULL )

    The average DBA will want to rush in, having changed in a telephone booth, and do something totally radical to gasps of admiration from the onlookers, but when I came across a similar problem I figured out that, whereas an increasing table size meant misery, a decreasing table size meant happiess.

    Alternatively, one could nip down to the local computer store, grab a nice big hard disk and copy the existing one onto it using one of these clever drive copying applications. (I won’t mention names as I’m damned if I’ll give them a plug)

    A slightly more radical solution would be to stop the database, copy the mdf off onto a nice big server with plenty of room to breathe. Restart it. Attach the file on the new server and work your wicked way on it without fear of locking, logfile growth etc etc. Once you have the database short-back-and-sides and fully tested, then return it to the errant server.

  • Adam Machanic

    Another possible solution…
    Throw the database in single user mode, to turn off the app (middle of the night, or whatever)

    Create a new filegroup with a file on the C: drive–plenty of room there

    Create a new table with the same columns as tblOrphans, on the new filegroup

    Insert all of the qualifying rows:

    INSERT NewTable
    SELECT *
    FROM tblOrphans
    WHERE FieldX IN (SELECT FieldX FROM tblParent1)
    AND FieldX IN (SELECT FieldX FROM tblParent2)

    Drop tblOrphans.

    Move the data back to the primary filegroup and the table tblOrphans, using SELECT INTO.

    Put the database back into multiuser mode and move on with life.

  • Adam Machanic

    wait a moment…
    I screwed up the “qualifying rows” query in the last post. Try this instead:

    INSERT NewTable
    SELECT *
    FROM tblOrphans
    WHERE FieldX IN (SELECT FieldX FROM tblParent1)
    OR FieldX IN (SELECT FieldX FROM tblParent2)

  • david.buckingham

    My solution

  • david.buckingham

    My solution — take 2

    It butchered my code formatting. I hope you can follow it.

    (Ed: I liked the code so much I went in and reformatted it
    to make it easier to follow. I hope you don’t mind!)

  • LukCAD

    good example
    Thank you for your description of problem. Now i can image what will happend with programs where must be purge process executed periodically. It is like half-automatization or better to say manual mode by operator. But on my mind indx2 must be cutted off at all. I think it eats your server’s space and executing time. Better to do materialized view (increase space of course of your database) but it will works fast and no needing the purge process many years.

  • FrankAu

    This seems likely to moi …
    I actually prepared this on ASE 12.5.3 hence the use of ROWCOUNT … basically the number of rows smacked each iteration controls the impact on your log. I would look at using something like 10,000 although you would have to do some arithmetic to make sure the log demands can be met.

    SELECT @rc = 1
    WHILE @rc > 0
    DELETE #tblOrphan
    FROM (#tblOrphan O
    LEFT OUTER JOIN #tblParent1 P1
    ON O.FieldX = P1.FieldX
    LEFT OUTER JOIN #tblParent2 P2
    ON O.FieldX = P2.FieldX
    SELECT @rc = @@rowcount

    This should be OK to run while the system remains online although if transactions were being processed I would prolly be quite conservative on my ROWCOUNT setting to minimise lock contentions etc!

    Cheers Frank.

  • Babu Ambati

    Quite a few solutions.
    Read through all the solutions and many are forgetting that there is limited space and there is no mention if a new server is available. So based on all the solutions dleteing in small chunks should work but will require quite some effort and time. The other would be BCP out the data as suggested. Either way its a big problem as you have space crunch. Would like to see a solution that is way different and efficient then the regular DBA minds would thing of and do post the solution you will use to accomplish this issue. Good Luck!!!

  • GeoD

    Space considerations
    Figuring on an average data size, the remaining 14 million rows would consume about 3.5GB. Dropping the indexes from the tables and adding them back later should free up plenty of space like David recommended. Moving truncating and replacing would be my best course recommendation and could be done in short order without the need to shuffle data around to different disks.

  • Everett Music

    My solution
    1) Backup database holding tblOrphan to C: drive.
    2) Create temporary database on C: drive recovery model simple.
    3) Script out tblOrphan table.
    4) Create tblOrphan table in new database on C: drive.
    5) Insert data to be saved into new tblOrphan table (anticipate 14mil records) at the rate of 100,000 per insert (or more if tempdb can handle it).
    6) Truncate tblOrphan table.
    7) Backup log file (for database with tblOrphan) to C: drive.
    8) Shrink the database (with tblOrphan table) to remove all empty space (minus amount of space needed to put 14mil records back in) + 10% (or whatever your growth rate is).
    9) Import tblOrphan data from tblOrphan table on C: drive (anticipate 14mil rows) 100,000 rows at a time (more if possible).
    10) Delete temporary database on C: drive.
    11) Create nightly job to run tblOrphan delete statement.

  • Lewis Moten

    My solution
    It looks like you need to get rid of at least 8 million records a day to permit 2 additional GB of data.

    I’m concerned about you using an INT for your primary key and using millions of records. I would consider changing it to a big int or even a unique identifier.

    For the future, I would recommend looking into creating a partitioned view and then redefine it when it’s time to delete a table. Just add a column to partion on that is filled with a unique number each day (number of days since jan 1, 1970) so that after 30 days, you can drop them.

    To let the database operate on cleaning itself up on its own, I reccomend creating a job to remove data in chunks to prevent minimal performance cost to the end-users.

    Turn on auto-shrink

    Do a Full Backup.

    Create a job to run the following at least every 10 minutes:

    SET ROWCOUNT 60000
    Delete from tblOrphans where FieldX in(
    SELECT TOP 60000 T1.FieldX
    FROM tblOrphans T1 WITH(NOLOCK)
    left outer join tblParent1 T2 WITH(NOLOCK) on
    T1.FieldX = T2.FieldX
    left outer join tblParent2 T3 WITH(NOLOCK) on
    T2.FieldX is null
    and T1.FieldX = T3.FieldX

    where isnull(T2.FieldX, T3.FieldX) is null

    add an additional step to the job to backup the transaction log for both the current and tempdb databases.

    You can also add the following to help get your schema situated and orphaned data removed in the future:


    FK_tblParent1_tblOrphans FOREIGN KEY
    ) REFERENCES dbo.tblOrphans
    FK_tblParent2_tblOrphans FOREIGN KEY
    ) REFERENCES dbo.tblOrphans
    CREATE TRIGGER dbo.tblParent1Delete
    ON dbo.tblParent1
    delete tblOrphans where
    exists (select top 1 0 from deleted with(nolock) where tblOrphans.FieldX = deleted.FieldX)
    and not exists(select top 1 0 from tblParent2 WITH(NOLOCK) where tblOrphans.FieldX = tblParent2.FieldX)

    CREATE TRIGGER dbo.tblParent2Delete
    ON dbo.tblParent2
    delete tblOrphans where
    exists (select top 1 0 from deleted with(nolock) where tblOrphans.FieldX = deleted.FieldX)
    and not exists(select top 1 0 from tblParent1 WITH(NOLOCK) where tblOrphans.FieldX = tblParent1.FieldX)


  • Michael

    Permanent Fix
    Outsource HR
    Delete database

  • Rodney

    Regular DBA Mind
    That’s It…I succumb to self-mediocrity. I will never be more than regular. Thanks Babu. Wait I am feeling regular. Must leave…Pronto ASAP.

  • David

    About those Alpaca’s
    This example struck a chord with me. I’m a DBA in Johannesburg, South Africa. I’m fortunate enough to live outside of town, on 4 hectares. I have 800 free range hens, and sell the eggs. I grow organic veg which I sell too, and I’m busy with a fish farming project and a biodiesel project.

    I’ve recently been through the Implement This scenario with a CRM project and MIS hanging off the other end. And while these two projects are going fairly well, we are looking at replacing our core OLTP systems with hosted, “hands off”, systems!

    The fish farming is looking good. 🙂

  • David

    Back to the DBA problem
    Think “out of the box”, literally. Get one 500GB External Hard Drive, SATA if possible. Very affordable. Plug it into the firewire port, or USB 2.0. Space problem? Nada. Move Tempdb. Move data, create additional data files, do what every you want. Truncate, delete, BCP. Another idea, get a 4GB or the biggest memory stick you can get, get 2 of these. Plug them in. Now, question is, can you create additional data files for Tempdb on these memory sticks? Will be very fast!! I’ve just tested this on my laptop and it works! :-). 4 USB ports x 4GB = 16GB. I’ve just read that there are up to 32GB memory sticks. Wow. So think out of the box people. Cheers.

  • GilaMonster

    Quick and dirty
    Having recently been through an exercise of deleting a large number of unwanted rows from a very large table, I can feel the pain.

    Here’s what I would do.

    Attach an external drive/USB drive to the server, or map a network drive if that isn’t possible.

    BCP out the rows to the external/network that are still wanted, as identifed by the following query

    SELECT * FROM tblOrphans
    WHERE FieldX IN (
    SELECT FieldX FROM tblParent1
    SELECT FieldX FROM tblParent2)

    Then truncate the table tblOrphans.

    That’s got rid of the unwanted rows, but the db is still far too big.

    Run a shrink on the DB to bring it down to a reasonable size, allowing space for the rows that were exported to be replaced, and for some work space for things like reindexing.

    Rebuild all the indexes that were scrambled by the shrink

    BCP in the rows that were exported in step 1.

    add in the foriegn keys that were ‘missed’

    Go and have a beer or two.

    Oh, while you’re at it, change that second nonclustered index on tblOrphans to a non-unique index on fieldY and put X as an included column.

    David, I don’t know where you work currently, but I know of a DBA opportunity in JHB if you’re interested. Send me a PM if you want to know more

  • Rodney

    Regular DBA Mind
    That’s It…I succumb to self-mediocrity. I will never be more than regular. Thanks Babu. Wait I am feeling regular. Must leave…Pronto ASAP.

  • Granted

    No solution
    But I love the PKD title.

    Nice article and interesting problem.

  • Andrew Torgerson

    Proposed new delete statement
    — remove useless index
    DROP INDEX dbo.tblOrphans.idx2;

    — proposed new delete
    DELETE FROM dbo.tblOrphans
    dbo.tblOrphans a
    LEFT JOIN dbo.tblParent1 b
    ON (a.FieldX = b.FieldX)
    LEFT JOIN dbo.tblParent2 c
    ON (a.FieldX = c.FieldX)
    b.FieldX IS NULL
    OR c.FieldX IS NULL

  • Tore

    Don’t forget the parents…
    Many good suggestions here. We could definitely drop at least idx2 on tblOrphans. That should help create some breathing room. We could also use the 50 Gb on C: for a new file or database to hold some temporary results.

    Below, SPACEMADEAVAILABLE can refer to either freed up space on PRIMARY (or other) or new file on C:. Consider it a virtual FileGroup.

    Then I would do the following (remember, there is no relevant index on the parents):

    CREATE TABLE newOrphans (FieldX INT NOT NULL, FieldY VARCHAR(750) NULL)

    (FieldX ASC, FieldY ASC)

    INSERT INTO newOrphans (FieldX)
    SELECT FieldX FROM tblParent1

    INSERT INTO newOrphans (FieldX)
    SELECT FieldX FROM tblParent2

    NOTE: If there is overlap between tblParent1.FieldX and tblParent2.FieldX, do a single insert with a UNION on the select instead.

    Then we need to get the FieldY values. We are still dealing with 14 Mill rows, although small, so BCP comes in handy. BCP export to a file on C: using the following SELECT:

    SELECT New.FieldX, Old.FieldY
    FROM newOrphans as New
    INNER JOIN tblOrphans as Old
    ON Old.FieldX = New.FieldX

    Finally, truncate tblOrphans, shrink database, and BCP import from the file produced. And rebuild the indexes on tblOrphans.

    Do your cleanup:
    Drop newOrphans, add back the second index on tblOrphans (if needed), and get rid of the temporary SPACEMADEAVAILABLE unless it is part of PRIMARY.

    Also – this may be a good time to add some referential integrity… :->

  • Tore

    OOPS… correction
    After typing in this cramped little space, I copied my post to Notepad and made some changes. It would have been good to include those… :-<

    There is no need for FieldY in the new/scratch table, so replace “newOrphans” with “newOrphanList” in the above and use:

    CREATE TABLE newOrphanList (FieldX INT NOT NULL)

    (FieldX ASC)

  • CathyP

    Hmmmm – how about……..
    The table is called dbo.tblOrphans.

    I’d investigate whether I could just truncate it on the basis that a table of orphaned records are just that……

  • Jonas

    My five cents
    Vow.. child table referencing two parents on the same key:) forget standard referential integrity.. if you enforce it ugrrh… all the possible errors.. forget cascading deletes. I wonder what was a logic behind this kind of design: was it a. both parents have same key but share different attribute(column) set or b. parent1 doesn’t have to have matching key in parent1 and the rest of columns differ on some predefined business rules or c. there is no logic. Anyway, it is amazing, that somebody can come up with this stuff. In this situation, in my opinion, there is nothing fancy you can do about, just regular dba routines. I would do the following, most of it is already mentioned above, but this is my routine in these kind of situations (hardware solutions out of question as I understand):

    1.Backup the db, enable Simple recovery

    2.Drop both indexes (if heavily fragmented, if not drop after step 3)

    3.Export ‘active/wanted’ data to the file on C drive using bcp variable length. 14 mil records or about 3.4GB yeah that will take few hours

    4.Truncate the table tblOrphans

    5.For this step I would need more details.

    a.Db could be shrunk to release the space but running it on 270Gb size could take a very long time, particularly if we need to release over 100GB. We might be able to speed up if indices are heavily defragged and could be dropped and recreated after shrink is done.

    b.It was mentioned by vendor that db should be about 50GB size. How much truth is in it? Can we repeat same purge procedures for other tables and get actual db size under 100GB? If yes maybe we could release enough space on C from log files and make actual export of data to new db on C (w/o indices), once done script all sql objects and drop original db, detach new db and copy to D drive, attach and run indices along with other appropriate scripts e.g. permissions, etc In my experience solution B is usually faster for dbs under 100GB vs. shrinking from at least twice the size needed, but it requires a lot of attention.

    6.Import the data back from the file using either BULK INSERT

    7.Build same indexes dropped in step2 (they are not redundant as some suggested)

    8.Enable required recovery model and issue Full backup

    9.As for the future, there are few options either setup delete triggers (as suggested by David B.) either schedule a job the either runs appropriate delete statements or better executes procedure that has the same statements. Note: if this is SQL 2005 server, I would also disable indexes before the deletes and enable back afterwards to speed up things. Also, if possible would turn of logging before deletes OR if above takes way too long, would look into partitioning the table

  • Andrew

    Eating elephants
    How do you eat an elephant?
    One bite at a time 🙂

    Had an issue with defragmenting / reindexing a 120GB database on a production server, since the normal maintenance plan tries to be really clever and do everything in parallel in one BIG transaction, with backup policies backing up on the hour every hour between 6am & 6pm (meaning the log was not cleared until 6am) 🙁

    In concept we set a small table to track which table had been reindexed when, then set up a nightly job to reindex tables starting with the oldest and continue for a set period of time.

    I’d approach this in a similar manner – decide on an acceptable batch size (i.e. # of rows – base it on avg bytes per row + fudge factor, figure out size in MB and check your available disk space) to process each night, note the start time, then SET ROWCOUNT X and DELETE in a loop until allotted time period has passed.
    Set this up in a nightly job and let it go…may take a few days to come back down, but it will.
    You can issue checkpoints / truncate log / set recovery to simple if you’re allowed to do that.

    BTW if you planned on 50GB and 270GB is too much, I’d say you had a problem with disk space to begin with…Windows likes 20% free disk space…unless you install your DB on a raw partition.

    Hope this helps someone.

  • Big CT

    Lousy App
    Simple. Bitch-slap the vendor, shutdown the app and go on with applications that really matter.

  • Mormonboy the “want to be farmer”

    Pictures are of llamas, not alpacas
    This is only one problem with your illustrations in the article, the pictures are of llamas, not alpacas. Alpacas have more wool on their forehead than llamas. Please check this link to see the differences amongst Camelidea family of fauna.

    Llamas are of the Genus species Lama glama, while alpacas are from the Genus species Vicugna pacos. There is a difference.

  • Anonymous

    Orphaned Data is Already Archived
    Archive all the data using the application. Truncate the orphan table. Shrink the database files. Since the application is no longer to be used for new data, have it look at the archived data only.

  • Anonymous

    And the winer is?
    Which solution did you implement and how did it go?

  • Anonymous

    Brute force works
    Here’s the question – how is the data used? I assume truncating tblOrphan would break reports. Does it matter? Is it worth it to the organization to pay for the hours and hours of this mop-up, to preserve data that will be deleted in 90 days anyway? If the answer is no, then:

    1. Keep a CYA backup of the database.
    2. Drop the useless index
    3. Truncate tblOrphan.
    4. Set up a nightly/weekly/whatever delete process to prevent the table from mushrooming again.

  • Rockstar

    nice article
    Tim, thanks for the article. We see similar issues with many vendor products. I am beginning to wonder if many of these shops even have a DBA on staff.

    Keep up with writing good articles such as this one.

  • Tim Ford

    Alpacas v. Llamas
    Good job Mormonboy! You were the first to catch that the sketch was of a llama. My editor actually chose to add these sketches and made the comment that he expected quite a few posts concerning the fact that llamas were involved.

  • Anonymous

    Use Link Server
    As ITS,I guess more than one SQL Server in your hand, you can create a link server,and move the data (2000 Rows for every time) to another server,
    If you think some data moved is still useful,you can move them back

  • RichB

    Do what you are told – nyp.
    Second guessing the vendor? Really, that is way past your pay grade.

    Sure, it will all run out of space, the lgo will grow, the tables will lock – with a bit of luck the drive with master and tempdb etc will run out of space, the sql server will go down and stay down.

    Course, thats not your problem. Sadly the monitoring software will have to be turned off, and an expensive contractor brought in to sort it out. The bill for that would presumably be sent to the solution provider, as would other expenses.

    What would I do if it were my problem? What I do every couple of weeks … or just go on holiday!