Hugo Kornelis

Painless management of a logging table in SQL Server

11 June 2013

Tables that log a record of what happens in an application can get very large, especially if they're growing by half a billion rows a day. You'll very soon need to devise a scheduled routine to remove old records, but the DELETE statement just isn't a realistic option with that volume of data. Hugo Kornelis explains a pain-free technique for SQL Server.

Many databases have them: a table that is used to log “everything” that happens. Rows are added to these tables all the time, and we all know to prevent page splits by clustering on an ever increasing column, like the date/time of the event, right? We just keep adding data, usually in a “write-only” way. The only time these tables are ever read is when troubleshooting a weird issue, or when the blame game has to be played.

Of course, if you keep adding data to a table, it will grow. And it will eventually grow too big. At that time (or, hopefully, before that time), you’ll have to do something to tackle the problem. Unless you are under some legal obligation to retain this logging information "forever", you can usually just clear out old stuff. For small tables, this is easy:

DELETE FROM dbo.Logging
WHERE LogDate < DATEADD(day, -14, CURRENT_TIMESTAMP);

But for large tables, this is not good. Even if the LogDate column is the clustered index key (which it should be – please don’t add an IDENTITY column to these tables!), you still don’t want to run this if it would delete several millions of rows. It will bloat the transaction log because all changes have to be logged, and until the transaction has finished (which can take a while), the transaction log will not be cleared, even after a log backup. Locks on the clustered index will not cause a problem (you are deleting rows on one side of the index and adding on the other side), but unless you disable lock escalation, you’ll soon have an exclusive table lock blocking all other access. And even if you do disable lock escalation, locks on the nonclustered indexes will soon start to block other access as well!
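For completeness: lock escalation can be switched off at the table level. The statement below is only a sketch of that option (using the table name from this article), not a recommendation – the batched approach that follows is usually the better answer.

-- Disable lock escalation for this table (SQL Server 2008 and later)
ALTER TABLE dbo.Logging SET (LOCK_ESCALATION = DISABLE);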

The usual solution is to batch the delete:

DECLARE @TrimDate AS datetime = DATEADD(day, -14, CURRENT_TIMESTAMP);
-- This DECLARE assignment sets @@ROWCOUNT to 1

WHILE @@ROWCOUNT > 0
BEGIN;
  DELETE TOP(4000) FROM dbo.Logging   -- 5,000 is the default lock escalation threshold
  WHERE LogDate < @TrimDate;
END;

This prevents lock escalation, and because the transaction autocommits after each statement, it will never hold locks for a long time, and it will not prevent transaction log reuse. It will still cause a significant strain on resources, as all rows have to be deleted, nonclustered indexes have to be updated, and all changes are logged in the transaction log. However, if you schedule this as a task during low-traffic hours and back up your log often enough, this will work nicely for deleting a few million rows.
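If even this puts too much pressure on the system, a slightly gentler variant is to capture the row count in a variable and pause briefly between batches. This is only a sketch of that idea, not code from the original article; the one-second delay is an arbitrary example value.

DECLARE @TrimDate AS datetime = DATEADD(day, -14, CURRENT_TIMESTAMP);
DECLARE @BatchRows int = 1;

WHILE @BatchRows > 0
BEGIN;
  DELETE TOP(4000) FROM dbo.Logging
  WHERE LogDate < @TrimDate;

  SET @BatchRows = @@ROWCOUNT;   -- capture immediately, before anything else resets it

  WAITFOR DELAY '00:00:01';      -- arbitrary pause to let other workloads breathe
END;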

Really, really, really large!

But what if the table is more than just large? What if you have to delete hundreds of millions of rows? What if the table grows by half a billion rows per day, and by the time you notice the problem, you have to delete 15 days’ worth of data? Do the math – that’s 7.5 billion rows to be deleted from a table that is probably somewhere between 10 and 15 billion rows big. Do you think you can still use the batched delete technique? Good luck with that! The first time I encountered a logging table this size, the client found that the only way for them to free up space was to use TRUNCATE TABLE to remove all the rows. Of course, if something happened that required them to check the logs right after that, they were hosed…

My solution was to partition the logging table. That enabled them to switch out partitions with old data to a helper table (a change of metadata only, so very fast), truncate the helper table (also very fast, because complete pages are deallocated all at once instead of deleting individual rows, and only the deallocation is logged), then repeat until enough space was available. Of course, this also requires them to regularly create new partitions. And though this all worked like a charm, they really wanted this to be set up to work completely automatically. That was a bit harder – but it’s possible, as this article shows.

Getting started

If you are lucky, you can set up the table with partitioning before the system goes live and the table starts filling with data. So, how often are you that lucky? Yeah, thought so. Same for me. Hundreds of millions of rows already in the table.
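For the lucky case, the setup is simple. The sketch below is my own illustration, reusing the partition function and scheme names that are introduced later in this article, with an arbitrary boundary date in the recent past; the maintenance procedure described later then takes care of creating the daily partitions.

-- New system: create the logging table partitioned from day one
CREATE PARTITION FUNCTION pfLog (datetime)
AS RANGE RIGHT FOR VALUES ('20130601');   -- any date in the (not too distant) past

CREATE PARTITION SCHEME psLog
AS PARTITION pfLog ALL TO ([PRIMARY]);

CREATE TABLE dbo.Logging
   (LogDate datetime NOT NULL,
    LogPerson varchar(20) NOT NULL,
    LogInfo varchar(4000) NOT NULL)
ON psLog(LogDate);

CREATE CLUSTERED INDEX CI_Logging
ON dbo.Logging(LogDate)
ON psLog(LogDate);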

If you search the internet for how to implement partitioning on an existing table, you’ll be told that you’ll have to rebuild (or even drop and recreate) the clustered index. Do you have any idea how much time and resources that takes when you do that for a table this size? All that time, the table will be locked; that alone forbids this action in a 24x7 shop. If you do have a sufficiently large maintenance window, you’ll still have to deal with a transaction log that will grow, grow, grow, and grow even more.
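For reference, that conventional approach looks roughly like the statement below (assuming a partition scheme like the psLog created later in this article already exists). On a table of this size, this is exactly the long, log-heavy, blocking operation we want to avoid.

-- The approach this article avoids: repartition by rebuilding the clustered index
CREATE CLUSTERED INDEX CI_Logging
ON dbo.Logging(LogDate)
WITH (DROP_EXISTING = ON)
ON psLog(LogDate);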

Before I tell you how I managed to avoid that performance hit, allow me to first set up a demo database and a fake logging table:

-- PREPARE SANDBOX ENVIRONMENT
-- Drop and recreate sandbox database
USE tempdb;
go
ALTER DATABASE Sandbox SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE Sandbox;
go
CREATE DATABASE Sandbox;
go
USE Sandbox;
go

-- Assume we already have an existing logging table
CREATE TABLE dbo.Logging
   (LogDate datetime NOT NULL,
    LogPerson varchar(20) NOT NULL,
    LogInfo varchar(4000) NOT NULL)   -- Real table will have a bunch of actual columns instead
ON [PRIMARY];

CREATE CLUSTERED INDEX CI_Logging
ON dbo.Logging(LogDate)
ON [PRIMARY];

-- Assume we also have a nonclustered index
CREATE NONCLUSTERED INDEX NC_Logging_Person
ON dbo.Logging(LogPerson)
ON [PRIMARY];

-- Assume we also already have some data
-- (generated test data spans period Jun 1 - Jun 4)
INSERT INTO dbo.Logging (LogDate, LogPerson, LogInfo)
SELECT d.Date, 'Hugo Kornelis', REPLICATE('x', 4000)
FROM (VALUES ('20130601'), ('20130602'), ('20130603'), ('20130604')) AS d(Date)
CROSS JOIN (SELECT TOP(200) * FROM master..spt_values) AS der;

Instead of converting the table to a partitioned table, or copying all data (which effectively has the same problems), I want to create a new, empty table with the correct schema and the required partitioning scheme. I can then use the technique of partition switching to switch in the data as a metadata-only change. Unfortunately, switching a table into a partition of a partitioned table requires a trusted CHECK constraint to be in place, and adding a trusted CHECK constraint requires SQL Server to read all the rows in the table. During this operation, there is a schema modification (Sch-M) lock on the table, excluding all access to the table. And though SQL Server is smart enough to avoid reading all rows from the clustered index, it will read all rows from one of the nonclustered indexes – which, on a table this size, can still take quite some time. I was unable to find a way around this. It’s better than having to rebuild the entire clustered index, obviously, but if anyone knows a way to avoid this hit as well, it would be a great benefit for this scenario.

Because I didn’t want to hardcode the date boundary, I use a query with MAX to find the current highest value for LogDate (for this query, SQL Server is smart enough to read only the last row from the clustered index – something it should do when adding the CHECK constraint, but unfortunately doesn’t). If this is earlier than today, I increase it to today (so that additional log data coming in after the change will still go to the first partition and the second partition remains empty – I’ll explain why this is important later in the article). I then use this to calculate the boundary for the partition function, and for the CHECK constraint. Here is the code:

-- INITIAL SETUP
-- Find lowest date boundary after log entry with highest LogDate
DECLARE @MaxLogDate datetime = (SELECT MAX(LogDate) FROM dbo.Logging);
IF @MaxLogDate < CURRENT_TIMESTAMP
BEGIN;
  SET @MaxLogDate = CURRENT_TIMESTAMP;
END;
DECLARE @PartBoundary datetime = CAST(@MaxLogDate + 1 AS date);

-- Switching in requires a trusted CHECK constraint;
-- creating this requires all data in the table to be read.
-- (It also requires dynamic SQL...)
DECLARE @sql nvarchar(4000);
SET @sql = N'ALTER TABLE dbo.Logging
ADD CONSTRAINT ck_CheckDate CHECK (LogDate < ''' + CONVERT(char(8), @PartBoundary, 112) + ''');';
EXECUTE (@sql);

-- Create partition function and partition scheme,
-- such that all existing data goes in first partition,
-- and second partition is empty.
CREATE PARTITION FUNCTION pfLog (datetime)
AS RANGE RIGHT FOR VALUES(@PartBoundary);

CREATE PARTITION SCHEME psLog
AS PARTITION pfLog ALL TO ([PRIMARY]);
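Before going any further, it is worth confirming that the new constraint really is trusted (had it been created WITH NOCHECK, the switch would fail later). This check is my own addition, not part of the original script:

-- is_not_trusted must be 0 for the partition switch to be allowed
SELECT name, is_not_trusted, is_disabled
FROM   sys.check_constraints
WHERE  parent_object_id = OBJECT_ID('dbo.Logging')
AND    name = 'ck_CheckDate';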

 

I can now create the new table, making sure to use the exact same columns and also to create the exact same indexes as the original table. The only difference is that the table and all its indexes have to be allocated to the partition scheme instead of a filegroup.

-- Create the new table (will rename later)
-- IMPORTANT: Schema and indexes need to be identical!
-- IMPORTANT: New table and indexes should be allocated on partition scheme!
CREATE TABLE dbo.Logging_New
   (LogDate datetime NOT NULL,
    LogPerson varchar(20) NOT NULL,
    LogInfo varchar(4000) NOT NULL)   -- Real table will have a bunch of actual columns instead
ON psLog(LogDate);

CREATE CLUSTERED INDEX CI_Logging_New
ON dbo.Logging_New(LogDate)
ON psLog(LogDate);

-- Identical nonclustered index, but has to be partitioned as well
CREATE NONCLUSTERED INDEX NC_Logging_Person_New
ON dbo.Logging_New(LogPerson)
ON psLog(LogDate);

And now, finally, I can make the change. I switch the old table into the first partition of the new one; this change is implemented by changing metadata only, but it will result in all data effectively being in the new table, and the original table being empty. I then drop the old table, and rename the new table and all its indexes. (Renaming indexes can be skipped if the company has a naming policy that allows for freedom in the choice of index names). I do all this in a single transaction so that all access to the table is blocked until all changes are done, and to ensure that in case of an error, all changes are undone. Regardless of the size of the table, this transaction should not take more than a few milliseconds – though it can take some time on a busy system before it gets the locks that are required for the partition switch.

-- The following statements have to be in a single transaction!
BEGIN TRAN;
BEGIN TRY
  ALTER TABLE dbo.Logging
     SWITCH TO dbo.Logging_New PARTITION 1;
  DROP TABLE dbo.Logging;
  EXEC sp_rename 'dbo.Logging_New', 'Logging';
  EXEC sp_rename 'dbo.Logging.CI_Logging_New', 'CI_Logging', 'INDEX';
  EXEC sp_rename 'dbo.Logging.NC_Logging_Person_New', 'NC_Logging_Person', 'INDEX';
  COMMIT TRAN;
END TRY
BEGIN CATCH
  ROLLBACK TRAN;
  THROW;
END CATCH;

After executing this code, your Logging table will have two partitions. The first one contains all the existing data, and data flowing in will also go to this partition. The second partition will be empty, and stay empty until midnight. After that, new data will go to this partition. You must make sure to start regular maintenance of the partitioned logging table before that, or you’ll get new problems!

I recommend immediately invoking the MaintainLogTable stored procedure (explained below) to ensure that new partitions are made for the next few days. The table will still grow for some time, because the original contents are now all in the first partition, which will not be removed until all the data in the partition expires. So you might want to consider running with a lower retention period for the first period, or investigating if you can temporarily use some extra storage, until this first, large partition is removed. After that, the dbo.Logging table will have, and keep, its normal working size – provided you ensure that regular maintenance is done at set times (preferably daily).
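For example, that first run could simply spell out the defaults explicitly (the values shown here are only illustrative):

EXEC MaintainLogTable @DaysToKeep = 14, @DaysInFuture = 7;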

Regular maintenance

So now you have a logging table that is partitioned, either because you allocated an empty table that way before the database goes live, or because you used the method described above to create such a table and get all the data in one of the partitions. Either way, it is very important to ensure that the “last” partition always remains empty. So assuming that the logging date is always equal to (or, maybe, less than) the current time, you should ensure that the last partition boundary is set to a date in the future. The reason that this partition has to be empty is that it is very easy to create new partitions by simply splitting an existing partition. If the partition is empty, this is a metadata operation. But if there is data in the partition, SQL Server will have to process all the rows and put each in the correct new partition – something we will try to avoid, given the size of the table!

In order to ensure smooth operation of the partitioned logging table, we have to create new, empty partitions to receive new incoming data every day (BEFORE data starts going into the last partition!), and remove partitions that are old enough to only hold expired data. All this is done in the MaintainLogTable stored procedure. (The code of this procedure is attached to this article). So all you have to do is create that stored procedure on your system (ensure that you change the names of the table and the partition function to match those on your system!), create the helper table used for purging expired data, and create a job that invokes this procedure on a daily schedule.
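One possible way to schedule the daily run is a SQL Server Agent job. The script below is only a rough sketch: the job name, database name and 03:00 start time are made up for illustration, and many shops will prefer to create the job through the SSMS wizard or an existing maintenance framework instead.

USE msdb;
go
EXEC dbo.sp_add_job         @job_name = N'Logging table maintenance';
EXEC dbo.sp_add_jobstep     @job_name = N'Logging table maintenance',
                            @step_name = N'Run MaintainLogTable',
                            @subsystem = N'TSQL',
                            @database_name = N'Sandbox',      -- your database here
                            @command = N'EXEC MaintainLogTable @DaysToKeep = 14;';
EXEC dbo.sp_add_jobschedule @job_name = N'Logging table maintenance',
                            @name = N'Daily at 03:00',
                            @freq_type = 4,                   -- daily
                            @freq_interval = 1,
                            @active_start_time = 30000;       -- 03:00:00
EXEC dbo.sp_add_jobserver   @job_name = N'Logging table maintenance';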

The stored procedure

I’ll describe how the code works later in this article; this section is only on how to use it. The stored procedure is called with three parameters, all of which have defaults.

The first parameter is @DaysToKeep. This specifies how long old data is kept. This is often a tradeoff. Keeping data for a longer time can be useful for some troubleshooting (or blame-gaming) scenarios – but with hundreds of millions of rows of data coming in each day, that incurs a significant cost. The default of 14 for this parameter ensures that data is only removed when it’s at least two weeks old.

The second parameter is @DaysInFuture. This parameter determines how many new partitions will be made. It should always be at least 1, so that a new partition is made for tomorrow’s data. If you want some resilience against occasionally skipping the maintenance job, set it higher: this creates a few extra empty partitions, and aside from a few rows in the system tables, there is no cost involved. That’s why I’ve set the default to 7, so that you are still okay if you forget to run the maintenance job for a week. Remember that the last partition should always remain empty; creating partitions for several days in the future ensures that this condition is still met even when the maintenance job doesn’t run for the next few days. (If you plan to run the job only once a week, set it to a higher number!)

The third parameter, @CurrentDate, should never be used in normal operation. I used it to help me test the code without having to wait for the next day, and I decided to leave it in, in case you ever run into a situation where you want the procedure to behave as if it were running on a different date.

So if your logging table stores data that has to be kept for at least a month, and you think that the default of preparing 7 days in advance is okay, you can simply add this line of code to your daily maintenance plan:

EXEC MaintainLogTable @DaysToKeep = 31;

If you plan to run the procedure once a week and think the default retention period of 14 days is okay, you use:

EXEC MaintainLogTable @DaysInFuture = 10;

And if you need to override both defaults, your maintenance job would include something like this:

EXEC MaintainLogTable @DaysToKeep = 10, @DaysInFuture = 4;

Helper table

The procedure requires the existence of an empty helper table with the exact same schema and indexes as the logging table – except that this helper table and its indexes are not partitioned, but allocated on a normal filegroup. This has to be the same filegroup that the partition scheme maps its partitions to, or the partition switch will fail.

-- Create the helper table used for quickly purging old data
CREATE TABLE dbo.Logging_Purge
   (LogDate datetime NOT NULL,
    LogPerson varchar(20) NOT NULL,
    LogInfo varchar(4000) NOT NULL)   -- Real table will have a bunch of actual columns instead
ON [PRIMARY];

CREATE CLUSTERED INDEX CI_Logging_Purge
ON dbo.Logging_Purge(LogDate)
ON [PRIMARY];
go

 

This table will be used by the MaintainLogTable stored procedure for fast purging of expired data – partitions are switched in (a metadata operation), and then the data is quickly purged by using TRUNCATE TABLE.
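Because the switch requires the purge table to be on the same filegroup as the partition being switched out, it can be worth double-checking which filegroup(s) the partition scheme actually maps to. This query is my own addition, built on the standard catalog views:

-- Which filegroup does each partition of psLog live on?
SELECT     ps.name            AS partition_scheme,
           dds.destination_id AS partition_number,
           fg.name            AS filegroup_name
FROM       sys.partition_schemes       AS ps
INNER JOIN sys.destination_data_spaces AS dds
      ON   dds.partition_scheme_id = ps.data_space_id
INNER JOIN sys.filegroups               AS fg
      ON   fg.data_space_id = dds.data_space_id
WHERE      ps.name = 'psLog';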

Manual intervention is required

Did I already mention that the last partition must always be empty? Well, if it isn’t, then the stored procedure will fail. It will throw an error, with this message:

Msg 50000, Level 16, State 1, Procedure MaintainLogTable, Line 25
Rightmost partition of the dbo.Logging table is not empty; manual intervention required!

The most probable cause of this is that you skipped maintenance for a few days, and data has started to flow in the last partition. If this is the case, the best way to solve it is to simply split the partition manually, with the new boundary such that all existing data goes to the left half of the split, and the right half is empty. SQL Server will need to scan the data to check if any rows need to move to the new partition, but since this is not the case, the operation should still finish fairly quickly – at least a lot quicker than when you use a boundary such that part of the data actually has to move!

After this, you will have a partition that spans a few days, and none of the data in that partition will be purged until the most recent log entry it holds expires. So your table will temporarily require some extra storage space.

Another cause can be an application error that caused a row to be entered with a LogDate that is far in the future. The fix for this is to change or delete that row. You cannot use the same procedure as above, because otherwise you’d get a single partition for maybe a year, and that would take way too much storage space!

The first query below will show you all your partitions, with the number of rows in them and their boundaries. The second query will show you the highest value actually in use. Use these (after changing the names of the table, partition scheme, and partition function to match your database) to assess the appropriate way to fix the problem.

-- First, check the data:
SELECT     p.partition_number, p.rows,
           prv.boundary_id, prv.value
FROM       sys.partition_range_values AS  prv
INNER JOIN sys.partition_functions    AS  pf
      ON   pf.function_id              =  prv.function_id
      AND  pf.name                     = 'pfLog'
RIGHT JOIN sys.partitions             AS  p
      ON   p.partition_number          =  prv.boundary_id
WHERE      p.object_id                 =  OBJECT_ID('dbo.Logging')
AND        p.index_id                  =  1      -- Clustered index
ORDER BY   p.partition_number;

-- Check the current highest value:
SELECT MAX(LogDate) FROM dbo.Logging;

If you need to manually create a new partition, use this code (don’t forget to put in the appropriate value in the last statement!):

-- Highest value is on June 11, so we need to split at June 12.
-- Manually split the last partition
-- (Note - this is now NOT a metadata-only operation - but it is very cheap.)
ALTER PARTITION SCHEME psLog
       NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pfLog()
       SPLIT RANGE ('20130612');

 

If you find that there is suspect data, with dates in the far future, you’ll really have to fix this based on your specific situation; I cannot give you the code for that.

How it works

Of course, you are not the kind of DBA who would ever blindly copy code from the internet. You first want to know exactly how it works. Not only because of a healthy dose of paranoia, but also because that would help you learn, and ultimately get better at your job. Good for you!

In this section, I’ll show the code of the stored procedure and explain what it’s doing.

CREATE PROC MaintainLogTable @DaysToKeep int = 14,
                             @DaysInFuture int = 7,
                             @CurrentDate datetime = NULL
AS

-- @CurrentDate defaults to today
IF @CurrentDate IS NULL
BEGIN;
   SET @CurrentDate = CURRENT_TIMESTAMP;
END;

This is the first part of the procedure. (The actual code also contains a comment block with a short explanation of how to use the code, and a copyright notice). It is not possible to use a system function like CURRENT_TIMESTAMP in the procedure’s parameter declaration, hence the extra conditional code to set @CurrentDate to CURRENT_TIMESTAMP if no other value was specified.

-- Check if last partition is empty (otherwise, manual intervention is required)
DECLARE @rows bigint;
SET @rows = (SELECT   TOP(1) rows
             FROM     sys.partitions
             WHERE    object_id     =  OBJECT_ID('dbo.Logging')
             AND      index_id      =  1      -- Clustered index
             ORDER BY partition_number DESC);
IF @rows > 0
BEGIN;
   -- Report error and exit
   RAISERROR ('Rightmost partition of the dbo.Logging table is not empty; manual intervention required!', 16, 1);
   RETURN (-1);
END;

 

This is also fairly obvious. The sys.partitions catalog view is used to find the number of rows in the rightmost partition, and if it’s not zero, an error is raised and execution halts. Someone will have to intervene manually before automated management of the partitioned table can resume. If your logging table has a different name or is allocated in a different schema, you’ll have to change this in the query, and in the error message.

There will be rows in sys.partitions for each of the indexes on the table; I filter on number 1 (the clustered index) because there always will be a clustered index, and I can be 100% sure that this is not a filtered index – so a rowcount of 0 really means that there are no rows in the partition.

-- REGULAR MAINTENANCE PART 1: CREATE NEW PARTITION(S)
-- Get highest boundary currently used
DECLARE @HighBound datetime;
SET @HighBound = (SELECT      TOP(1) CAST(value AS datetime)
                  FROM        sys.partition_range_values AS prv
                  INNER JOIN  sys.partition_functions AS pf
                        ON    pf.function_id = prv.function_id
                  WHERE       pf.name = 'pfLog'
                  ORDER BY    boundary_id DESC);

-- Repeat until we are @DaysInFuture in the future
WHILE DATEDIFF(day, @CurrentDate, @HighBound) < @DaysInFuture
BEGIN;
   -- Increase by a day and split partition
   SET @HighBound = DATEADD(day, 1, @HighBound);
   ALTER PARTITION SCHEME psLog
         NEXT USED [PRIMARY];
   ALTER PARTITION FUNCTION pfLog()
         SPLIT RANGE (@HighBound);
END;

To find the current highest boundary value (the threshold between the last and the second-to-last partitions), I query the catalog views sys.partition_functions (to find the internal numeric identifier for the partition function by name – please don’t forget to change the filter in the query if you have named your partition function something other than ‘pfLog’) and sys.partition_range_values (to get the actual boundary values).

I then start a loop that will continue executing until there are partitions up to @DaysInFuture in the future. In the loop, I compute the new boundary value, then use ALTER PARTITION SCHEME and ALTER PARTITION FUNCTION (don’t forget to change the names here as well!) to tell SQL Server where to allocate the next partition, then split the currently highest partition based on this new boundary date.

The reason I do this in a loop is resilience against skipping the maintenance job. If you miss a day and then run this procedure the next day, the loop will run twice, creating two new partitions for two new days. With a simple IF, you’d get a single partition covering two days, which would make it harder to manage the storage requirements of the logging table. An added benefit of this is that it’s very easy to get started with this method if you’re setting up a new system – just create the logging table using a partition function with a single date boundary that can be anywhere in the past; the maintenance procedure will immediately create partitions for all days since, up to the specified number of days in the future. (And the second half of the procedure will then promptly remove those that are already older than the specified retention period.) Just make sure to pick a date that is less than about 40 years in the past, or you’ll run into problems with SQL Server’s 15,000-partition limit.

-- REGULAR MAINTENANCE PART 2: REMOVE OLD PARTITION(S)
-- Get lowest boundary currently used
DECLARE @LowBound datetime;
SET @LowBound = (SELECT      TOP(1) CAST(value AS datetime)
                 FROM        sys.partition_range_values AS prv
                 INNER JOIN  sys.partition_functions AS pf
                       ON    pf.function_id = prv.function_id
                 WHERE       pf.name = 'pfLog'
                 ORDER BY    boundary_id ASC);

-- Repeat until we are @DaysToKeep in the past:
WHILE DATEDIFF(day, @LowBound, @CurrentDate) > @DaysToKeep
BEGIN;
   -- Swap out the oldest partition
   ALTER TABLE dbo.Logging
         SWITCH PARTITION 1 TO dbo.Logging_Purge;
   -- Get rid of unwanted data
   TRUNCATE TABLE dbo.Logging_Purge;
   -- Remove oldest partition
   ALTER PARTITION FUNCTION pfLog()
         MERGE RANGE (@LowBound);
   -- Get lowest boundary currently used
   SET @LowBound = (SELECT      TOP(1) CAST(value AS datetime)
                    FROM        sys.partition_range_values AS prv
                    INNER JOIN  sys.partition_functions AS pf
                          ON    pf.function_id = prv.function_id
                    WHERE       pf.name = 'pfLog'
                    ORDER BY    boundary_id ASC);
END;

This is the final part of the procedure. You will, again, have to modify the name of the partition function (in three places), the name of the table (in one place), and the name of the helper table used for purging data (in two places). I use a query very similar to the one in the first part of the procedure but with a different ORDER BY direction to find the date that defines the boundary between the two oldest partitions. If this is more than the specified retention period in the past, I can purge the oldest partition – and I again repeat this in a loop, so that I catch up automatically even after skipping maintenance a few days.

The actual swapping uses ALTER TABLE to switch some pointers in the metadata – the pointers to the actual data for the first partition of dbo.Logging (which contains a lot of data that should all be removed) and those for the actual data of dbo.Logging_Purge (that should be empty, because it is never used) are swapped around. After this, dbo.Logging_Purge now contains the data that was in the first partition of dbo.Logging and that we want to purge, which is done by the TRUNCATE TABLE statement. The first partition of the actual logging table should now be empty, so that the ALTER PARTITION statement that merges the first two partitions can be performed as a metadata-only change.

Because this is done in a loop, and because manual intervention with the partition ranges can cause partition boundaries to be less or more than a day apart, I end the loop with a copy of the query to get what, after the merge, is the boundary between the partitions that are now the two oldest. This value is then used to determine if the loop has to be executed again.

Variations

I created this method to deal with a logging table that received hundreds of millions of rows per day. But it can be used for other purposes as well. Smaller logging tables might not really need this method, but it doesn’t hurt – so you can choose to use it for all logging tables, regardless of the daily volume of rows. Or even for all other tables that need to be partitioned by day.

And it doesn’t stop there. If you don’t need daily partitions, but would like to have partitions for each week or each month (or, with an even higher volume of logging, maybe even every hour), you can easily change the procedure to support that. Unfortunately, SQL Server does not support using a variable for the “datepart” parameter of the DATEADD and DATEDIFF functions, otherwise I would already have added this. But a quick search and replace can easily fix this for you.
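As an illustration (my own sketch, not part of the attached procedure), the partition-creating loop of a weekly variant would end up looking something like this, with @WeeksInFuture as a hypothetical renamed parameter:

-- Weekly variant of the partition-creating loop
WHILE DATEDIFF(week, @CurrentDate, @HighBound) < @WeeksInFuture
BEGIN;
   -- Increase by a week and split partition
   SET @HighBound = DATEADD(week, 1, @HighBound);
   ALTER PARTITION SCHEME psLog
         NEXT USED [PRIMARY];
   ALTER PARTITION FUNCTION pfLog()
         SPLIT RANGE (@HighBound);
END;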

If you want to use automatically maintained partitioning on multiple tables, you could try to change this procedure to support a “tablename” parameter. But that would require you to use dynamic SQL, and for lots of reasons I try to avoid that as much as possible. It’s probably much easier to just make a few copies of this stored procedure, give them different names, use search and replace to change the names of the table, the helper table used for purging, the partition function, and the partition scheme, and then create all those stored procedures in SQL Server. There is no measurable cost involved in having a few extra stored procedures, so really, there is no need at all to resort to dynamic SQL!

Conclusion

Partitioning is an ideal way to manage very big tables where old rows regularly have to be purged. But it does require that new partitions are created and old partitions are actually removed at regular intervals. The stored procedure I presented in this article helps you by completely automating these tasks. The only thing you still need to do is to make sure the procedure is regularly executed, e.g. by scheduling it as part of your maintenance tasks.

Moving an existing, non-partitioned table into a partitioning scheme normally requires an enormous amount of resources and considerable downtime. This can be reduced by a huge amount by using partition switching instead of rebuilding the index. However, this does require that a CHECK constraint is created first, which will still take a lot of resources and lock the logging table for a prolonged period – though not nearly as long as rebuilding the index would take.

Hugo Kornelis

Author profile:

Hugo is co-founder and R&D lead of perFact BV, a Dutch company that strives to improve analysis methods and to develop computer-aided tools that will generate completely functional applications from the analysis deliverable. The chosen platform for this development is SQL Server. In his spare time, Hugo likes to share and enhance his knowledge of SQL Server by frequenting newsgroups and forums, reading and writing books and blogs, and attending and speaking at conferences. Hugo has been a SQL Server MVP since January 2006.



Subject: What's the problem with using IDENTITY on a logging table?
Posted by: Anonymous (not signed in)
Posted on: Thursday, June 13, 2013 at 11:16 AM
Message: I can't guarantee that the timestamp will be unique when multiple log statements occur in quick succession. Therefore I included an IDENTITY in the PK/Clustered-Index (subordinate to the timestamp). It's still increasing, so avoiding page-splits. You emphasized "please don’t add an IDENTITY column to these tables!)" and I'd like to know why.

thanks

Subject: I would like to know too!
Posted by: Anonymous (not signed in)
Posted on: Monday, June 17, 2013 at 2:57 PM
Message: I second the question about using an identity column! What are the reasons for not having one?

Subject: IDENTITY
Posted by: Hugo Kornelis (view profile)
Posted on: Wednesday, June 19, 2013 at 9:11 AM
Message: Thanks for your comments, Anonymous 1 and Anonymous 2!
Good questions - and I'll answer them.

First poster: Indeed, you cannot guarantee uniqueness on the timestamp. That's why I don't declare it as a PRIMARY KEY on the table. I create a CLUSTERED INDEX without the optional UNIQUE keyword, so it is declared as non-unique; duplicates are okay.
So there is no primary key on this table. For a relational purist, that is awful. This is a violation of first normal form, and of Codd's second principle. But I could argue that a logging table is not really part of my relational design, it's a borderline case. Like a traditional sequential .log file on the file system, but stored inside the database. Or I could just say that in this case I pick pragmatics over purism. Yes, I can add an IDENTITY column and make it a PRIMARY KEY. But an IDENTITY is not a true key, it's a surrogate key - so for what would it be a surrogate in this case? Where is the real key? As you point out, once the frequency of events is high enough, you will get duplicates on datetime, and even on datetime2(7).

The second poster asks what are the reasons for not having one; allow me to reverse the question: what are the reasons for having an IDENTITY column for this table?
There are normally four major reasons for using IDENTITY. One is to prevent page splits by having an ever increasing key. The timestamp is ever increasing too, so this argument does not apply here.
The second reason is to allow easier references for foreign keys. But foreign keys to a logging table? Well, if you do need them (really? why?) then you can't use this article anyway, as you won't be able to use truncate table (and I don't think partition switching will fly either - not tried, though. Like I said - a foreign key to a logging table??????)
The third reason is to ensure that the key is immutable, which helps maintain foreign keys (already covered) and nonclustered indexes. But the timestamp of a logged event is immutable too, so again no reason for identity.
Which leaves us with the fourth reason: to save space in the nonclustered indexes. But let's do the math. With 500 million rows per day, you'll burn through the entire range of integers in about eight days, so you will need a bigint. That's eight bytes. So is datetime, and so is datetime2(7). My design adds an 8 byte clustered key to all nonclustered indexes, and then has to add an additional 4 bytes only for rows with a duplicate. If 500 million rows per day are evenly spread out, you'll have about 6,000 per second, so lots of duplicates - but switching to datetime2(7) fixes that. Hardly any duplicates now; the average clustered index key length that is added to all the nonclustered indexes will be below nine bytes. With a bigint IDENTITY, that will be eight bytes, so you save less than a byte per row per nonclustered index. At 500 million rows, that still sounds good - until you factor in that you also just ADDED 8 bytes to each row in the clustered index!

There are definitely lots of tables that do benefit from adding an IDENTITY column to serve as a surrogate key. But please don't make it a habit to routinely add such a column to each table you create. It has to be a conscious decision every time.

I hope this answers your questions?

Subject: Agree with your reasoning Hugo
Posted by: Anonymous1 (not signed in)
Posted on: Wednesday, June 19, 2013 at 10:06 AM
Message: and thanks for taking the time to detail your decision process when defining the clustering key. That said, I don't believe you've justified your original emphasis against using IDENTITY, which originally caught my eye.

You've laid out the pros and cons of including IDENTITY, but certainly no showstoppers. I'm using a BIGINT, and not overly concerned about the penalty over 500 million rows, since the log will be purged long before that total is reached.

This discussion has been useful, a reminder that the best solution to a problem is not necessarily the first that comes to mind.

Subject: making good points
Posted by: opc.three (not signed in)
Posted on: Monday, June 24, 2013 at 2:37 AM
Message: Hugo, you made some good points regarding why an IDENTITY column is sometimes not necessary. Thank you for taking the time to write the article and for responding to the initial comments in such detail. This article and discussion happen to be very timely for me.

Regarding whether or not to use an IDENTITY, I think the decision [must] depend on the purpose and usage of the logging table. It is worth noting that it can be viewed as very valuable to be able to uniquely, and explicitly, refer to each row in the logging table. It's the "explicitly" part we give up when we rely on a uniquifier versus adding an IDENTITY column.

Subject: NON UNIQUE CLUSTERED INDEX drawback
Posted by: Wilfred (not signed in)
Posted on: Monday, June 24, 2013 at 3:29 AM
Message: In my opinion, you're abusing the clustered index concept. For performance reasons, a clustered index should be a monotonically increasing column (ask Robert Sheldon) and certainly not a datetime field. If a clustered index is not unique, MSSQL will add a (hidden) identity field to the index to make it unique, blowing up your table and fragmenting it at every insert.

So in your case, make it a HEAP table if indexing is not part of your database concept.
But I prefer a clustered index on an identity column and a non-clustered index on the datetime field, which is in my opinion the best of 2 worlds.

Subject: My reply
Posted by: Hugo Kornelis (view profile)
Posted on: Monday, June 24, 2013 at 3:51 AM
Message: @Wilfred: I don't suggest clustering on datetime instead of identity in all cases. I suggest deciding on a case by case basis.
In this case, the clustered index is on a datetime column that is the timestamp of the logging event. Due to the nature of a logging table, this is a monotonically increasing column, so you won't get page splits.
A nonunique clustered index does not add a hidden identity; it adds a hidden "uniquefier". This adds four bytes, but only to rows that are actual duplicates (so the first row with a given key value won't have a uniquefier; the second, third, etc. will all have the 4-byte overhead). These are added as a secondary column in the index and increasing, so you still get no page splits. I do agree that I should have used datetime2(7) instead of datetime in the table design to minimize the number of such uniquefiers.
I have not considered the heap option, and I don't know if partitioning is supported on heaps. Off the top of my head, I thought you can only partition on clustered index columns, but I didn't check this. If partitioning a heap is possible as well, then in this specific case, a heap might be a very good choice.

@opc.three: You do make a good point about being able to identify individual log records. My experience so far is that usually all log records for a (hopefully very short) time span are selected, sometimes with additional filters (hence the extra nonclustered indexes). But if you often have to return only one specific row, then you do have a case for adding an IDENTITY column.

Subject: typo
Posted by: Wilfred (not signed in)
Posted on: Monday, June 24, 2013 at 3:59 AM
Message: You're right, it should be a hidden uniqueidentifier instead of an identity. I was probably too focussed on this issue :)

Subject: Why not IDENTITY column?
Posted by: Jiggaboo (not signed in)
Posted on: Monday, June 24, 2013 at 5:58 AM
Message: I just checked my table. It contains IDENTITY column. Why it is bad?

Subject: Not always bad!
Posted by: Hugo Kornelis (view profile)
Posted on: Monday, June 24, 2013 at 7:09 AM
Message: @Jiggaboo: There are no absolute "good" and "bad" things in SQL Server. There are many cases where IDENTITY is an ideal solution to a problem - but you need to check if you have that problem before applying the solution.

For a table such as the one in this article, I say that an IDENTITY column is usually not required, because the table has none of the problems that IDENTITY solves.

See the previous comments for a lengthy discussion of pros and cons of IDENTITY columns, including a very good point brought up by opc.three that shows that, in some specific cases, IDENTITY can be useful for this kind of table. Let me know if you have specific questions after reading that.

And please, please, please do not take my words in this article to imply that you should never use IDENTITY. Religiously avoiding all IDENTITY columns is exactly as bad as routinely adding them to every table. IDENTITY columns are like antibiotics - great for patients that suffer a bacterial infection, but not to be applied routinely to all patients. (And no, this analogy does not imply that tables develop resistance to IDENTITY columns).

Subject: IDENTITY (again)
Posted by: thensley (view profile)
Posted on: Monday, June 24, 2013 at 2:36 PM
Message: I guess my impulse to use an identity column would be to have a field that indicates the actual order the rows were written in. If there is a lot of activity, the date/time itself might not give a good indication of the actual order of the inserts.

Having said that, I'm not 100% certain an identity column *really* gives you that either.

Subject: Order of insertion
Posted by: Hugo Kornelis (view profile)
Posted on: Monday, June 24, 2013 at 3:15 PM
Message: @Thensley:
As far as I know, the IDENTITY column will indeed give an accurate indication of the order the rows were inserted. (Unless you tamper with IDENTITY values, of course).
However, there is no guarantee that the order in which rows are inserted actually corresponds with the order in which requests were received. Or with the order in which transaction are committed.

Subject: Enterprise Edition
Posted by: MatthewMonroe (view profile)
Posted on: Monday, June 24, 2013 at 4:06 PM
Message: Very good presentation of the concept. I gave this a try, but it turns out you must have Enterprise Edition of Sql Server (I have 2008 R2 but the requirement appears to still be true for 2012). I did find an alternative on Simple-Talk: Partitioned views, discussed at https://www.simple-talk.com/sql/sql-tools/sql-server-partitioning-without-enterprise-edition/

Not as elegant as Partitioned Tables, and likely not as optimized, but it works.

Subject: Superb
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 25, 2013 at 8:06 AM
Message: I have been using partitioning for purging of Log tables, but the most irritating part was to convince the manager to go for partitioning, because we had to create a new table and an awful lot of scripts for migrating the data. It never occurred to me to use "SWITCH" (how silly of me, it was right there in front of my eyes).
Awesome article, and very beautifully explained.
Thanks a lot.
--
Regards,
Karthik

Subject: Excellent
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 25, 2013 at 1:43 PM
Message: So many good points Hugo, fantastic article. Storage maintenance is important, and I might experiment with this method. As far as identity column is concerned: Hugo has already made wonderful points in the responses. I encourage anybody with identity issues to read and understand his comments, and then ask. I think the main point is to understand WHY you are using an identity column in the first place. You have to be able to rationalize its use, as (pointed out) there are pros and cons and alternatives to consider. "convenience" should rarely (if ever) be your first reason. An identity column holds no real meaning in the context of the data which it represents, and should not be the "go to" clustered index in the majority of situations.

Subject: Good Job
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 25, 2013 at 11:05 PM
Message: great post!

Subject: Brute force Archive
Posted by: Henrik Staun Poulsen (not signed in)
Posted on: Wednesday, June 26, 2013 at 3:15 AM
Message: We do have Enterprise Edition, and we do use partitioning on some tables.
But:
With log tables, we often use a Brute Force Archive strategy; we archive 100% of the data :-)

Rename the table to OldTable.
Create a new empty table instead (often with new improvements, such as columns, partitioning)
Drop OldTable after a few months.

Subject: Brute force Archive - now that makes sense
Posted by: Keith Gresham (not signed in)
Posted on: Wednesday, June 26, 2013 at 10:21 AM
Message: This conversation quickly devolved into an argument over keys. Datetime is the proper choice because you need to partition by date if you want to keep a day's worth of log entries together in a single partition. However Henrik Staun Poulsen's post titled Brute force Archive is a better answer to this problem. It is much like what SQL Server does with its own log files: rename and start a new one. Simple is elegant.

Subject: Only available in Enterprise or Datacenter editions
Posted by: amchen (not signed in)
Posted on: Thursday, June 27, 2013 at 11:22 AM
Message: I was excited to learn of using this technique. However, it should be noted that Table and Index partitioning is only available in the Enterprise or Datacenter editions of SQL Server. Unfortunately, most of our customers are using the Standard edition.

Subject: Thanks!
Posted by: Hugo Kornelis (view profile)
Posted on: Thursday, June 27, 2013 at 3:02 PM
Message: Thanks, all, for your kind words and valid comments.

To those pointing out that partitioning is an Enterprise Edition feature - yes, that is true. For lower editions, this technique can not be used. The "brute force" idea described by Henrik Staun Poulsen is a valid alternative for them.

Subject: Nice article, Hugo
Posted by: Jeff Moden (view profile)
Posted on: Thursday, June 27, 2013 at 5:37 PM
Message: Excellent article. It takes a lot of work to write such a thing. Well done.

Shifting gears a bit, you mention the following...
-----------------------------------------------
But for large tables, this is not good. Even if the LogDate column is the clustered index key (which it should be – please don’t add an IDENTITY column to these tables!),
-----------------------------------------------
I'd like to recommend that you actually SHOULD add an Identity column to such a table because the best clustered indexes are also UNIQUE indexes which will prevent all of the related non-clustered indexes from being appended with a 8 byte uniqueifier rather than just the clustered index columns. As you well know, about the only place where a date-related column is unique is in a Calendar Table.

There are some huge advantages to adding the IDENTITY column to such a table. Rather than rewrite what has so eloquently been stated before, please see the following "movie" by none other than Kimberly Tripp.
http://technet.microsoft.com/en-us/sqlserver/gg508879.aspx

The end result of the movie is that a composite clustered index based on a date column and an identity column is incredibly beneficial to performance and conservation of resources. I also use the technique on certain otherwise non-unique non-clustered indexes just to make them unique for performance reasons. It's absolutely NOT a panacea insofar as non-clustered indexes go but it can work a miracle here and there.

--Jeff Moden

Subject: You CAN do paritioning in SQL Server Standard Edition
Posted by: Jeff Moden (view profile)
Posted on: Thursday, June 27, 2013 at 5:54 PM
Message: For those asking the question, you can, in fact, do a type of partitioning in the Standard Edition. The technique is called "Partitioned Views".

Make no doubt about it, it's a PITA to setup because you need to manually (or programmatically but you have to write the code to do so) create the underlying tables and the underlying indexes.

You also have the issue that if you want to write to the underlying tables through the view, you cannot use an IDENTITY column. You can, however, adopt a dual naming convention where you write to the "current month" (for example) table, which has an IDENTITY column, and read from a differently named partitioned view.

One HUGE advantage that Partitioned Views have over partitioned tables is that the underlying tables DON'T need to all live in the same database. The obvious advantages there are the greatly reduced backup requirements (you're only going to write to the audit tables once a month and so the backups only need be once per month) and that they move all but a couple/three of the "month" tables out of the main database, which greatly reduces the backup requirements of the main database especially when you consider that such audit/logging tables frequently make up the larger portion of the space used in a database over time.

The disadvantages of using Partitioned Views also include the fact that you have to copy and delete a month's worth (for example) of data rather than just jabbing a new value into a partitioning function but the extreme advantages of moving such a huge volume of entirely static data out of the main database may make it worth it all for you.

Subject: Thanks!
Posted by: Hugo Kornelis (view profile)
Posted on: Sunday, June 30, 2013 at 4:27 AM
Message: Thanks, Jeff, for your comments.

I will not address the IDENTITY debate again. A lot has been said in the previous comments, and I don't think repeating the same arguments will add any value. Only one thing - I thought this was mentioned by someone else in a comment, but I can't find it now so maybe I was wrong. And I know I failed to mention it before. But if you are going to partition by date, you *have* to have the datetime column as the leading column in the clustered index, so the best the IDENTITY can be is a tiebreaker between conflicting time stamps. Why not let SQL Server handle that itself with its uniquefier?

You are totally right that partitioned views can be a good alternative to partitioned tables. For this specific article, which dealt with volumes of 500 million inserts per day in the logging table alone, I think it's safe to assume Enterprise Edition. But if you have a similar problem, at smaller scale, on Standard Edition, partitioned views can be a very smart solution. Thanks for adding that!

Subject: Identity Response
Posted by: Jeff Moden (view profile)
Posted on: Friday, July 12, 2013 at 2:20 AM
Message: Agreed on the Date being the leading column... just like in the "movie" at the URL I posted. Have you ever watched it?
;-)

Subject: Great Post
Posted by: vinaykedu (view profile)
Posted on: Friday, July 19, 2013 at 4:00 AM
Message: very informative article.

 
