SQL Server 2005 Snapshots

Find out about Database Snapshots, new to SQL 2005 Enterprise Edition, which provide a read-only, "virtual" copy of a database, at a given point in time.

A DBA’s view on SQL 2005 Database Snapshots

Database Snapshots are a new Enterprise Edition-only tool, introduced in SQL 2005, that provide a read-only, “virtual” copy of a database, at a given point in time. This article will discuss how, as a DBA, you might use snapshots in a production environment and why you might not consider them suitable for some of the commonly-stated use cases. These include:

  • Protecting your system from user or administrator error
  • Offloading reporting
  • Maintaining historical data

However, I’ll also highlight two other areas where I believe snapshots can be a really effective tool, namely:

  • System upgrades
  • Materialising data on standby servers

Creating Snapshots

One of the most infuriating things I have found is that you can not use SQL Server Management Studio (SSMS) to create snapshots. I’m a big fan of scripts but I also like the convenience of using SSMS, and most junior DBAs tend to use SSMS over scripts, so I feel that this is quite a significant oversight. In any event, to create your first snapshot you will need to run something similar to the following T-SQL code:

You will notice that the code is very similar to a script you would use to create a database but you should note that the logical device name, in this case andrew_test_datafile, has to match the logical name defined on the source database.

In this example, I’ve created a snapshot of the database andrew_test called andrew_test_snapshot_250206_1800. Note the use of the <database name>_<date>_<Time> naming convention. It’s wise to name your snapshots clearly otherwise you may struggle unnecessarily when trying to identify when the snapshot was created. Note also that I gave the snapshot a meaningful file extension: .snap.

From an organisational and administrative point of view I recommend that you store your snapshots in a separate directory from your database files. It’s not such a big deal when you’re working with a small number of files, but if you have hundreds of other files in the same directory you soon start wishing you had files of one type in one directory. Additionally, this makes exclusions or filters even easier as you can now do them at a folder level. If you plan to use snapshots for reporting I would seriously consider putting the snap files on their own drive. This way, once they start populating, you will be keep some of the IO away from the drive that is hosting the database files, and therefore reduce disk contention.

How Snaphots Work

It’s essential that a DBA understands at least something of what actually happens “behind-the-scenes” when a snapshot is created, and of how snapshots actually work. There is a great explanation of how snapshots work on MSDN2, but I will summarize some of the particularly relevant points here.

A database snapshot is a read-only static view of a database. When you create a database snapshot, as described in the previous section, the following process is invoked.

  1. An empty file, called a sparse file, is created for each source database data file.
  2. The database is checkpointed (all dirty pages in the buffer cache are written to disk).
  3. If uncommitted transactions are in progress in the database, the snapshot will reflect the state of the database before these transactions started in other words, in the context of the snapshot, these transactions will be rolled back – but transactions in the database itself are not.
  4. The snapshot is brought online and is ready for use.

The empty file that is created is an NTFS sparse file. A NTFS sparse file is a file that contains no user data and has not yet been allocated space for user data. In Windows Explorer, such files are shown with a size matching that of the source database. However, when you go into the file properties you will see that the size on disk is 0KB or a multiple of 64KB (representing 1-8 data pages) if some data has been transferred:

137-image001.jpg

NOTE: This is where the meaningful file extension (.snap) comes into its own. If I’m navigating through Explorer and come across a bunch of 10GB .snap files I will hopefully remember that, while the size is showing 10GB in Explorer, .snap files are sparse files so I need to check what space is really being used. Use of a standard extension is also very important when you are, for example, setting up anti-virus scan exclusions, or backup exclusions, across a suite of servers.

Once the snapshot is online, a copy-on-write mechanism kicks in. Just before a page is updated for the first time in the source database, since creation of the snapshot, it is copied to the snapshot datafile. Now any queries to the snapshot that access the changed pages will get those changed pages from the snapshot. If the snapshot is queried for data on any page that has not been modified in the source database, since the time the snapshot was created, then the request is simply redirected to the source database files. In this way, the snapshot presents a consistent view of the data at the point in time that the snapshot was created.

The NTFS sparse files are at the heart of snapshots because only a small amount of disk space is initially required and creation is extremely fast, which means snapshots are generally created in a matter of seconds. However we must consider the disk space and ensure we have enough available for the files to grow without the drives running out of space. I discuss this in the next section.

Snapshot Pros and Cons

Following is a list of what I consider the main pluses and minuses associated with snapshots:

Pros:

  1. They provide a convenient, read-only point-in-time copy of your data.
  2. When you query a snapshot, you will not experience blocking due to update/insert operations that you might have to contend with when querying the source database (assuming snapshot isolation is not in use).
  3. Initially the snapshot data files are small and are very quick to create. They should only become large if your database is subject to frequent modifications.

Cons:

  1. You can not backup a snapshot so if you have to restore your source database your snapshots are lost.
  2. Addressing database index fragmentation negates the benefit of the files being small if the sparse files are kept for a period of time.
  3. Where data pages have not changed you will be accessing the source database file, which may cause contention at the file level since both the source database and the snapshot will be accessing the same MDF.
  4. Every update/insert transaction on the source server potentially bears the overhead of the page copy operation to maintain the snapshot(s).
  5. Because of the reliance on the source database, the snapshot has to exist on the same server as that source database.
  6. You cannot grant a new user access to the data in a snapshot. Permissions are inherited form the source database as it existed at the time of snapshot creation and subsequent changes to security in the source database do not filter down to the snapshots.

In the following section, I’ll review these pros and cons in the specific context of a given usage scenario, and also mention other potential issues, specific to each case. However, I’ll say upfront that the real snake-in-the grass here is the effect on snapshot files of defragmenting indexes. One of the most attractive benefits of using snapshots is that, because they use NTFS sparse files, they will use very little space and, unless you have a highly active database, should only ever grow to a fraction of the size of your database.

However, a DBA has to maintain indexes and this involves defragmenting them from time to time. Now defragmenting an index does not change the actual data but it does move it to different pages in an attempt to make it contiguous. Since the data is on the move any snapshot(s) have to take a copy of the page so as soon as you defragment your indexes for the first time all your snapshots are going to grow to be pretty close in size to your source database. Since it’s highly likely that we will be defragmenting our indexes, this virtually wipes out the benefit of sparse files.

Uses for Snapshots: Winners and Losers

When reviewing database snapshots I took a pessimistic approach and looked for what they might break, what might break them and, when considering other SQL technologies, where are they going to add value.

Losers

In this section I’ll review three of the commonly stated use cases for snapshots, namely to:

  • Offload Reporting
  • Protect against user/DBA error
  • Maintain historical data

In each case, I’ll summarize the pros and cons of a snapshot solution and indicate the main reasons why, generally, I would not consider them as a solution in these scenarios.

One of the frustrating things about snapshots, in my opinion, is their lack of availability in editions other than Enterprise. It seems to me that snapshots are most likely to be viable, and add value, in low throughput systems, which tend to have continuous spare capacity. Use of snapshots to offload reporting or protect from user error become much more viable in such systems, because the additional overhead of snapshots, in terms of page copying and additional query load, is easily borne. However, these same systems are the ones most likely to be using Standard edition, and therefore won’t have access to snapshots in the first place!

Enterprise edition servers often run high IO databases, with resources that may be further constrained in certain time windows, and so may find unacceptable the additional overhead of having snapshots on the Server. So, ironically, the installations that have access to snapshots are those that would be most likely to look towards other solutions, such as log shipping, for reporting and protection from user or administrator error.

Offloading Reporting

Below are what I consider the main pros and cons of using snapshots as a means of offloading reporting:

Pros:

  1. You have a read-only point-in-time copy of your data.
  2. You will not be contending with blocking due to update/insert operations (let’s assume snapshot isolation is not in use).
  3. Initially the snapshot data files are small.

Cons:

  1. You cannot grant a new user access to the data in a snapshot.
  2. You can not backup the snapshots so if you have to restore your source database your reporting snapshots are lost.
  3. Where data pages have not changed you will be accessing the source database file which may cause contention at the file level since both the source database and the snapshot will be accessing the same MDF.
  4. Full text indexes are not available on snapshots so if you require full text searching for your reporting then snapshots are not an option.

Additionally, you should bear in mind that snapshots have to be on the same server as the source database: you do not have a physical dedicated, reporting server. Therefore, physical resources on the database server, such as CPU and memory, still have to be shared.

One other thing to note is that NTFS sparse files grow in 64kb increments and these increments are likely to occur at different times. As such, it’s highly likely that a populated sparse file will not be contiguous at the NTFS level. The effect of this on reporting could lead to significant strain on your disks.

Ultimately, the main thing that would stop me from using snapshots for report offloading is the likelihood of increased physical resource contention.

Maintaining Historical Data

Below are what I consider the pros and cons.

Pros:

  1. Potentially, you can maintain a significant amount of history with little disk space usage.

Cons:

  1. If the drive hosting the snapshot runs out of space causing an update to fail, the snapshot is marked suspect and can not be recovered.
  2. Deframenting your indexes negates the benefits associated with sparse files (if they are kept for a period of time).
  3. Every update/insert transaction on the source server potentially bears the overhead of the page copy operation to maintain the snapshot(s).
  4. You can not backup your snapshots.

Although maintaining historical data is listed as a typical use I would never use snapshots for this simply because I can not back them up.

Protecting from User or Administrator Error

In theory, you could use a snapshot to restore a database to a point before a user or administrator error occurred simply by issuing:

Below are what I consider the pros and cons.

Pros:

  1. The snapshot process is very quick.
  2. The data and code is available immediately for restoration.

Cons:

  1. The snapshot is at a point in time so is unlikely to be suitable for data recovery.
  2. If you restore a snapshot you can’t then restore any transaction logs so you potentially lose a large amount of data.
  3. When you restore a snapshot all other snapshots of the database have to be deleted which would be a problem if you are maintaining reporting snapshots.
  4. If your administrative error involved damage to a physical file (say a drive was accidentally wiped) the snapshot will not help you.

To get any real value from snapshots in this area, you would have to take several snapshots throughout the day. I feel that the benefits of snapshots in this scenario will be far outweighed by the overhead of maintaining multiple snapshots, especially when considering other available options such as log shipping or even restoring from backup.

It goes without saying (but here it is anyway) that snapshots should not be considered as part of or a replacement for a proper backup. To use them in this way would be disastrous when you consider their limitations, such as their dependency on the source database.

Winners

In this section, I move on to the two scenarios where I do believe that snapshots can provide a real benefit for enterprise systems:

  • System upgrades
  • Materialising data on standby servers

Performing System Upgrades

The scenario that I consider a real winner for snapshots has to be when performing a system upgrade. Typically a system upgrade involves a release of code that changes the underlying schema and or data. It may also involve updating related application code. When performing such an upgrade the system administrator will ensure that full backups are taken of everything that will change as part of the upgrade, including the database. This process can be time consuming and requires additional space to store the backups, whilst the upgrade is in progress.

By creating a snapshot, the database “backup” time is reduced to seconds and you only require the amount of space required might be very small, assuming only a small amount of data is changing.

Once the system upgrade is complete, and after performing system checks, if you should you find yourself in the unfortunate position where you must restore to the point prior to the upgrade the recovery is greatly speeded and simplified. To restore your database to the point prior to the upgrade you would issue the RESTORE T-SQL command shown previously. The restore time is likely to be less than a minute, depending on how much data changed, as opposed to potentially hours when restoring a large database.

I have to point out though that there will still be some system upgrades where I would want a full backup prior to starting work. Upgrading the operating system is such an example – but with most upgrades a snapshot will suffice. I will also state that I am assuming the last full backup and transaction logs up until the upgrade are available should the worst happen when using the snapshot approach.

Materialising Data on your Standby Servers

The other winner for me is using snapshots to “materialise” (make available for querying/reporting) data on your standby servers. Due to business requirements, or to avoid connections interfering with log shipping, you may have chosen to keep your log shipped database in NORECOVERY mode. This means you cannot read from the database. By creating a snapshot of your log shipped database, you can read the data and effectively use the standby server to offload reporting requirements.

More importantly, this technique also works with mirrored databases and since a mirror destination can only ever be in NORECOVERY mode, it’s an excellent way to access the mirrored data for the purpose of reporting. However there is always a trade off and the key point to keep in mind when using snapshots on a mirrored database is that, in synchronous mode, transactions have to commit on both servers before SQL can mark the transaction as complete. By introducing the snapshot we have introduced a third step before a transaction will commit. Why? Well below are the steps you now have to take if you have a snaphot:

  1. Write transaction on server A source database.
  2. Write transaction on server B destination database.
  3. Copy page on server B destination database to server B snapshot.

Without the snapshot you only have to do steps one and two.

Of course, all this assumes that snapshots work smoothly and seamlessly with high availability solutions such as log shipping and database mirroring. So, what might you have to consider when using snapshots with these two HA technologies?

Snapshots and Log Shipping

  • There are no restrictions on creating a snapshot on a log shipped source (or destination) and log shipping is not disrupted when creating a snapshot.
  • You are not prevented from restoring the snapshot when log shipping is present.
  • When you restore the snapshot you must remember that the process is similar to a database restore so it is not a logged operation and as such log shipping will then fail.

Snapshots and Database Mirroring

  • There are no restrictions on creating a snapshot on a mirrored source (or destination) and mirroring is not disrupted when creating a snapshot.
  • You are prevented from restoring a snapshot whilst mirroring is active. You must first stop mirroring and once you have restored you must then restore the destination database and enable mirroring again.
  • Although you can snapshot the source and destination databases, when you issue a snapshot restore the recovery option is ignored. This means you can not restore a snapshot to a recovering state so your destination mirror database must be restored from a full backup as mirroring requires the destination to be in a recovering state.

Conclusion

Despite how it may seem I actually welcome snapshots with open arms to my DBA tool box. I do believe it’s a pity that this is an Enterprise edition only feature, but I do see it as a tool I will make good use of when doing system upgrades on my Enterprise edition servers. Snapshots are in their infancy but I feel that with a little more work they will become a widely-used tool.

Tags: , , , , , ,

  • 167768 views

  • Rate
    [Total: 1    Average: 3/5]
  • Anonymous

    Feedback
    Hi, a fantastic article that has helped me plan our Snapshot implementation. A lot of real worlds scenarios given here over and abouve the info found in BOL.
    Cheers!
    Kyle – DBA
    U.K.

  • Anonymous

    Thanks
    This really helped clarify some of the pitfalls and advantages of snapshots.

  • Anonymous

    Great analysis
    Thanks for thorough analysis. It’s so neatly written yet informative.

    Ji
    Atlanta U.S

  • Anonymous

    great Doc
    Everyone who is implementing snapshots for reporting purpose must read this doc.

  • Anonymous

    Wonderful Article
    I was looking for an enterprise solution to reduce some of the stresses on couple of my production databases having red this articles I know how to go about it. I have forwarded the links to couple of my mates this is MUST read article keep up the good work.

    Cheers Mate.

  • Anonymous

    Very Good Article!
    This has been very helpful in my analysis and decision to use or not use snapshots. The information was broken down in an easy-to-read order. Really like the Pros and Cons!

    Best of other articles on this subject!

  • Anonymous

    I need explanation of two points
    1. “Initially the snapshot data files are small and are very quick to create. They should only become large if your database is subject to frequent modifications”

    I just needed to know how come a snapshot will grow on-the-fly when source database get modified?
    I think that snapshot will stay of the same size no matter how big the database file become unles we are taking snapshot again of bigger database.

    Snapshot DB file will be of exactly the same size (at least show in explorer window) as of source database source file.

    2. “If the snapshot is queried for data on any page that has not been modified in the source database, since the time the snapshot was created, then the request is simply redirected to the source database files”

    You mean to say that Snapshot has a “Pointer-type-link” with the database like an anchor?
    Are the Snapshot not independant of DB once they are created?

    Abid Malik

  • Anonymous

    Explenation
    1. Check your file properties. You will see two sizes on you property sheet. One size is the size you see in the explorer windows, the second size, is the size on disk. Similar to the size on disk in compressed files. So the file is really only a few kb in size when created, and will grow by every change in the source database.

    2. Yes, it will grow only by changes, so the pages from the source are copied to the snapshot before the change in the source take place. The snapshot will only hold copies of the original pages, that are changed in the source.

  • Anonymous

    Backing up a snapshot of a log shipped destination
    Your article is very nicely written with great content!

    We’re trying to run backups off of a log shipped destination instead of the primary (to off-load the backup activity from the primary server and perform reporting on the secondary).

    Like you said, though, it’s unfortunately that snapshots are an Enterprise-edition only feature since we only use Standard edition, so we can’t backup the destination from a snapshot.

    Log-shipping was added to SQL Server 2005 Standard, but there apparently is no way to backup the data the way we want. Since logs are truncated on the primary during the log shipment, and there is no way to backup the destination (backup won’t work against a database in standby mode like log shipping requires), we see no way to make a tape backup of the log-shipped destination database in Standard edition.

    Just thought I’d point this out in case anyone finds it helpful. If someone finds a way to do it, please let me know at emiller at(@) thecreation.com.

    Thanks!

    Eric

  • Sri

    Excellent Article on snapshot
    We are using the disk snapshot so far and this is same what we do there. Explaination is fantastic!!

  • Suresh

    Security Permission Required to create snapshot
    We need to create a snapshot every night. SSIS or batch process will be used to perform this task. What is the minimum permission/role required to create database snapshot?
    I believe you should have dbcreator role as least permission set to perform this. How safe it is to grant a account with dbcreator?

  • Anonymous

    Cannot restore log on a standby database that has a snapshot
    It is a great article and I am eager to test out creating a snapshot on a log-shipped standby server. However, after creating the snapshot database, the log restore to the standby database failed with the error “The operation cannot be performed on a database with database snapshots or active DBCC replicas.” Any advice?

  • Anonymous

    Cannot restore log on a standby database that has a snapshot
    It is a great article and I am eager to test out creating a snapshot on a log-shipped standby server. However, after creating the snapshot database, the log restore to the standby database failed with the error “The operation cannot be performed on a database with database snapshots or active DBCC replicas.” Any advice?

  • Anonymous

    Snapshots as reporting servers
    Great analysis. Thanks!

    Regarding the use of a snapshot as a reporting tool, I agree that its uses are limited due to resource contention. To alleviate this problem, take snapshots of a mirrored database. With a bit of creativity, a script can be created that will generate a new snap periodically (e.g.: every hour) and will drop all previous snaps that are currently unused.

  • Anonymous

    EMC and Storage Vendors have a solution
    We are using EMC replication Manager to create HQ-based clones to materialize (to use your words) our clones to production systems. We can do this across all versions of SQL, and it gets us copies that are read/writable, able to surface on a different server, and work great for DR scenarios also. They are also application consistent and logs can be replayed against them!

  • Anonymous

    typo
    oops.. I meant HW (hardware) based clones in the above (as opposed to DB snapshots within SQL).

  • Nikhil

    Snapshot size
    I was perplexed when i saw the size of database snapshot same as database.
    Thanks for reminding me that it was a sparse file.

  • ALZDBA

    snapshot access only
    Nice article.
    Are there any other means than a db-trigger challenging a proprietary db-role membership to restrict access so users can only use the data via the snapshot database and not directly on the source db ?
    We don’t want the reporting users to have any chances of locking data and hinder ongoing production.

  • DBARohit

    Explanation
    Very well written article….

    Lets say I’m working on some dev task and I might need to restore same backup again n again…. I assume this can help… but as per BOL…. it says
    “When a page getting updated on the source database is pushed to a snapshot, if the snapshot runs out of disk space or encounters some other error, the snapshot becomes suspect and must be deleted.”

    Does this mean that I cannot revert back to the time when I created my snapshot???