In this sample chapter from his book SQL Backup and Restore, Shawn McGehee discusses partial backups, how they're used, and how to avoid potential problems when restoring from a partial backup.
Partial backups are similar to file and filegroup backups in that they allow us to back up only a subset of the data files that comprise a database. However, whereas file and filegroup backups allow us to back up specific, individual files and filegroups, partial backups will make a backup copy of the primary filegroup and all read/write filegroups, omitting by default any filegroups designated as READONLY. This means that partial backups are only relevant for databases that contain read-only filegroups; otherwise a partial backup will capture exactly the same data and objects as an equivalent full database backup.
Partial backups are available in SQL Server 2005 and later, and were designed to reduce the backup footprint for large databases that contain a high proportion of read-only data. After all, why would we back up the same data each night when we know that it cannot have been changed, since it was in a read-only state? We wouldn't, or shouldn't, be backing up that segment of the data on a regular schedule.
In this article, we will discuss:
- Why and where partial backups may be applicable in a backup and restore scheme
- How to perform partial and differential partial backups
- How to restore a database that adopts a partial backup strategy
- Potential problems with partial backups and how to avoid them
Why Partial Backups?
Partial backups allow us to backup, on a regular schedule, only the read-write objects and data in our databases. Any data and objects stored in read-only filegroups will not, by default, be captured in a partial backup. We can achieve this effect by capturing separate file backups of each of the read-write filegroups. However, suppose we have a database that does not require support for point-in-time restores; transaction log backups are required when performing file-based backups, regardless, and so this can represent an unnecessarily-complex backup and restore process.
With partial backups, we only need to take transaction log backups if they are needed for point-in-time recovery, and so they are ideal for use with SIMPLE recovery model databases, as a means to trim database backups down to a more manageable size, without introducing the added administrative overhead of log backups. Note that partial backups are valid for databases operating in any recovery model; it's just that they were specifically designed to simplify back up for very large, SIMPLE recovery model databases that contain a large portion of read only data.
So, in what situations might we want to adopt partial backups on a database in our infrastructure? Let's say we have a SQL Server-backed application for a community center and its public classes. The database holds student information, class information, grades, payment data and other data about the courses. At the end of each quarter, one set of courses completes, and a new set begins. Once all of the information for a current quarter's courses is entered into the system, the data is only lightly manipulated throughout the course lifetime. The instructor may update grades and attendance a few times per week, but the database is not highly active during the day. Once the current set of courses completes, at quarter's end, the information is archived, and kept for historical and auditing purposes, but will not be subject to any further changes.
This sort of database is a good candidate for partial backups. The "live" course data will be stored in a read-write filegroup. Every three months, this data can be appended to a set of archive tables for future reporting and auditing, stored in a read-only filegroup (this filegroup would be switched temporarily to read-write in order to run the archive process). We can perform this archiving in a traditional data-appending manner by moving all of the data from the live tables to the archive tables, or we could streamline this process via use of partitioning functions.
Once the archive process is complete and the new course data has been imported, we can take a full backup of the whole database and store it in a safe location. From then on, we can adopt a schedule of, say, weekly partial backups interspersed with daily differential partial backups. This way we are not wasting any space or time backing up the read-only data.
Also, it may well be acceptable to operate this database in SIMPLE recovery model, since we know that once the initial course data is loaded changes to the live course data are infrequent, and so an exposure to potential data loss of one day may be tolerable.
In our example, taking a full backup only once every three months may seem a little too infrequent. Instead, we might consider performing a monthly full back up, to provide a little extra insurance, and simplify the restore process.
Performing Partial Database Backups
In most DBAs' workplaces, partial backups will be the least-used of all the backup types so, rather than walk through two separate examples, one for native SQL Server backup commands and one for SQL Backup, we'll only work through an example of partial backup and restore using native T-SQL commands, since partial backups are not supported in either SSMS or in the Maintenance Plan wizard. The equivalent commands for SQL Backup (the tool my organization uses for backups) will be presented, but outside the context of a full worked example.
Preparing for Partial Backups
Listing 1 shows the script to create a DatabaseForPartialBackups database with multiple data files. The primary data file will hold our read-write data, and a secondary data file, in a filegroup called Archive, will hold our read-only data. Having created the database, we immediately alter it use the SIMPLE recovery model.
USE [master] GO CREATE DATABASE [DatabaseForPartialBackups] ON PRIMARY ( NAME = N'DatabaseForPartialBackups' , FILENAME = N'C:\SQLData\DatabaseForPartialBackups.mdf' , SIZE = 10240KB , FILEGROWTH = 10240KB ), FILEGROUP [Archive] ( NAME = N'DatabaseForPartialBackups_ReadOnly' , FILENAME = N'C:\SQLData\DatabaseForPartialBackups_ReadOnly.ndf' , SIZE = 10240KB , FILEGROWTH = 10240KB ) LOG ON ( NAME = N'DatabaseForPartialBackups_log' , FILENAME = N'C:\SQLData\DatabaseForPartialBackups_log.ldf' , SIZE = 10240KB , FILEGROWTH = 10240KB ) GO ALTER DATABASE [DatabaseForPartialBackups] SET RECOVERY SIMPLE GO
The script is fairly straightforward. The Archive filegroup will eventually be set to read-only, but first we are going to need to create some tables in this filegroup and populate one of them with data, as shown in Listing 2.
USE [DatabaseForPartialBackups] GO CREATE TABLE dbo.MainData ( ID INT NOT NULL IDENTITY(1, 1) , Message NVARCHAR(50) NOT NULL ) ON [PRIMARY] GO CREATE TABLE dbo.ArchiveData ( ID INT NOT NULL , Message NVARCHAR(50) NOT NULL ) ON [Archive] GO INSERT INTO dbo.MainData VALUES ( 'Data for initial database load: Data 1' ) INSERT INTO dbo.MainData VALUES ( 'Data for initial database load: Data 2' ) INSERT INTO dbo.MainData VALUES ( 'Data for initial database load: Data 3' ) GO
The final preparatory step for our example is to simulate an archiving process, copying data from the MainData table into the ArchiveData table, setting the Archive filegroup as read-only, and then deleting the archived data from MainData, and inserting the next set of "live" data.
Before running Listing 3, make sure there are no other query windows connected to the DatabaseForPartialBackups database. If there are, the conversion of the secondary file group to READONLY will fail, as we need to have exclusive access on the database before we can change filegroup states.
USE [DatabaseForPartialBackups] GO INSERT INTO dbo.ArchiveData SELECT ID , Message FROM MainData GO ALTER DATABASE [DatabaseForPartialBackups] MODIFY FILEGROUP [Archive] READONLY GO DELETE FROM dbo.MainData GO INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 4' ) INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 5' ) INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 6' ) GO
Finally, before we take our first partial backup, we want to capture one backup copy of the whole database, including the read-only data, as the basis for any subsequent restore operations. We can take a partial before taking a full database backup, but we do want to make sure we have a solid restore point for the database, before starting our partial backup routines. Therefore, Listing 4 takes a full database backup of our DatabaseForPartialBackups database. Having done so, it also inserts some more data into MainData, so that we have fresh data to capture in our subsequent partial backup.
USE [master] GO BACKUP DATABASE DatabaseForPartialBackups TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak' GO INSERT INTO DatabaseForPartialBackups.dbo.MainData VALUES ( 'Data for third database load: Data 7' ) INSERT INTO DatabaseForPartialBackups.dbo.MainData VALUES ( 'Data for third database load: Data 8' ) INSERT INTO DatabaseForPartialBackups.dbo.MainData VALUES ( 'Data for third database load: Data 9' ) GO
The output from the full database backup is shown in Figure 1. Notice that, as expected, it processes both of our data files, plus the log file.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1. Processed 16 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_ReadOnly' on file 1. Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1. BACKUP DATABASE successfully processed 194 pages in 0.043 seconds (35.110 MB/sec).
Partial Database Backup using T-SQL
We are now ready to perform our first partial database backup, which will capture the data inserted in our third data load, as shown in Listing 5.
BACKUP DATABASE DatabaseForPartialBackups READ_WRITE_FILEGROUPS TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak' GO
The only difference between this backup command and the full database backup command shown in Listing 4 is the addition of the READ_WRITE_FILEGROUPS option. This option lets SQL Server know that the command is a partial backup and to only process the read-write filegroups contained in the database.
The output should be similar to that shown in Figure 2. Notice that this time only the primary data file and the log file are processed. This is exactly what we expected to see: since we are not processing any of the read-only data, we shouldn't see that data file being accessed in the second backup command.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1. Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1. BACKUP DATABASE...FILE=<name> successfully processed 178 pages in 0.039 seconds (35.481 MB/sec).
Differential Partial Backup using T-SQL
Just as we can have differential database backups, which refer to a base full database backup, so we can take differential partial database backups that refer to a base partial database backup, and will capture only the data that changed in the read-write data files, since the base partial backup was taken.
Before we run a differential partial backup, we need some fresh data to process.
USE [DatabaseForPartialBackups] GO INSERT INTO MainData VALUES ( 'Data for fourth database load: Data 10' ) INSERT INTO MainData VALUES ( 'Data for fourth database load: Data 11' ) INSERT INTO MainData VALUES ( 'Data for fourth database load: Data 12' ) GO
Listing 7 shows the script to run our partial differential backup. The one significant difference is the inclusion of the WITH DIFFERENTIAL option, which converts the command from a full partial to a differential partial backup.
USE [master] GO BACKUP DATABASE [DatabaseForPartialBackups] READ_WRITE_FILEGROUPS TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak' WITH DIFFERENTIAL GO
Once this command is complete, go ahead and check the output of the command in the messages tab to make sure only the proper data files were processed. We are done taking partial backups for now and can now move on to our restore examples.
Performing Partial Database Restores
We will be performing two restore examples; one restoring the DatabaseForPartialBackups database to the state in which it existed after the third data load, using the full database, and full partial backup files, and one restoring the database to its state after the fourth data load, using the full database, full partial and differential partial backup files.
Restoring a Full Partial Backup
Listing 8 show the two simple steps in our restore process. The first step restores our full database backup file, which will restore the data and objects in both our read-write and read-only filegroups. In this step, we include the NORECOVERY option, so that SQL Server leaves the database in a state where we can apply more files. This is important so that we don't wind up with a database that is online and usable before we apply the partial backup.
The second step restores our full partial backup file. This will overwrite the read-write files in the existing database with the data from the backup file, which will contain all the data we inserted into the primary data file, up to and including the third data load. We specify that this restore operation be completed with RECOVERY.
USE [master] GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak' WITH NORECOVERY GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak' WITH RECOVERY GO
The output from running this script is shown in Figure 3. We should see all files being processed in the first command, and only the read-write and transaction log file being modified in the second command.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1. Processed 16 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_ReadOnly' on file 1. Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1. RESTORE DATABASE successfully processed 194 pages in 0.760 seconds (1.986 MB/sec). Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1. Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1. RESTORE DATABASE ... FILE=<name> successfully processed 178 pages in 0.124 seconds (11.159 MB/sec).
Everything looks good, and exactly as expected, but let's put on our "Paranoid DBA" hat once more and check that the restored database contains the right data.
USE [DatabaseForPartialBackups] GO SELECT ID , Message FROM dbo.MainData SELECT ID , Message FROM dbo.ArchiveData
Hopefully, we'll see three rows of data in the ArchiveData table and six rows of data in the read-write table, MainData, as confirmed in Figure 4.
Restoring a Differential Partial Backup
Our restore operation this time is very similar, except that we'll need to process all three of our backup files, to get the database back to its state after the final data load. You may be wondering why it's necessary to process the full partial backup in this case, rather than just the full backup followed by the differential partial backup. In fact, the full database backup cannot serve as the base for the differential partial backup; only a full partial backup can serve as the base for a differential partial backup, just as only a full database backup can serve as a base for a differential database backup. Each differential partial backup holds all the changes since the base partial backup, so if we had a series of differential partial backups, we would only need to restore the latest one in the series.
Listing 10 shows the script; we restore the full database and full partial backups, leaving the database in a restoring state, and then apply the differential partial backup and recover the database.
USE [master] GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak' WITH NORECOVERY GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak' WITH NORECOVERY GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak' WITH RECOVERY GO
Once again, check the output from the script to make sure everything looks as it should, and then rerun Listing 9 to verify that there are now three more rows in the MainData table, for a total of 9 rows, and still only be three rows in the ArchiveData table.
Special Case Partial Backup Restore
Here, we'll take a quick look at a special type of restore operation that we might term a "partial online piecemeal restore", which will bring the database online by restoring only the read-only filegroup in the database (requires Enterprise edition), as shown in Listing 11. This type of restore can be done with both full partial backups as well as full file backups, provided the full file backup contains the primary filegroup, with the database system information. This is useful if we need to recover a specific table that exists in the read-write filegroups, or want to view the contents of the backup without restoring the entire database.
-- restore the read-write filegroups RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak' WITH RECOVERY, PARTIAL GO
Now, we should have a database that is online and ready to use, but with only the read-write filegroup accessible, which we can verify with a few simple queries, shown in Listing 12.
USE [DatabaseForPartialBackups] GO SELECT ID , Message FROM MainData GO SELECT ID , Message FROM ArchiveData GO
The script attempts to query both tables and the output is shown in Figure 5.
We can see that we did pull 6 rows from the MainData table, but when we attempted to pull data from the ArchiveData table, we received an error, because that filegroup was not part of the file we used in our restore operation. We can see the table exists and even see its structure, if so inclined, since all of that information is stored in the system data, which was restored with the primary filegroup.
SQL Backup Partial Backup and Restore
In this section we will, without restarting the whole example from scratch, take a look at the equivalent full partial and differential partial backup commands in SQL Backup, as shown in Listing 13.
USE master GO -- full partial backup with SQL Backup EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForPartialBackups] READ_WRITE_FILEGROUPS TO DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Full.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"' -- differential partial backup with SQL Backup EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForPartialBackups] READ_WRITE_FILEGROUPS TO DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Diff.sqb'' WITH DIFFERENTIAL, DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"'
Listing 14 shows the equivalent restore commands for partial backups; again, they are very similar to what we have seen before in other restore scripts. We restore the last full database backup leaving the database ready to process more files. This will restore all of the read-only data, and leave the database in a restoring state, ready to apply the partial backup data. We then apply the full partial and differential partial backups, and recover the database.
-- full database backup restore EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.sqb'' WITH NORECOVERY"' -- full partial backup restore EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Full.sqb'' WITH NORECOVERY"' -- differential partial backup restore EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Diff.sqb'' WITH RECOVERY"'
Possible issues with Partial Backup and Restore
The biggest issue when restoring a database using partial backups is the storage and management of the read-only data. This data is captured in a full database backup, directly after the data import, and then not backed up again. As such, it is easy for its existence to slip from a DBA's mind. However, just because this portion of the database is backed up less frequently, it does not mean it is less important; it is still an integral piece of the recovery strategy, so manage each file in your backup strategy carefully and make sure the full database backup doesn't get lost in the archival jungle.
When performing one of the full database backups on a database where the backup strategy includes subsequent partial backups, it's a good idea to perform a checksum on that full database backup. This can, and most likely will, slow down the backup speed but if you can take the speed hit, it's nice to have the reassurance that this full backup is valid, since they will be captured infrequently. Of course, performing a test restore with the newly created full backup file is even better!
If you do find that your full database backup has not been stored properly or that the file is somehow corrupted, what can you do? Well, not a whole lot. Hopefully you have multiple copies stored in different locations for just this type of situation. Keeping a copy on long term storage, as well as locally on a robust disk, are other good ideas when dealing with data that is not backed up on a regular basis. Be diligent when you are managing your backup files. Remember that your data is your job!
Partial Backups and Restores in the SLA
Like file backup and restore, partial backup and restore will most likely form a very small part of your overall recoverability strategy; partial backups are a very useful and time saving tool in certain instances, but they will not likely be the "go to" backup type in most situations, largely because most databases simply don't contain a large portion of read-only data.
However, if a strategy involving partial backups seems a good fit for a database then, in the SLA for that database, you'll need to consider such issues as:
- Frequency of refreshing the full database backup
- Frequency of full partial and differential partial backups
- Are transaction log backups still required?
As noted earlier, partial backups are designed with SIMPLE recovery model databases in mind. When applied in this way, it means that we can only restore to the last backup that was taken, most likely a full partial or a differential partial backup, and so we do stand to lose some data modifications in the case of, say, a midday failure. That is something you have to weigh against your database restore needs to decide if this type of backup will work for you. Like every other type of backup, weigh up the pros and cons to see which type, or combination of types, is right for your database.
Just because this type of backup was designed mainly for SIMPLE recovery mode databases, it doesn't mean that we can only use it in such cases. For a FULL recovery model database that has a much smaller window of data loss acceptability, such as an hour, but does contain a large section of read-only data, this backup type can still work to our advantage. We would, however, need to take transaction log backups in addition to the partial full and partial differentials. This will add a little more complexity to the backup and restore processes, in the event of emergency, but will enable point-in-time restore.
Forcing Failures for Fun
Much as I hate to end on a note of failure, that is sometimes the lot of the DBA; deal with failures as and when they occur, learn what to look out for, and enforce measures to ensure the problem does not happen again. With that in mind, let's walk through a doomed partial backup scheme. Take a look at the script in Listing 10.15 and, one last time, try to work out what the problem is before running it.
USE [master] GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak' WITH NORECOVERY GO RESTORE DATABASE [DatabaseForPartialBackups] FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak' WITH RECOVERY GO
Do you spot the mistake? Figure 6 shows the resulting SQL Server error messages.
The first error we get in the execution of our script is the error "File DatabaseForPartialBackups is not in the correct state to have this differential backup applied to it." This is telling us that the database is not prepared to process our second restore command, using the differential partial backup file.
The reason is that we have forgotten to process our partial full backup file. Since the partial full file, not the full database file, acts as the base for the partial differential we can't process the partial differential without it. This is why our database is not in the correct state to process that differential backup file.
You should now be familiar with how to perform both partial full and partial differential backups and be comfortable restoring this type of backup file.
Backing up databases, and performing restores, should be something all DBAs do on a very regular basis. This skill is paramount to the DBA and we should keep working on it until the subject matter becomes second nature,so that when and if disaster strikes a database, in whatever form, your carefully documented and tested, restore strategy will allow you to get that database back online with an acceptable level of data loss, and minimal downtime.
This chapter appears as part of Shawn's book SQL Server Backup and Restore. If you'd like to be notified when the free ebook first becomes available, you can sign up here.