Click here to monitor SSC
  • Av rating:
  • Total votes: 64
  • Total comments: 4
Hugo Shebbeare

Disaster Recovery for SQL Server Databases

28 August 2009


High-Availability depends on how quickly you can recover a production system after an incident that has caused a failure. This requires planning, and documentation. If you get a Disaster Recovery Plan wrong, it can make an incident into a catastrophe for the business. Hugo Shebbeare discusses some essentials, describes a typical system, and provides sample documentation.
(une traduction en Français de ce Plan de Relève est déjà en cours, et sera publié prochainement.)

In this article, I'll lay out the technical details of implementing a simple Disaster Recovery Plan (DRP) for production applications running Microsoft SQL Server. My goal is to provide you with generic documentation to use as the basis of your own production system failover strategy. You will, of course, need to alter it with your own details and keep it updated any time that changes are made to your production systems, but this should give you a good departure point from which to build your own strategy.

I'll describe the steps to follow in the event of the failure of a database production system, and annotate the process as I go along. This is largely based on a Disaster Recovery Plan I had to design recently (all the better for you to download and personalize), so it is deliberately written in the style of a business strategy document. I’ll also explain the advantages of automatic restoration of compressed backup files from a failover server. I'll also be the first to admit that this topic might seem a little dry, but having a DRS will make it worth the read – I promise.

Part 1 of this article will describe the basic steps necessary to set up a ‘hot' standby server (the recovery method I used when drafting this DRP), and Part 2 is an annotated transcript of Disaster Recovery document, including steps to be taken in the event of a disaster, and information for the unfortunate DBA tasked with recovering from it. Here we go:


PART 1 - Automatic Restoration of backup files to a failover server

SQL Servers' norecovery mode keeps the database stable and ready to accept the changes you're progressively applying as part of the backup process. This means that it's only necessary to apply the latest differential or log backup before the database is ready to be accessed by users and applications.

The disaster recovery method used is to have a ‘hot' standby server (SQL2), which is already installed, stable and, most importantly, is an exact copy of the production server's configuration. The standby server should already have the most recent operational databases fully-restored in norecovery mode.

Implementing a Hot Standby Server

After SQL Server has been installed on the failover server, you need to check that Robocopy is installed in the sysroot\windows\system32 folder. Secondly, Red Gate's SQL Backup software must connect to the server and be configured by clicking the small grey square next to server listing in left pane – this is for instance auto-configuration, if it has not been done already.

Auto-configuring SQL Backup

Figure 1 - SQL Backup's auto-configuration system.

Robocopy is much better than ( the soon-to-be-discontinued) Xcopy, by the way. And since Windows Server 2003, Robocopy has been the recommended / future-proofed tool of choice. As far as I know, Xcopy will no longer be available in future versions of Windows Server.

Next, for the stored procedures that execute Robocopy (we place these procedures in a local database on each server called DBA_tools), you need to allow the advanced option xp_cmdshell to run:

-- To allow advanced options to be changed.

EXEC sp_configure 'show advanced options'1

GO

-- To update the currently configured value
-- for advanced options.

            

RECONFIGURE

GO

-- To enable the feature.

EXEC sp_configure 'xp_cmdshell'1

GO

-- To update the currently configured value for this feature.

            

RECONFIGURE

GO

In order to copy the backup files, each database on the standby server needs a database-specific SQL Server Agent job running Robocopy at the required interval to copy full and differential backups from the production server to the standby server. These jobs can be run at whatever frequency needed, be it daily, hourly or even more often if your operations require it.

Robocopy is the first step in all automated restore jobs, unless you want to add validation steps prior to the backup file copy. The following example copies all differential database backups from a production server to a DRP server:

EXEC dbo.usp_RoboCopy '\\PRODserver\drive$\ProdServerBackupShare\Diff''\\DRPserver\Drive$\ProdServerDbBackupFolder\Diff''database1_* database2_*'

-- This case just handles the differential folder, so we’re assuming you’ll also have
-- a Tlog and Full folder.

A database-specific SQL Server Job will restore these backups daily to the hot standby server (DRP) using stored procedures specifically created for this setup, such as:

  • usp_DB_Restore_Master or usp_DB_Restore_Master_Multi
  • usp_DB_Restore
  • usp_DB_Restore_NoRecovery
  • usp_DB_Restore_differential
  • usp_DB_Restore_Log

A consideration for the DBA regarding the level of database recovery

If you are currently in Simple Recovery mode, and provided there are regular Transaction Log and differential backups (as in, several times a day), you can switch your recovery model over to Bulk-Logged in production to restore up to a specific point in time. This will naturally minimize the amount of data lost from the work session prior to any downtime.

Full Recovery mode is recommended for critical databases that require auditing compliance.

In the event of failure, the most recent log or differential backup is ready to be applied to the standby database sitting in norecovery mode, and you’re up and running quickly with minimal down-time.

An alternative method for a much smaller database, where the total restore time is below five minutes, is to apply the complete restore every hour to the failover server, in which case you don’t need to worry about norecovery mode.

PART 2 - Instructions to follow in the event of a disaster to the production system

  1. If you haven’t heard from them directly already, please contact FIRST LINE DBA SUPPORT at [INSERT NUMBER] or SECONDARY DBA at [INSERT NUMBER]
  2. After the production/original data publisher server failure (SQL1), the restore / backup-subscriber server (SQL2) will be used as the primary database server (a.k.a. DRP server). Inform everyone in the department by E-mail (It's also worth thinking about who will inform internal/external clients).
  3. Once the switch occurs to the DRP server and the downtime of SQL1 actually happens, all application connection strings need to be changed to access SQL2. The CGI should handle this step automatically.
  4. Disable Automatic Restore SQL Agents on SQL2.
  5. Disable all SQL Agent jobs on failed server SQL1 if possible.
  6. Enable all maintenance and backup jobs on newly active server SQL2

Please note that restoring a log backup is not possible if the production database recovery model is set to Simple. For fine-grained restoration, the database needs to have been using the Full recovery model - Thankfully, the default setting is the Full Recovery model. If point in time recoveries are requested by management on a regular basis, then we can also change the database recovery level to Bulk-Logged, if space is an issue, and Full otherwise - Perhaps with deserved hesitation from the side of the Database Administrators, as Bulk-logged recovery is much more space efficient.

How the automation of the restore from compressed backup is benefitial to your production environment. Ideally you should keep two full backups, one on the Test server and one on the DRP server. Having this second copy of the production databases will allow you to do some useful and intensive work which you don’t want to have to run on live databases, such as a full DBCC CheckDB – console commands that can check the integrity of your exact database restore copy.

A log of what has been restored shall be placed in the following directory:

\\DatabaseServerName\drive$\prodBackupDir\DBlog\

As soon as a restore is completed, we should have an automatic purge of old backups – done perhaps every week (maximum 14 days manually, or automatically in a SQL Server Maintenance Plan), and which can be automated using a batch file or PowerShell Script.

To ensure a smooth restore process, we should read the restore parameters directly from the backup log system tables - such as BackupHistory, BackupSet, BackupFile or

Backuplog - unless a backuplog table is explicitly created in a local database or exists in msdb. This is to ensure that the essential restore parameters (such as the backup file name and position) are immediately available.

As I often set them, the SQL Agent Restore jobs have their parameters manually set during testing and are usually left that way – but of course it’s best to pull the meta-data directly from the system in case you move files around and forget to update the restore scripts.

SQL1 & SQL2 (Prod. & DRP) Server Hardware Configuration

This is the configuration for the servers this document was originally written for (I don’t remember the System Models for that setup, but that’s not to say you shouldn’t record yours) Update the following with your own server properties.

SQL1 (production instance)

1.1

Server Type

Windows 2008 (standard x64 edition)

1.2

System Model

[Server Model Number, Product Type]

1.3

RAM Memory

8 Gig

1.4

No. of CPU’s

2

1.5

CPU & Speed

AMD (x64)

Drives

Hard Disk Space

C(#G);D(#G)

SQL2 (storage replication partner / hot standby restore-subscriber)

1.1

Server Type

Windows 2008 ( standard x64 edition )

1.2

System Model

[Server Model Number, Product Type] - Same as SQL1

1.3

RAM Memory

9 Gig

1.4

No. of CPU’s

2

1.5

CPU & Speed

AMD (x64) Opteron Processor 280

Drives

Hard Disk Space

C(#G); D(#G); F(2TB); G(250GB); H (1.5TB); Z(20GB)

This server should have terabytes and terabytes of space, depending on your archiving needs.

SQL Server Configuration

For a previous client our production build of SQL Server was 9.0.3152, so naturally the DRP server had to be the exact same build – both systems must be as identical as possible.

Our servers are using 64-bit versions of the SQL Database Engine 2005/8, with at least service pack 2 (2005), cu3 (2008) installed, and the collation type is Latin1_General_CI_AS (accent sensitive is recommended).   It is preferable to have at least Cumulative Rollup package 8 or SP3 for SQL Server 2005, and it’s important to do an update to production build levels of SQL on a regular basis.

Detailed information for the server and databases is included in the compiled help file located on both servers SQL1 and SQL2

D:\DRP\ServerName.chm (i.e. make it very easy to find DRP info)

Critical SQL Server User Database Details

1. List of databases

Database1

Database2

NB:  We will not be doing master, msdb, model or temp – these are backed up on a regular basis and will be copied by Robocopy although not restored onto the database restore replication subscriber directly.

 2. Database Maintenance Plan and Auto-Restore.

In general, our database restore plan will reflect exactly the backup schedule and wait for backups to finish by querying the metadata from the production server. The restore jobs will check to see if the days’ full backup has completed (or daily diff.) using the backupset.backup_finish_date  column.  Once we see that Full backup has been completed on the production server, we copy the backupfile over to the hot standby server. In the second step of the job, we continue to execute the code from the appropriate usp_DB_restore combined with the metadata extraction from the system tables.

3. Database Backup schedule in production

Maintenance Job Name

Maintenance Job Description

Freq

Time to Run

BackupFull_Database1

Full Database backup Database1

W

Sunday 6:00

BackupFull_Database2

Full Database backup Database2

W

Sunday 6:30

4. Restore jobs on DRP server

Maintenance Job Name

Maintenance Job Description

Freq

Time to Run

BackupFull_Database1

Full Database backup Database1

W

Sunday 6:00

BackupFull_Database2

Full Database backup Database2

W

Sunday 6:30

Critical Scripts, Procedures and Programs related to disaster recovery

Following is a list of all the code used for the DRP process from SQL1 to SQL2:

usp_DB_Restore_Master

CREATE PROC usp_DB_Restore_MasterRecovery

-- Add the database name and input variables, instead of setting them on lines 40-23.

      

AS

DECLARE 

@filename     VARCHAR(255)

      ,

@cmd              VARCHAR(500)

      ,

@cmd2             VARCHAR(500)

      ,

@dbNameSource sysname -- This is an input parameter, unless you are testing.

      

,@dbNameTarget sysname -- This is an input parameter, unless you are testing.

      

,@FullRestoreFolder NVARCHAR(MAX)-- This is an input parameter, unless you

                                       -- are testing.

      

,@dbNameStatement NVARCHAR(MAX)

      ,

@dbNameStatementDiff NVARCHAR(MAX)

      ,

@LogicalName      VARCHAR(255)

      ,

@PhysicalName     VARCHAR(255)

      ,

@Type             VARCHAR(20)-- Useful if reading the backup headers, no?                                     -- As part of the validation perhaps.

      

,@FileGroupName    VARCHAR(255)

      ,

@Size             VARCHAR(20)

      ,

@MaxSize          VARCHAR(20) -- Check what I do above and use what's below                                      -- if relevant.

      

,@filelistStatmt1 VARCHAR(MAX)

      ,

@filelistStatmtDiff VARCHAR(MAX)

/* The following variables are set up for testing and may be taken off when sp is used afterwards (if we cannot get them reliably from sys databases automatically). */

      

,@backupFile sysname  -- will grab from local test .sqb files first.

      

,@logicalDataFile sysname

/* I am developing this code first assuming that we will only have one data file and logical file for each database. Later we'll add support for multiple logical and physical files (there may be, in Database1's case, more than one row for dbo.sysfiles

where fileid=1 and groupid=1). */

      

,@logicalDataStmt1 NVARCHAR(MAX)

      ,

@logicalDataStmt2 NVARCHAR(MAX)

      ,

@logicalDataStmt3 NVARCHAR(MAX)

      ,

@logicalLogFile sysname  -- Returned and verified.

      

,@logicalLogStmt1 NVARCHAR(MAX-- So many annoying variables. If I could just 

      

,@logicalLogStmt2 NVARCHAR(MAX-- read the header it'd be easier in the future.

      

,@logicalLogStmt3 NVARCHAR(MAX)

      ,

@physicalDataFile sysname -- Easy to grab since it was in master.

      

,@physicalLogFile sysname -- Need two variables.

      

,@physicalLogFileStmt1 NVARCHAR(MAX)

      ,

@physicalLogFileStmt2 NVARCHAR(MAX)

      ,

@physicalLogFileStmt3 NVARCHAR(MAX)

  

SET NOCOUNT ON -- Following best practices, although we're not throwing around big

                 -- counts anyway.

-- Parameters and variables set by Hugo for testing.

  

SET @FullRestoreFolder='\\testServer\Drive$\ProdServerBackupFolder\full\'

  

SET @dbNameSource ='Database1'

  

SET @dbNameTarget ='Database1'  -- Sometimes we want to over-write another database

                                  -- (e.g. in the case of importpdm_tst).

  

SET @physicalDatafile=(

  

SELECT filename 

    

FROM MASTER.dbo.sysdatabases 

    

WHERE NAME=@dbnameTarget)

      

PRINT 'The physical data FILE TO RESTORE IS '+@physicalDatafile

  

SET @logicalDataStmt1='select top 1 name from ['

  

SET @logicalDataStmt2='].dbo.sysfiles where fileid=1 and groupid=1'

  

SET @logicalDataStmt3 (@logicalDataStmt1+@dbNameTarget+@logicalDataStmt2)

/* I wanted to do an execute at this line but I kept thinking that if it was within a single statement, my set statement @logicalDataFile would take the result of the query (select top 1 name from PROD_PASRAA.dbo.sysfiles where fileid)=1 and groupid=1. */

CREATE TABLE #logicalDataFile  -- Drop table #logicaldatafile.

  

(

        

logicalDataFile sysname

  

)

INSERT INTO [#logicalDataFile]

  

(

        

logicalDataFile

  

)

  

EXEC (@logicalDataStmt3)

-- Now set the variable from the temp finally.

  

SET @logicalDataFile=(

  

SELECT 

    

FROM #logicalDataFile)

-- Check out a temp table method...need the result set from.

      

PRINT 'the logical data file is '+@logicalDataFile

  

SET @logicalLogStmt1='select top 1 name from ['

  

SET @logicalLogStmt2='].dbo.sysfiles where fileid=2 and groupid=0'

-- Put the statement together.

  

SET @logicalLogStmt3 (@logicalLogStmt1+@dbNameTarget+@logicalLogStmt2)  

CREATE TABLE #logicalLogfile  -- Drop table #logicalLogfile.

  

(

        

logicalLogFile sysname

  

)

INSERT INTO [#logicalLogFile]

  

(

        

logicallogFile

  

)

  

EXEC (@logicallogStmt3)

-- Now set the variable from the temp finally.

  

SET @logicalLogFile=(

  

SELECT 

    

FROM #logicalLogFile)

      

PRINT 'the logical log file is '+@logicalLogFile -- Has the right value  

-- grab the last db file related variable from sysfiles.

  

SET @physicalLogFileStmt1='select filename from ['

  

SET @physicalLogFileStmt2='].dbo.sysfiles where fileid=2 and groupid=0'

  

SET @physicalLogFileStmt3 (@physicalLogFileStmt1+@dbNameTarget+@physicalLogFileStmt2)

CREATE TABLE #physicalLogFile  -- Drop table #physicalLogFilefile.

  

(

        

physicalLogFile sysname

  

)

INSERT INTO [#physicalLogFile]

  

(

        

physicalLogFile

  

)

  

EXEC (@physicalLogFileStmt3)

-- Now set the variable from the temp finally.

  

SET @physicalLogFile=(

  

SELECT 

    

FROM #physicalLogFile)

      

PRINT 'the physical log file is '+@physicalLogFile

/* All verified up to here, and we're ready for some backup logic.Grab the

corresponding file for the restore folder, and according to following inputs

full restore file. */

CREATE TABLE #dirList (filename NVARCHAR(MAX))

CREATE TABLE #filelist (

        

LogicalName NVARCHAR(MAX)

        ,

PhysicalName NVARCHAR(MAX)

        ,

[Type] VARCHAR(20)

        ,

FileGroupName VARCHAR(50)

        ,

Size VARCHAR(20)

        ,

MaxSize VARCHAR(20)

  )

-- Get the list of database backups that are in the restoreFromDir directory.

IF @dbNameSource IS NULL -- If @OneDBName is null.

-- For our purposes we are only using one database name for this.

  

SELECT @cmd 'dir /b /on "' +@FullRestoreFolder'"' -- Select @cmd = 'dir /b /on '

                                                        -- +@restoreFromDir+

ELSE

SELECT 

@cmd 'dir /b /o-d /o-g "' +@FullRestoreFolder'"'

INSERT #dirList 

  

EXEC master..xp_cmdshell @cmd

-- Select filename from #dirlist whose list of files is good.

SELECT @dbNameStatement'full%_'+@dbnameSource+'_200%_%.sqb'  SQB IS FOR SQLBACKUP

  

SET @filelistStatmt1=(

  

SELECT TOP 1 filename 

    

FROM #dirList 

    

WHERE filename LIKE @dbNameStatement)

      

PRINT 'This is the full backup file to be restored '+@filelistStatmt1

  

SET @backupfile=(@FullRestoreFolder)+(@filelistStatmt1)

      

PRINT 'this is the full path to the full restore file that will be restored '

+

@backupfile

EXEC DBA_Tools.dbo.[usp_DB_Restore]  @backupfile@dbnameTarget,  @logicalDataFile,

@logicalLogFile@physicalDataFile@physicalLogFile

usp_DB_Backup & usp_DB_Restore

-- ============================================= 

-- Description: Restore Database 

-- Parameter1: Restore File Name 

-- Parameter2: Full path of file location i.e. 'DriveName:\BackupShare\' 

-- Parameter3: RestoreType 

-- FDN=full or differential no recovery, FDR = full or differential with recovery, 

-- LN=log no recovery, LR=log with recovery 

-- File Extensions: Full = *.bak , Differential= *.dif, T- Log= *.trn , *.sqb

-- (SQLBackup)

-- ============================================= 

CREATE PROCEDURE [dbo].[usp_DB_restore]

  

@RestoreFileName SYSNAME,

  

@LogicalNameData SYSNAME,

  

@LogicalNameLog SYSNAME,

  

@RestorePathData SYSNAME,

  

@RestorePathLog SYSNAME,

  

@ResoreType CHAR(1)

      

AS

  BEGIN

  SET 

NOCOUNT ON ;

  

DECLARE @SqlCmd NVARCHAR(2000)

  

DECLARE @DateTime SYSNAME

  

DECLARE @BakupFile NVARCHAR(1400),

    

@DiffFile NVARCHAR(1400),

    

@LogFile NVARCHAR(1400)

  

IF @ResoreType 'FDN'

    

SET @SqlCmd 'RESTORE DATABASE ' QUOTENAME(@DBName)

          + 

' TO DISK = ' @Bakupfile 'WITH INIT'

    

IF @ResoreType 'FDR'

    

SET @SqlCmd 'RESTORE DATABASE ' QUOTENAME(@DBName)

          + 

' TO DISK = ' @Bakupfile 'WITH INIT'

    

IF @ResoreType 'LN'

    

SET @SqlCmd 'RESTORE LOG ' QUOTENAME(@DBName)

          + 

' TO DISK = ' @LogFile +

    

IF @ResoreType 'LR'

    

SET @SqlCmd 'RESTORE LOG ' QUOTENAME(@DBName)

          + 

' TO DISK = ' @LogFile +

          

PRINT @SqlCmd

    

EXECUTE sp_executesql @SqlCmd

  

END

usp_DB_Restore_NoRecovery 

Same as above, but for databases that need to be left in no recovery mode (e.g. waiting for a log backup to be applied or differential)

CREATE  PROC [dbo].[usp_DB_Restore_NoRecovery]  -- input variables when it all works below 

  

@backupfile SYSNAME,

  

@dbName SYSNAME,

  

@logicalDataFile SYSNAME,

  

@logicalLogFile SYSNAME,

  

@physicalDatafile SYSNAME,

  

@physicalLogFile SYSNAME

      

AS

DECLARE 

@exitcode INT

DECLARE 

@sqlerrorcode INT

DECLARE 

@restoreStmt NVARCHAR(MAX)

  

SET NOCOUNT ON

-- Kill any users in the database nicely? well, not really nicely.

EXEC usp_KillConnections @dbName

  

SET @restoreStmt N'-SQL RESTORE DATABASE ' @dbName 

FROM DISK = ' 

@backupfile 

WITH NORECOVERY 

,MOVE ' 

@logicalDataFile ' TO ' @physicalDatafile 

,MOVE ' 

@logicalLogFile ' TO ' @physicalLogFile 

,REPLACE 

,LOGTO = "\\ServerName\Drive$\SourceServerName\DBlog<DATABASE>_<TYPE>_ <DATETIME yyyymmddhhmss>.txt"'

--PRINT @restoreStmt

EXEC master..sqlbackup @restoreStmt@exitcode OUT@sqlerrorcode OUT

IF @exitcode >= 500 )

      OR ( 

@sqlerrorcode <> )

  

BEGIN

  RAISERROR 

'SQL Restore failed with exit code: %d SQL error code: %d',

          

161@exitcode@sqlerrorcode )

  

END

usp_DB_Restore_Differential

-- restore directly from our copy which is automatically brought local using robocopy 

-- EXEC [usp_DB_restore_Differential]

-- '\\TestServer\Drive$\ProductionServer\Diff\
-- DIFF_ServerName_DB_20080301_210001.sqb',

-- 'DBname', 'LogicalDataFileName',  'LogicalLogFileName',

-- 'Drive:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Database1.mdf',

-- 'Drive:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\
-- Database1_log.ldf' 

-- drop proc [usp_DB_restore_Differential]

CREATE  PROC [dbo].[usp_DB_Restore_Differential]

-- Input variables when it all works below.

  

@backupfile SYSNAME,

  

@dbName SYSNAME,

  

@logicalDataFile SYSNAME,

  

@logicalLogFile SYSNAME,

  

@physicalDatafile SYSNAME,

  

@physicalLogFile SYSNAME

      

AS

DECLARE 

@exitcode INT

DECLARE 

@sqlerrorcode INT

DECLARE 

@restoreStmt NVARCHAR(MAX)

  

SET NOCOUNT ON

EXEC usp_KillConnections @dbName

-- WITH RECOVERY is used after a full restore is done already, and a final
-- differential is applied to it (restore differential should be on a db in
-- NORECOVERY MODE).

  

SET @restoreStmt N'-SQL RESTORE DATABASE ' @dbName 

FROM DISK = ' 

@backupfile 

WITH NORECOVERY 

,MOVE ' 

@logicalDataFile ' TO ' @physicalDatafile '

,MOVE ' 

@logicalLogFile ' TO ' @physicalLogFile 

,REPLACE 

,LOGTO = "\\DRPServerName\Drive$\ProdServerBackupFolder\DBlog\<DATABASE>_<TYPE>_ <DATETIME yyyymmddhhmss>.txt"'

-- PRINT @restoreStmt 

EXEC master..sqlbackup @restoreStmt@exitcode OUT@sqlerrorcode OUT

IF @exitcode >= 500 )

      OR ( 

@sqlerrorcode <> )

  

BEGIN

  RAISERROR 

'SQL Restore failed with exit code: %d SQL error code: %d',

          

161@exitcode@sqlerrorcode )

  

END

usp_DB_Restore_Log

Should allow multiple logs to be automatically applied

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

/* EXEC [usp_DB_restore_log]  '\\ProdServer\Drive$\ProdServerDBbackups\Full\

FULL_ServerName_DatabaseName1_20080217_030000.sqb', 'LogicalFileName', 'LogicalDataFile', 'LogicalLogFileName',

'Drive:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Database1.mdf',

'Drive:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Database1_log.ldf'

drop proc [usp_DB_restore_log] */

CREATE  PROC [dbo].[usp_DB_restore_log]

-- Input variables when it all works below.

  

@backupfile SYSNAME,

  

@dbName SYSNAME,

  

@logicalDataFile SYSNAME,

  

@logicalLogFile SYSNAME,

  

@physicalDatafile SYSNAME,

  

@physicalLogFile SYSNAME

/* System table backupfile on production server can give us LSN (log sequence number), logical_name, physical_drive and physical_name. If not, to grab the possible backup sets that are usable, see ms-help://MS.SQLCC.v9/ MS.SQLSVR.v9.en/tsqlref9/html/f1a7fc0a-f4b4-47eb-9138-eebf930dc9ac.htm. */

      

AS

DECLARE 

@exitcode INT

DECLARE 

@sqlerrorcode INT

DECLARE 

@restoreStmt NVARCHAR(MAX)

  

SET NOCOUNT ON

-- We will not need to kill connections, since the database is in restoring state

-- already, waiting for a log.

-- EXEC usp_KillConnections @dbName

-- Transaction logs must be applied in sequential order. If there are multiple

-- transaction logs to apply we have to leave the NORECOVERY option on.


/* In most cases, under the full or bulk- logged recovery models, SQL Server 2005 requires that you back up the tail of the log before restoring a database that is currently attached on the server instance. A tail-log backup captures the log that has not yet been backed up (the tail of the log) and is the last backup of interest in a recovery plan. Restoring a database without first backing up the tail of the log results is a mistake, unless the RESTORE statement contains either the WITH REPLACE or WITH STOPAT clause.

A tail-log backup can be created independently of regular log backups by using the COPY_ONLY option. A copy-only backup does not affect the backup log chain. The transaction log is not truncated by the tail-log backup, and the log captured will be included in future normal log backups. This allows tail-log backups to be taken, for instance, to prepare for an online restore without affecting normal log backup procedures. For more information, see Copy-Only Backups (Full Recovery Model). */

-- Restore log info ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/
877ecd57-3f2e-4237-890a-08f16e944ef1.htm.

  

SET @restoreStmt N'-SQL RESTORE Log ' @dbName '

FROM DISK = ' 

@backupfile '

 WITH RECOVERY

 

,MOVE ' 

@logicalDataFile ' TO ' @physicalDatafile '

,MOVE ' 

@logicalLogFile ' TO ' @physicalLogFile

      

'

,REPLACE

,LOGTO = "\\TestServer\d$\TtestDB\DBLog\<DATABASE>_<TYPE>_ <DATETIME yyyymmddhhmss>.txt"'

--PRINT @restoreStmt

EXEC master..sqlbackup @restoreStmt@exitcode OUT@sqlerrorcode OUT

IF @exitcode >= 500 )

      OR ( 

@sqlerrorcode <> )

  

BEGIN

  RAISERROR 

'SQL Restore failed with exit code: %d SQL error code: %d',

          

161@exitcode@sqlerrorcode )

  

END

usp_RoboCopy

CREATE   PROCEDURE [dbo].[usp_RoboCopy]

  

(

    

@srcUNC SYSNAME,    -- Source Server Name.

    

@dstUNC SYSNAME,    -- Destination Server Name.

    

@filelist VARCHAR(1024)  -- Space delimited list of files to be copied.

  

)

      

AS /*****************************************************************/

-- Stored Procedure : usp_RoboCopy

-- Creation Date : 2009-02-26

-- Written by : Stephen Mandeville, adapted by Hugo Shebbeare

 /*************************************************************************/

SET NOCOUNT ON

DECLARE @ccmd VARCHAR(1500)

DECLARE @logfile VARCHAR(25)

DECLARE @retcode INT

/**************************************************************************/

-- This stored procedure uses ROBOCOPY.exe, which is installed on server itself

-- in the sysroot\windows\system32 folder (default on 2008).

-- The Source and Destination shares must exist.

/***************************************************************************/

SELECT  @logfile REPLACE(SUBSTRING(( CONVERT(VARCHAR(15), GETDATE(), 121) ),

          110), '-''')

      + REPLACE(SUBSTRING(( CONVERT(VARCHAR(30), GETDATE(), 121) ), 12,

          8), ':''')

SELECT  @ccmd 'ROBOCOPY ' @srcUNC ' ' @dstUNC ' ' @filelist

      ' /NP /LOG:' @dstUNC '\transfer' '_' @logfile '.txt'

--PRINT @ccmd

EXECUTE @retcode master..xp_cmdshell @ccmd

/***************************************************************************/

-- The return code (@retcode) from Robocopy (version 1.74 and later) is a 

--bit map, defined as follows:

--  

-- Value    MeaningIfSet

-- 16       Serious error. Robocopy did not copy any files. This is either a 

--                         usage error or an error due to insufficient 

--                         access privileges on the 

--                         source or  destination directories.

-- 8        Some files or directories could not be copied 

--                         (copy errors occurred and the retry limit was 

--                         exceeded)Check these errors further.   

-- 4        Some Mismatched files or directories were detected. 

--                         Examine the output log. 

--                         Housekeeping is probably necessary.    

-- 2        Some Extra files or directories were detected. 

--                         Examine the output log. 

--                         Some housekeeping may be needed.    

-- 1        One or more files were copied successfully.that is, new files 

--                         have arrived).

-- 0                       No errors occurred, and no copying was done. 

--                         The source and destination

--                         directory trees are completely synchronized.

/**************************************************************************/

-- Raising error only upon @retcode > 7.

IF @retcode 7

  BEGIN

  RAISERROR 'Error occurred while executing Robocopy'16)

  RETURN @retcode )

  END --IF @retcode > 7

ELSE

  BEGIN

  RETURN @retcode )

  END --ELSE

usp_KillConnections

IF NOT EXISTS (

  

SELECT 

    

FROM sys.objects 

    

WHERE OBJECT_ID OBJECT_ID(N'[dbo].[usp_KillConnections]'

      AND 

type IN (N'P'N'PC'))

  

BEGIN

  EXEC 

dbo.sp_executesql @statement N'

/***************************************************************** 

*** Procedure: usp_KillConnections 

*** Usage: usp_KillConnections @dbname = ''Database Name'' 

*** Description: Drop all connections from a specific database 

*** Input: @dbname - REQUIRED - Name of the database 

*** Output: Outputs the results of the proccess 

*** Revision: 1.0 

*** Revision History: 1.0 First Release 

*** Author: Antonio Pedrosa Linares 

*** Date: 7/25/2007 

******************************************************************/ 

-- exec usp_KillConnections ''staplescpc''

create procedure [dbo].[usp_KillConnections] 

    @dbname varchar(128) 

as 

    declare @spid varchar(5) 

    declare @loginname nvarchar(128) 

    declare @intErrorCode int 

    declare @intOk int 

    declare @intError int 

    declare @intTotal int 

    set @intErrorCode = 0 

    set @intOk = 0 

    set @intError = 0 

    set @intTotal = 0 

    select @intTotal = count(sp.spid) FROM master..sysprocesses sp 

    JOIN master..sysdatabases sd ON sp.dbid = sd.dbid 

    WHERE sd.name = @dbname 

     

    declare KILL_CONS cursor for 

    SELECT cast(sp.spid as varchar(5)),rtrim(sp.loginame) 

    FROM master..sysprocesses sp 

    JOIN master..sysdatabases sd ON sp.dbid = sd.dbid 

    WHERE sd.name = @dbname 

     

    OPEN KILL_CONS 

    FETCH NEXT FROM KILL_CONS INTO @spid,@loginname 

    WHILE @@FETCH_STATUS = 0 

    BEGIN 

        EXEC(''Kill ''+ @spid + '''') 

        SELECT @intErrorCode = @@ERROR 

        if @intErrorCode = 0 

        begin 

            set @intOk = @intOk + 1 

            PRINT ''Process '' + @spid + '' from login '' 

                    + @loginname + '' has been ended.'' 

        end 

        else 

        begin 

            set @intError = @intError + 1 

            PRINT ''Process '' + @spid + '' from login '' 

                    + @loginname + '' could not be ended.'' 

        end 

        FETCH NEXT FROM KILL_CONS INTO @spid,@loginname 

    END 

    CLOSE KILL_CONS 

    DEALLOCATE KILL_CONS 

    PRINT ''Total number of processes from database '' 

           + @dbname + '': '' + cast (@intTotal as varchar) 

    PRINT ''Processes ended normally: '' + cast(@intOk as varchar) 

    PRINT ''Processes could not be ended: '' + cast(@intError as varchar)'

  

END

System Database Backups

On the DRP server itself the backups of the MSDB, DBAs databases, which are critical to this whole DRP process are located here:

\\DRPServerName:\DRPbackupFolder\Full 

There should always be an alternative local backup location for system databases, such as on the Test server.

All DBAs and system databases are backed up as well as on:

\\TstServerName:\TestSrvBackupFolder\Full

The following example was tested on a primary test server and exists on the restore server.  The usp_DB_restoreX stored procedure takes 6 input parameters.  To match up with backup log metadata, we shall match up the database name by date and then pull the relevant restore file input parameter into the appropriate usp_DB_restoreX stored procedure.  The master restore procedures, divided into single file and multiple file restore procedures, use all the sub procedures to do the actual restore process.

Please note that the usp_DB_RestoreX stored procedures are dependent on usp_KillConnections  which will help in the restoration process by killing the existing database users (that is, unless it’s a system user however).

e.g.

EXEC DBA_Tools.dbo.usp_DB_restore '\\TestServerName\Drive$\ProductionBackupFolder\

Full\FULL_ServerName_Database1_20080217_030000.sqb', 'DBlogicalName' 'DB_DataFile_Logicalname', 'DB_LogFileLogical_name',
'DriveName:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\
DBphysicalDataFileName.mdf'
,
'DriveName:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\
DBphysicalLogFileName_log.ldf'

The stored procedure usp_DB_restore_norecovery is the same as usp_DB_restore, only for Databases that need to be left in norecovery mode (as described earlier in the article)

Please view the Activity History from Red Gate SQL Backup for reporting on what databases have been backed up, as the scope of this document covers the restoring process only. Although the backup information is extracted to prepare the automated restore scripts within the jobs, we are not going to create (at least at this stage) customised backup reporting information.  However, do not forget that, since we are using these scripts within a SQL Server Agent job, we will have histories for each step and a log file written to the \DBlog\ folder local to the disaster recovery server running these SQL Agent Jobs.

SQL Backup Activity History

Figure 2 - SQL Backup Activity Log

Database Restore method when applying Differential Backups.

Please note that we use usp_restore_db_norecovery to load a production backup from the local copy moved over using Robocopy. Thus, if executed on the DBA database of the DRP server (SERVER NAME / INSTANCE NAME):

EXEC DBA_Tools.dbo.usp_DB_restore_norecovery 

\\DRPserver\InstanceName\full\FULL_ServerName_Database_20080217_030000.sqb', 'VSOT2', 'VSOT2_data', 'VSOT2_log,
'D:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\VSOT2.mdf, 'D:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\VSOT2_log.ldf'

This will be the core of what runs for the second step of an automated job which leaves the database in NoRecovery mode, and thus should call the respective RestoreDiff_dbx next and, finally, apply the log files via RestoreLog_dbx

After the restore, make sure to run several tests that ensure the integrity of the data and that typical applications can run normal operations on the database.


Summary

Is this disaster recovery method really minimizing the manual intervention after failure? Can we make it better? Yes and yes, but there's always room to improve. More importantly, this method certainly doesn't suit every environment. Before you take what I've put together here and run with it, I strongly recommend you take a look at the High Availability Options table below to get a clear picture of what methodologies might be more appropriate for you individual needs. To make an effective choice, you're naturally going to need a detailed understanding of each clients’ needs for the restore process.

High Availability Options

Solution

Cost

Complexity

Failover

Failback

Hardware Clustering

High

High

Fast

Fast

Software Clustering

High

High

Fast

Fast

Replication

Medium

Medium

Medium with manual
processing

Slow with manual
processing

Continuous Data
Protection

Medium

Medium

Medium

Slow

Log Shipping

Low

Low

Medium

Slow

Backup and
Restore

Low

Low

Slow

Slow

Database Mirroring

Low

Low

Fast, but only at the
database level

Fast, but only at the
database level

At the time of writing, our backup-and-restore is super slow - at least 13 hours before we were live on the warm standby - but if optimization is run, we should be done in around 2 hours.

Nobody wants to go through a disaster without being properly prepared. When I was asked to prepare a plan for Canada’s largest institutional fund manager, I took it rather seriously, hence the length of this document.  We ran this through a real disaster recovery test over a weekend, and it all worked out just fine.  I've tried to share with you exactly how you can get your own disaster recovery plan in place, so that when the time comes at least the recovery step itself is not a disaster.

Hugo Shebbeare

Author profile:

Born and raised in Vancouver, Canada, with a brief excursion to study in Brussels, Melbourne & Washington D.C., Hugo has been working with SQL Server since 1998. He’s busied himself as a DBA since 1999 (with mcdba & mcitp certifications in ‘01 & ‘08 respectively), as independent consultant with his own company, Intellabase Solutions, since 2002 (now part time), and recently took on a permanent position with Canadian Printing giant Transcontinental; managing not only his favourite RDBMS, but also MySQL 5.x and Oracle 10/11. He enjoys writing documentation for quick, safe infrastructure rebuilds and expansions, and most challengingly, propositions to Executives Management. He has spoken at SQLteach/DevTeach, Montreal Dot Net User Group, Roman Rehak's Vermont User Group, has a weekly blog on SQLServerCentral, and has been recognised as a SQL Server MVP in 2010.

Search for other articles by Hugo Shebbeare

Rate this article:   Avg rating: from a total of 64 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Disaster Recovery for SQL Server Databases
Posted by: Anonymous (not signed in)
Posted on: Monday, September 14, 2009 at 10:09 AM
Message: Hello,

While this approach is good for a small/money strapped companies, everyone else should be using SAN based replication and/or products like XOSoft from CA or Double Take from HP. These 3rd party products makes DR easier, faster and safer. I done a lot of DR work too and find that replication via script(s)is great at first but eventually breaks down. As people leave the company and/or make their own changes the documentation and testing starts to fall behind and eventually when there is a problem they end up scrapping everything and go to a new solution.

Just my 2 cents worth,

Rudy

Subject: xp_cmdShell
Posted by: Anonymous (not signed in)
Posted on: Monday, September 14, 2009 at 12:54 PM
Message: Isnt it a security risk to set up xp_cmdshell? Cant we use Power shell instead?

-Roy

Subject: xp_cmdshell and Double-Take
Posted by: Hugo Shebbeare (not signed in)
Posted on: Monday, September 14, 2009 at 11:18 PM
Message: Sure, Roy, there is a potential security risk - this was designed on a DRP solution that was in a very isolated and secure env. Not the sort of solution you would place on an exposed web server to replicate databases, that's for sure :)

Rudy: Small companies wouldn't typically spend $2G an instance for SQLBackup (in my exp), sorry, it's for larger organisations that have the budget I believe.
Indeed there are many ways to skin the cat, and maybe you'd be very happy with Double-Take. It's funny, in my current mandate, I have been asked to replace Double-take with Database mirroring, so I look forward to that!

My goal is to make this as easy to port to your repsective environment as possible, hence the complete code provided and querying of the system databases to avoid lengthly changes. SQL Agent jobs were used with all the stored procs for each database in requirement of DRP, therefore it was very easy to have SQLResponse, for example, monitoring how the jobs were performing - robocopy sometimes choked on multi-terabyte DBs unfortunately, but that could have been a SAN/network issue.

Thank you both very much for the comments, I've spent a load of time on this and I'm sure there is room for improvement somewhere.

Subject: Correction - wrong price for SQL Backup in previous note
Posted by: hugosheb (view profile)
Posted on: Thursday, September 24, 2009 at 1:55 PM
Message: According to SQL Backup product documentation, prices for SQL Backup start at around $400 USD with support. Sorry for the mistake, thus it is more affordable for medium sized and small organisations obviously :)

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Highway to Database Recovery
 Discover the best backup and recovery articles on Simple-Talk, all in one place. Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.