15 November 2012

Database Deployment: The Bits – Copying Data Out

Occasionally, when deploying a database, you need to copy the data out to file from all of its tables. Phil shows how to do it, and illustrates its use by copying an entire database from one server to another.


Just occasionally, you’ll need to get some or all of the tables of a database out to disk as quickly as possible. I usually do it in order to get a set of data for testing a database. For a full test, I might have a number of different sets of data, and I can do a build-up and tear-down reasonably quickly for each test by importing them into the test database and deleting the tables after use. Naturally, you’ll want to copy data in as well, and I have described how to do that in ‘Database Deployment: The Bits – Getting Data In’. In general, I will keep a consistent set of data that is compatible with the tests, and run the tests with a number of revisions of the database as it approaches completion. It takes time, but then it is done overnight, and I breeze in the following morning to see a list of tests with ticks or crosses against them. I also occasionally copy a small database to a server when I don’t have sufficient access to the filesystem to copy an MDF file. Oddly enough, I needed to do this via a PowerShell script on the day I wrote the article you’re reading. I’d never call this a ‘deployment’ in any sense of the word, but it is a handy way of checking that everything you’ve done works.

At first glance, there seem to be many more ways of getting data from file into SQL Server fast than there are of getting data out of SQL Server and into files. Actually, there are quite a few techniques one can use, such as OLE automation (deprecated), OrcaMDF, SSIS, CLR and SQLCMD. In a recent article, I was able to write out the whole of AdventureWorks as CSV by creating a linked ODBC text database. I’m not quite sure now why I did it. A similar technique will spit them out as Excel files. As we are not designing a production process in a deployed database, we can use xp_cmdshell and BCP. This is the quickest way of doing it (SSIS uses BCP). The SqlBulkCopy .NET class is fine for importing into SQL Server, but cannot be used to export data. It is perfectly possible to write out all the data of a table using Invoke-Sqlcmd into CSV format or whatever, which is fine for reporting, but it isn’t as fast as doing a bulk export.
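The Invoke-Sqlcmd route, for instance, is only a couple of lines of PowerShell. Here is a quick sketch, assuming the sqlps module is loaded; the server, database, table and output path are purely illustrative, and the Select-Object is there only to strip the DataRow bookkeeping properties from the output.

    # A quick sketch of the Invoke-Sqlcmd route: fine for reporting, nowhere near as fast as BCP.
    Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'AdventureWorks' `
                  -Query 'SELECT BusinessEntityID, FirstName, LastName FROM Person.Person' |
        Select-Object BusinessEntityID, FirstName, LastName |
        Export-Csv -Path 'C:\Data\Person.Person.csv' -NoTypeInformation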

Bulk Export

A BCP export is pretty trivial to do from within SQL Server if your login has the rights to use xp_cmdshell. Otherwise, you will need to use an external process such as PowerShell or SQLCMD to do it for you.
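For a single table, it is essentially a one-liner. The instance, database, table and output path below are purely illustrative, and -n (native format) with -T (trusted connection) is just one sensible combination of switches.

    -- A sketch: BCP a single table out in native format via xp_cmdshell.
    EXECUTE master..xp_cmdshell
       'BCP AdventureWorks.Person.Person out "C:\Data\Person.Person.bcp" -n -T -S MyServer';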

If you are copying out all the tables with an eye to copying the database, the problem, of course, is the same as when you do a backup. If you allow insertions or deletions in data that is subject to referential constraints during the process, then the set of table data cannot be guaranteed to be consistent when you write it all back in. You can fix this, but it is painful and time-consuming. You have to read the data in to another copy of the database using BULK INSERT without the CHECK_CONSTRAINTS option set (the default). All foreign references will be marked as ‘is_not_trusted’. Then, you’ll have to remove any rows that violate referential integrity. It is much better to have a consistent set of data on file. You can knock off the other users or make the database read-only just for the time in which you are copying the data out, by ALTER DATABASE database-name SET READ_ONLY and then doing an ALTER DATABASE database-name SET READ_WRITE after the copy is completed. The coolest way of doing the process is to do your bulk export from a database snapshot, because it is read-only, but you’d need Enterprise edition to do that.
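Spelled out, that toggle might look something like this; WITH ROLLBACK IMMEDIATE is one way of dealing with connections that are in the way, and the database name is, of course, a placeholder.

    -- Make the database read-only for the duration of the bulk export, then put it back.
    ALTER DATABASE AdventureWorks SET READ_ONLY WITH ROLLBACK IMMEDIATE;
    -- ... run the bulk export here ...
    ALTER DATABASE AdventureWorks SET READ_WRITE WITH ROLLBACK IMMEDIATE;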

Here is the way you might do it in PowerShell. This saves the files on the machine on which you are running PowerShell. This doesn’t have to be the case: you can save to a network drive or even a UNC path, but the process running the PowerShell script must have permission to write there.
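In essence, the script walks the tables of the database with SMO and shells out to BCP.exe for each one. Here is a stripped-down sketch: the server, database and output directory are placeholders, a trusted connection is assumed, and BCP.exe is assumed to be on the path.

    # A stripped-down sketch: BCP every user table of a database out to local files in native format.
    [void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')

    $ServerName   = 'MyServer'
    $DatabaseName = 'AdventureWorks'
    $OutputDir    = "C:\Data\$ServerName\$DatabaseName"   # one subdirectory per server and database
    if (-not (Test-Path $OutputDir)) { New-Item $OutputDir -ItemType Directory | Out-Null }

    $server = New-Object Microsoft.SqlServer.Management.Smo.Server $ServerName
    $server.Databases[$DatabaseName].Tables |
        Where-Object { -not $_.IsSystemObject } |
        ForEach-Object {
            # -n = native format, -T = trusted connection
            & bcp.exe "$DatabaseName.$($_.Schema).$($_.Name)" out `
                "$OutputDir\$($_.Schema).$($_.Name).bcp" -n -T -S $ServerName
        }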

You can, of course, do it in TSQL on the server, in which case you would save the files on the server, specify a UNC path to a remote location, or use a network drive mapped to a drive on the server.

This batch, which can easily be made into a stored procedure for your ‘utility’ database, exports, or BCPs out, all the tables from the database. Most of the effort goes into checking that the file path is valid; the actual BCP operation is pretty trivial, though if you’re used to using a cursor or a WHILE loop, it might raise an eyebrow. We will be placing our BCP output files in a specially-created directory structure in which each database we do has its own subdirectory of its instance (or server) directory. The batch does all this work for you.
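The eyebrow-raising bit is that the BCP commands can be built for every user table in one set-based query, with no cursor or WHILE loop, and then executed as a single dynamic batch. Here is a sketch of that core, shorn of the directory-checking: it assumes the output directory already exists, that a trusted connection will do, and that your table names are free of spaces or other characters that would need extra quoting.

    -- A sketch of the core of the batch: build an xp_cmdshell BCP command for every user table
    -- in one set-based query, then execute the lot as dynamic SQL. No cursor, no WHILE loop.
    DECLARE @Directory nvarchar(255) = N'C:\Data\MyServer\' + DB_NAME() + N'\',
            @AllCommands nvarchar(max) = N'';

    SELECT @AllCommands = @AllCommands
        + 'EXECUTE master..xp_cmdshell ''BCP ' + DB_NAME() + '.'
        + TABLE_SCHEMA + '.' + TABLE_NAME
        + ' out "' + @Directory + TABLE_SCHEMA + '.' + TABLE_NAME + '.bcp" -n -T -S '
        + @@SERVERNAME + ''', no_output;' + CHAR(10)
      FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_TYPE = 'BASE TABLE';

    EXECUTE (@AllCommands);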

This typically takes around 15 seconds for AdventureWorks when saving to disk locally. You can persuade BCP to save across the network, but you lose performance, and would probably do better to save the files locally and then copy them across afterwards.

Soup to Nuts: Copying a database the hard way

Of course, copying a database is the heart of database deployment. It is rare to have to rely on a method such as this, but it works. I did it mainly to check that the component parts of the process that I’ve described, both here and in my previous article ‘Database Deployment: The Bits – Getting Data In’, actually work, by copying the database and then comparing the source with the copy via SQL Compare and SQL Data Compare. I subsequently needed to use it to copy a small database when remote access didn’t work. Such is life. SMO has a Transfer object which will do this, and requires much less coding, but it saves and restores the data as insert statements. This means that it is very slow, and I’d feel guilty if I encouraged you to use it to copy a database.
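For completeness, and bearing that warning in mind, the SMO route might look something like this sketch in PowerShell; the server and database names are placeholders, the destination database is assumed to exist already, and Windows authentication is assumed at both ends.

    # A sketch of the SMO Transfer alternative: much less code, but the data travels as INSERTs, so it is slow.
    [void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')
    [void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SmoExtended')

    $source   = New-Object Microsoft.SqlServer.Management.Smo.Server 'SourceServer'
    $transfer = New-Object Microsoft.SqlServer.Management.Smo.Transfer $source.Databases['AdventureWorks']
    $transfer.CopyAllObjects         = $true
    $transfer.CopyData               = $true
    $transfer.DestinationServer      = 'DestinationServer'
    $transfer.DestinationDatabase    = 'AdventureWorksCopy'
    $transfer.DestinationLoginSecure = $true      # Windows authentication on the destination
    $transfer.TransferData()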

This article is part of our Database Lifecycle Management patterns & practices.

Return to the library to find more articles on DLM, or visit red-gate.com/dlm for more information about SQL Server products for database lifecycle management.


Phil Factor


  • nportelli

    FYI
    I’m not sure what version of bcp you have, but mine does not have the -d option. I had to alter the PS script.

    $WhatHappened= & "$($PathToBin)BCP.exe" "$($_.parent).$($_)" out "$TheScriptDirectory$filename.$FileType" "-S$($_.parent.parent.name)" -n -T

  • Phil Factor

    BCP -d option
    Yes, it is fairly recent. I do it the other way in the final script, avoiding the -d option by qualifying the table name.