Microsoft and Database Lifecycle Management (DLM): The DacPac

The Data-Tier Application Package (DacPac), together with the Data-Tier Application Framework (DacFx), provides an alternative way to automate the process of scripting out, or deploying a SQL Server database, reporting on changes, or checking version-drift. As such, it seems to provide an obvious cost-free start for tackling automated Database Delivery as part of Database Life-cycle Management (DLM). We asked Phil Factor if it is ready for Prime-time.

DevOps, Continuous Delivery & Database Lifecycle Management
Automated Deployment

When you need to extract data and metadata from an existing database, deploy a database, report on changes, script out a database, compare databases or perhaps drift-check, it is often useful to have a standard package that contains everything about the database that you’ll need: This means scripts, data, components, metadata and more.

In Database Lifecycle Management (DLM), the scripts and any other components of a particular version of a database can be usefully packaged up together into a file that becomes a release artefact. The canonical source is, as it should be, in source control, but the release artefact is a practical convenience because it has everything required for a clean build, but on the strict condition that it is ‘versioned’.

You can, of course, put everything in a file directory structure but that’s only useful if you have file-system access to the file server from the location of the installation, and that isn’t always possible. Instead of a directory, we tend to use the NuGet .ZIP package format, because it is an open-source standard for application releases, and includes conventions for pre- and post-deployment scripting in PowerShell. This approach makes sense particularly where a database has a lot of non-scriptable components such as SSIS packages, or where permissions and server-specific settings need to be applied.

This article will describe Microsoft’s approach to providing the database release artefact, called a DacPac, which is Microsoft’s own package structure that is is manipulated via its DacFx (Data Tier Application Framework) technology. The DacPac is a .ZIP-based (‘package-based’) approach to database deployment, which strikes a chord with developers of reasonably small databases that are associated closely with a single application, but which has broader practical uses.

Working with DacFx and DacPacs

The DAC approach has been around since SQL Server 2008R2 (v1.0) and the current version at the time of writing is DacPac v3 for SQL Server 2012 and 2014. Earlier versions were restricted to supporting only the cut-down feature set of the original SQL Azure, and had a number of weaknesses that have now mostly been ironed out. The APIs have changed considerably, as a result. It is a good idea to ensure that you have the latest version.

The DAC .NET class library is used by SSDT for deployments that update existing databases, as well as modelling and validating database schemas. It used to be in the Microsoft.SqlServer.Management.Dac namespace, but nowadays is more separated from SMO, in Microsoft.SqlServer.Dac. Some essential functionality, such as getting the current version number from a registered database, got lost in the move; and the API changed significantly. There is therefore currently no simple method for getting a version number from a database. There is a DacServices API to create and use DacPac files for deployment-related tasks such as examining database schemas, modifying the deployment plan or doing Static Code Analysis.

You can work with DacPacs using one of the following:

  • Visual Studio or SSMS wizards
  • Visual Studio VSSC
  • The managed API, DacFx namespace , that is distributed with SQL Server, used with a language like F#, PowerShell or C#
  • An MSBuild Task
  • A command-line utility called SQLPackage.exe

The DacFx namespace and the SQLPackage.exe/DacUnpack.exe command-line utilities are intended to match feature-for-feature. You choose whichever is more convenient. DacFx provides the grunt whichever alternative you use.

What is a DacPac?

The ‘portable’ version of a database design is a DacPac, which is merely a zipped directory. It is described by the pundits as a ‘self-contained database entity used by an application’, representing the ‘declarative model’ of the database, which can be authored alongside application code.

A DacPac can be built from scratch, created from a Visual Studio 2010 Data-Tier project, compiled from source code, or extracted from existing databases. We can deploy DacPac both to Azure SQL Databases or SQL Server, either as a new database or to upgrade an existing database. In the zipped directory is a series of files called ‘parts’. All DacPacs contain at least…:

  • An XML file that represents an XML ‘model’ of the database’s metadata
  • A file containing information about the DAC framework that produced the DACPAC file

The fact that a DacPac contains an XML ‘model’ of the metadata means that you can find out what’s in the database. It also enables an application can use a DacPac for metadata comparisons and synchronization without having to parse a build script. The model is like a document-based object-oriented version of the INFORMATION_SCHEMA views, equivalent to Redgate’s ‘SQL Compare Snapshots’ produced by SQL Snapper. If your comparison tool, like SQL Compare, can read a build script, or collection of object scripts just as easily as it can a live database or backup, then you don’t need the model.

Zipping up all this information up is a good idea because the XML-based ‘declarative model’ is huge. Once unzipped, it is easy to extract the information from the model with XPath queries. Optionally, a DacPac may also include:

  • An XML-based refactor log for auditing changes
  • Pre- and post-deployment SQL scripts to take care of the creation of any build components that cannot be included in the update script
  • Data – if included this is stored, a directory per table, in JSON format
  • Any other contributor artefacts such as SSIS packages.

You can unzip a DacPac simply by right-clicking the file, if DacFx is installed on the machine. If you are scripting, then it can be done from the DacFx .NET library. There is even a Package API within DacFx, to inspect the package parts in more detail. The advantage of these approaches over a simple ‘unzip’ is that they create a build script ‘Model.SQL’ whilst unpacking the DacPac. If you have specified data as well as tables, you’ll see a data directory with a subdirectory for every table, containing one or more files with JSON data like this:

A ‘publish’ operation (DAC-speak for a deployment) requires a ‘main project’ but can contain references to other DacPacs so as to allow several DacPacs to be published as one ‘logical’ database.

Deployment (‘publish’) of a simple database seems ridiculously easy, and can be done via SSMS by right-clicking on ‘databases’ in the Object Explorer, but as soon as you are faced with deploying to a number of server environments with different hardware configurations and access-control requirements, or you have to wrestle with replication, messaging or downstream reporting systems, then things can get complicated.

Fortunately, a deployment via DACPACs can be done using an XML-based ‘Publish’ file (actually an MSBuild ‘manifest’) that records most of the settings needed for a successful deployment. With this, and because DACPACs allow SQL scripting, and support the use of SQLCMD with variables, DAC is versatile enough to cope, especially if you have an intellect of somewhat awesome complexity. The obvious downside of the DacPac, that one notices when going beyond deploying AdventureWorks, is that it is slow, even when just doing comparisons it is much slower than SQL Compare, and if you decide to use it to copy database and their data, then be sure to bring sandwiches.

Doing Things with DACPACs and DAC technology.

Where it is easy to do, I’ll show both the command-line and DacFx way of doing things in PowerShell.  I’m using the command-line within PowerShell, but I’ll add a few DOS Command prompt examples later on for the full hipster experience, or where your build server requires a DOS shell.

The DAC library allows you to deploy, create, rename, register, create objects, detach, delete, data transfer, disable constraints, move data, enable constraints, copy permissions, set read-only, upgrade, unregister, update registration, set read-write, and disconnect users. We aren’t going to attempt to show all these in the article, but hopefully once you’ve seen and run these examples, it will become clearer how to do so.

These samples are really snippets. They don’t have the full error-checking, logging or such. Sometimes, I use existing objects within the PowerShell ISE or assume you have DacFx loaded, which you can do like this:

(You may want to create a directory that you maintain with the latest version of DacFx in there rather than use Microsoft’s default installation location). In all cases with these snippets, you have to change settings like 'MyInstance', 'MyPathAndFile' or 'MyDatabase' to real server instances, file specs and database names.

Getting a DacPac from an existing database

I can use the SQLPackage command-line in order to extract the DacPac. Here it is being executed within PowerShell. You will need to fill in the variables at the start and you might need to change the path to the file.

Listing 1

The drawback to the SQLPackage method is that the documentation doesn’t mention a way of saving a .NET style ‘x.x.x.x’ version number. To do that,you’ll need to use the .NET DacFX class to do the extraction of the DacPac from the database.

Listing 2

So now you can see how to add a version number!

Getting a build script from a DACPAC

If you already have a DACPAC, you can  get a build script from it using GenerateCreateScript method.You can also use DacUnpack, which creates the build script and you can then and pull the build script from it into a string variable $content. Here is the first method

Listing 3.0

You can, of course, do the same thing by unzipping the DacPac and removing it. Instead of the final line, you substitute these two

Listing 3.5

This routine allows you to get at any of the XML files as well. I delete the directory after extraction purely because the unpack method pins its ears back and refuses to work if it exists. This routine gets the script from a DacPac. To get a script from a live database you just add a routine to extract a DacPac from an existing database. It seems quicker than using SMO to do it.

Creating a new database from a DACPAC

This is pretty easy. Using the DAC library…

Listing 4

And alternatively using SQLPackage…

Listing 5

In both cases, I’ve asked DacFx to over-write the existing database.

Registering a database

This embeds into the instance the XML file that records the metadata, along with the version number, so that you can subsequently check if things have gone wrong.

Listing 6

Unregistering a database

This removes the clobber that has been put into the database instance to record the current version and metadata of the database.

Listing 7

Checking a database for ‘drift’

This checks a registered database to make sure that what is there matches what is supposed to be there. You can get and read a report of what has ‘drifted’.

Listing 8

Copying a database

There is a twinge of doubt about using DacPacs for serious use, because the documentation for Data-tier Applications (which I suspect is out-of-date) still states that ‘A data-tier application (DAC) supports the most commonly used Database Engine objects‘. If a car salesman brightly says that the wheels stay on the car most of the time, you’d be nervous. Same here: what objects can’t it do? The documentation ducks the issue by giving a splendid list of all the objects it can do. Later it amplifies ‘You cannot extract, register, or import a DAC from an existing database that contains objects that are not supported in a DAC. ‘

Let’s put that to the test. Can we make an exact clone of the metadata of Adventureworks 2012 using DAC, but without the data? Using SQLPackage in PowerShell …

Listing 9

Hmm. A quick check with SQL Compare shows that the only thing missing is the extended properties of the primary keys, which isn’t at all bad. Most of us can cope without those. So we can conclude that we probably can deploy a new database with a DacPac, but we’d need to test it with a database with lots of obscure knobbly bits to see if more metadata drops off in the process.

Now let’s run the second part again. Idempotent or what?

The DacPac is actually treating every publishing activity as a synchronization. The second time you run it, it will discover that the database already exists, and that the database is the same as the DacPac so after a pause for thought it will do nothing.

Otherwise, it will execute a SQLCMD script to ensure that it has the same metadata as the DacPac.

This is easy to test. We just delete all the unreferenced functions and procedures in the database with the following SQL (it won’t allow you to delete the referenced functions)

Listing 10

And execute it again. Lo, after a pause for thought, it recreates all the dropped functions and procedures, as well as all the related extended properties, as we’d expect. We repeat it and this time around it does nothing, because it is all the same already. This is going to make testing stuff easier!

Differential scripts

So far, we have just gone ahead and built the new database. We’ve now shown how it makes changes to a database to synchronize it to the same version as the DacPac. We’ve been able to do this because Dac always executes a differential script (sometimes called a synchronization script). The script that is generated is actually a SQLCMD script which has variables. We might want check any differential or build script before we go ahead and run it. We might even want to tweak it or make changes before applying it. We can specify the values for the SQLCMD variables in the command line via DacFx or edit them in later ourselves.

Let’s generate this SQLCMD differential script. By making a simple change to our cloned copy of Adventureworks by deleting a stored procedure, and then see if we can generate a script that just updates the clone to bring it in line with the contents of the DacPac.

Listing 11

Here is the start of the differential script. You will see the four :setvar lines which parameterize the most important aspects of the script.

Listing 12

And as we would expect, only the change required, the creation of the stored procedure, is there in the differential deployment script, though I haven’t shown it in the extract in Listing 12.

We tweaked the deploy options to do this. This is a hint of powerful magic that is available. With a DacPac, you can choose whether the script should specify that a backup should be made before deployment, whether the script should be blocked if there has been drift, whether the existing database should be dropped, whether replication objects should be scripted, whether the deployment should be verified, and so on. There are a lot of knobs you can twiddle to make the deployment fit your requirements. If you can’t find a suitable knob to twiddle you can write your own deployment contributor or executor which lets you completely customize the output

Extracting data into a DACPAC as well as metadata

You can extract the data from a database into a DacPac, though normally you’d only want to do this for the short list of tables of static data that are needed for the database to work

Listing 13

Even with Adventureworks, out it pours, all 21Mb of it (zipped) unzipping to 119Mb. The data, in JSON table format is 116MB. It is far better to restore a backup, attach an MDF or use SQL Clone. There is evidently a DacPac way of using native BCP instead, which is a much better idea, but it isn’t documented properly anywhere obvious.

We then attempt to populate an empty version of Adventureworks (AdventureWorksClone) with this data

Listing 14

It certainly attempts to do this, but crashes.

Why truncate an empty table? So it seems that the seamless population of an existing database with data is off the menu. . A far better way of doing it for small tables is to provide a stored procedure in the database that is executed by the post-deployment script. The stored procedure would probably best use, for each table, a MERGE statement that can be used to reset static table values from a multi-row VALUES statement, or deleted after use.

Some DOS Command-line Examples

You don’t need PowerShell at all to use DacPacs. Here are a few command-line examples that you can also use in a batch file. In reality, typing commands in to the command prompt is more useful if you are using the Publish.xml manifest as you’ll be needing to specify several options and wanting to store them for reuse.

Listing 15

Packaging up Databases

DacPac technology, like SMO before it, has been slow to get the love, despite having some good features and a huge development effort given to it over the years. It has features that are useful to a database developer and DBA, but as it stands it is only part of the DLM solution.

One of the reasons why enthusiasm for DacFx has been muted amongst DBAs is because a database can only be packaged up like this if it is a simple one. The idea works for small databases that are isolated from other services and exist only to store the application’s data. However, the typical corporate database will have complicated interdependencies, compliance issues, security complications, high availability mechanisms, agent tasks, alerting systems, complex partitioning, queuing, replication and downstream analysis dependencies. You can deal with these whilst using DacPacs but you lose the initial advantage of simplicity.

The most obvious complication is with permissions. Although these are scripted, they should not be in development source control. If security is done properly, permissions will be different at the different stages of deployment and production. Also, tables often have their physical location specified in SQL DDL code. This will vary from server to server, somewhat complicating the journey of a database from release through to production.

The DacPac gets around some of these problems by using a SQLCMD build script and by having a host of options for excluding parts of the metadata. It also allows DacPacs to include other DacPacs. Options are collected together in an XML manifest file that can be made specific to a particular database and server.

Another problem with a DacPac is that you can soon lose sight of the important principle that a database release is defined by the scripts in development source control. A database deployment, in turn, combines that release with the version-controlled server scripts and files in the Central Management System archive for that version that provide the relevant permissions, server-specific configuration and all the other clutter. What is in source control, rather than what is in the DacPac, must determine what is deployed. A DacPac isn’t source control. It can be copied, modified or combined with others, but not in a way that always provides a reliable audit trail. It would be nice to be able to check the contents of a DacPac with the version of the source in Source control just to double-check.

Is DacFX and DacPac ready for production use?

This may seem a silly question as they are already in production use, but mainly as part of SSDT projects, and few people seem to be, as  yet, using them for more general SQL Server Database development.

After putting together these snippets, I then put together a lengthier script that took all the servers in my list of Local Server Groups and Central Management Server groups. There are a range of instances from SQL 2008 to SQL 2014. For each non-system database in every server, it created a DacPac in a central archive. Having done that, each time it was run it compared the dacpac with the live database and created a new Dacpac and report of the differences if it detected a difference. The script is too long to show but it is here on GitHub and in the downloads at the bottom of the article. Briefly, it worked, in that it spotted many of the changes I had made in the various test databases between test runs, but by no means all. In one case, 0n SQL2012  it detected a false positive every time, repeatedly believing that a particular function had changed. In another, it couldn’t detect a whole collection of Table-valued functions that I inserted into a test database. I thought there was something wrong with my script, and it is a definite possibility, but the same thing happened using the command-line version by hand, and with a PowerShell script using the Command-line SQLPackage.exe. It seemed, from examining the progress messages, that the comparison process seems to be occasionally swallowing parsing errors and reporting them only as status changes (which you normally don’t see) and failing to detect differences as a result. Some more investigation revealed that SMO, which DacFx uses to access databases, had’nt picked up a table deletion and was causing an exception that wasn’t properly handled by DacFx (it should have aborted the process)

The DacFx defaults aren’t always sensible so it could be that I’d missed a tweak. One thing I noticed was that transactions aren’t used by default for synchronization scripts so if things go wrong, there is no rollback and the database is left in an indeterminate mess. The switch is ...
/p:IncludeTransactionalScripts={True | False}
...
and I believe it should be set to True, though I haven’t tested it.

DacFX can spot many changes I thought would prove difficult. It seems to have full filegroup support, detects linked servers, and seems to deal well with server-level objects and database-level settings. It deals well with the deployment of certificates as well as both asymmetric and symmetric keys. It is often able to detect table renames.

Compatibility with SQL Server is a slightly grey area, though the compatibility with SQL Azure is well-documented. It seems good though according to my tests, as it is able to compare SQL Server 2014 in-memory objects. It can compare database settings but, unsurprisingly, cannot detect Agent jobs. Full Text stoplists and Encrypted objects get deployed but without their content. It gets confused by case differences in partition schemes, some Table-valued functions and assembly dependencies. I suspect that a few more more problems will emerge with further testing.

The filtering of classes of database objects is very comprehensive, and in reading through some of the options, it is obvious that the designers of DacFx are aware of the complications or replication, Cryptographic providers, and Change Data Capture.

Conclusions

DacFx and DacPac technology is starting to show great promise as a general database deployment tool, and there are already available third-party alternatives such as Redgate’s DLM Automation, which uses the well-established NuGet package format. The availability of such tools helps make SQL Server the more obvious choice for departmental databases, and a robust replacement for MySQL databases, because they make database deployment so simple.

DacPacs are a good way experiment with DLM, but if I were needing to use DAC technology for deployment work on any enterprise database, where risk is not an option, then I would want to do deployments within a transaction, and rollback on the first indication of problems. I’d want to check the scripts very carefully, and test them thoroughly, to make sure that the deployment scripts and manifests do what I expect.

References

DevOps, Continuous Delivery & Database Lifecycle Management
Go to the Simple Talk library to find more articles, or visit www.red-gate.com/solutions for more information on the benefits of extending DevOps practices to SQL Server databases.

Downloads

Tags: , , , ,

  • 21490 views

  • Rate
    [Total: 37    Average: 5/5]
  • rogerthat

    Fantastic review!
    Thanks for the great review – we currently use Visual Studio Database Projects for production change control.
    I like the granularity it provides, but if the database is big, the complexity of all the pieces becomes close to unmanageable.
    We were considering looking at DacPac, but it looks like we will be sticking with database projects for a while, though.
    Thanks for the footwork!

  • paschott

    Great overview!
    I really appreciate this. I’ve used SSDT and its precursors for some time and appreciate the functionality that they’ve provided. The deeper dive into DacFX, PowerShell, and how everything could work outside of that environment was interesting and useful.

    Personally, the changes from VSDB Projects to SSDT were well worth it – smaller deploy sizes led to faster releases. The only change I really miss from VSDB was the "ignore column order" option. It didn’t come up often, but was _really_ helpful when it did. I know that better handling for 2012+ features in SSDT was also well worth it. You might be able to work around that with the older VSDB projects for a while, but it’s not going to be fun. I’d recommend checking out the SSDT projects in a new branch to see how they compare – you might find that they work pretty well with minimal changes and no issues in your environment.