Virtue in the Virtual

Are we now seeing just a frisson of excitement amongst database professionals at the potential of virtualization, and containerization technologies? I read with interest Andrew Pruski’s recent post on SQL Server and containers. Wow, I thought: so I can now run a Docker SQL Server Express in a Hyper-V Container on Windows Server 2016! Then I hesitated: but is this really going to be useful to me?

I’ve always considered the use of virtualization and containerization, as a means of provisioning a database server, to be of more interest to developers and testers, rather than DBAs and Ops people, because the hardware configuration is so important for the consistent performance of production database servers.

Indeed, it is in development and testing that use of Virtual Machines and containers has now become more commonplace; it is now relatively simple to set up virtual test environments for each developer, where they can install all the processes they need, including SQL Server, to run their tests. However, it’s still basically one instance per VM. What if a team wants to run multiple parallel database tests against realistic volumes of data? Surely, they can simply spin up multiple ‘containerized’ instances of SQL Server ‘on demand’, automate the install and configuration of SQL Server, load the data, run the tests, then immediately tear them down again? However, I don’t know many teams who are doing database testing in this way. Maybe all it needs is that flash of inspiration while wrestling with the technology to start to understand better how these technologies solve real problems.

At Redgate, we run an extraordinary number of tests, numbered in their thousands, whilst releasing a tool like SQL Compare or SQL Backup, using VMs. However, the step of provisioning a database with realistic data, for these tests, remained laborious. This chore inspired a developer to think of a creative way of exploiting VM technology, and the idea of SQL Clone was born. This tool just uses standard virtualization technology baked into Windows. A database clone is just a Virtual Hard Disk (VHD) mount point, holding the source data image of the database, plus a differencing disk, to which SQL Server unwittingly writes any changes we make to the data in the clone. SQL server writes to a disk as usual, but the same clone can be mounted on as many machines as you need for your test or development work, and the build and tear down is quick and relatively painless. As with the taming of Containerization to produce Docker, the rest was the struggle to make the technology easy for the end user to install and manage.

This makes me wonder if there are other ways of exploiting the virtualization technology in both Linux and Windows to solve problems. Containerization, especially Docker, can be convenient, but there is, surely, a lot of other ways that the technology behind PaaS and Virtual Machines can be exploited to make our working lives easier?

I’d be interested to hear if you are using virtualization and containerization as part of database development and testing work? Are there good reasons still to be wary?

Commentary Competition

Enjoyed the topic? Have a relevant anecdote? Disagree with the author? Leave your two cents on this post in the comments below, and our favourite response will win a $50 Amazon gift card. The competition closes two weeks from the date of publication, and the winner will be announced in the next Simple Talk newsletter.

  • 5068 views

  • Rate
    [Total: 4    Average: 4/5]
  • Peter Schott

    There’s a lot of interest right now amongst our devs in using Docker to easily stand up instances for the apps. However, I’m not sure how much work would have to be done to keep them current. The idea of spinning up SQL Express in Docker is interesting to play around with the capabilities, but I agree that data persistence will eventually be more important than creating/destroying an instance. We can easily create/drop a database once SQL is installed and that tends to work pretty well. We can use automation tools to help with the installation if needed, but once installed, it will just work. I guess the advantage w/ Docker is that it spins up a SQL Express instance pretty quickly.

    I like the idea of this for apps – have a container w/ your stack ready to go and deploy as needed. Build up the next version, release that, roll back quickly if needed. I have concerns that someone could easily lose data if they aren’t careful with containers for databases.

    As for virtualization – we do a lot of that with VMs, but not too much else at the moment. That’s definitely useful to snapshot a machine as needed or bring up a clone of some base machine, but most times we provision that out with Chef and just let that process set up the machine, install the appropriate software, set the right permissions, and hand it off to the team to manage/deploy software.

  • robwestwood

    We’ve had a lot of success in using databases for database in dev and test on a couple of projects (using PostgreSQL). In the first project we used to build a database image with the schema and and system data pre-loaded and push that to our internal docker registry. In the second we just just a vanilla PostgreSQL image and rebuild the database from scratch via hooks into the container startup (sorry for being vague here, I was not involved in setting this up). Our database scripts are kept in version control – that’s something we’ve always done for the past 15 years or so.

    Regardless, we’ve found it great for developers:

    – Easy to get both the current version of the database and any previous version
    – Stopped database drift between developers as it was easy to get the current version, rather than relying on devs to keep their own database up to date
    – Encouraged the use of version-controlled test data packs rather than just a dev inserting their own test data into their own instance
    – Also easy to rollback after running the app- just stop and restart the container
    – As it was easy to run tests that integrated with a database, more integration tests between a service and its database were written and they got run more often.

    So far we have not deployed containers to production for a database, but tend to use managed database services such as SQL Azure or Amazon RDS.

  • rogerthat

    We love how quickly we can spin up an Azure instance of a database and application server. We have scripted this to make it very easy to create new instances and allow other developers quick access to an environment they do not have to worry about making mistakes.

    I agree that 5 years ago, there were still concerns about virtualizing your database servers, but the progress made in the VM infrastructure (from all vendors) has won the case for virtualizing all our database servers.

    In my opinion and experience, there is no longer any need to fear VMs, even in production environments.

    We are in the process of evaluating containers as a means to provide CI.

  • Warner Chaves

    I believe there’s a business opportunity for a vendor like RedGate to use virtualization or containers for automated testing and results evaluation.

    Say I want to deploy PAGE compression on my database. I have to backup the database, trace a representative workload, restore the db on a test server, replay the trace and then evaluate if the results are the same, better, worse?

    It would be a lot less painful if this process was automated by a tool. Either the same workflow as above or let’s imagine something even more fancy like a live-replay that takes the traffic from PROD and in real-time streams it over to the test copy and then gives you an eval on how the two of them compared. This would be useful for testing new code, new features, SQL version upgrades, etc. It could even be part of a CI process where this test infrastructure runs automatically and then alerts or stops the deployment if a performance degradation is detected.

    What do you think guys? Cheers!