As head of the DBA Tools team at Red Gate, I’ve visited many DBAs to talk to them about their day-to-day tasks and responsibilities, longer-term goals and objectives, the sort of tools they used in their jobs, and so on. Many of these DBAs entered their role determined to make a big difference to the efficiency of their organization’s applications, databases and infrastructure, to help make the change that would really drive forward, rather than simply sustain, their businesses. In most cases, to a great or lesser extent, reality had intervened.
I’ve visited many DBAs to find out about their tasks and challenges and how they tackled them. I’ve often heard how the DBA joined their respective organizations with ambitious goals. In one case, the DBA’s primary goal was to overhaul the organization’s system architecture, setting up clusters for high availability and migrating existing applications and databases. In another, it was to set up a BI and reporting environment that the business could use to drive decision-making. Rather than being able to focus on these projects, they found their time disappearing into the daily churn of user complaints, investigating intransigent performance problems, battling disk space issues, contending with backup failures, and all of the other tasks that are vital for keeping the operational lights on. In neither case had the DBA made substantial progress on their primary goal, despite the fact that both they and their organizations were highly motivated to do so.
Many DBAs will empathize. The number one responsibility of the DBA is guardianship of the data, to protect it and make sure it is constantly available to the people who need it. If they fail to keep the lights on and the doors open they, and in time their organizations, will fail. Unfortunately, dealing with this ‘daily churn’ often leaves little time for the projects that will help make the organization more efficient, give it a better understanding of its data, and so offer it true competitive advantage.
I heard many similar stories of partially or largely thwarted attempts to build BI solutions, or overhaul system design, or improve the way they deployed their applications, or managed their backups, or any one of numerous other such tasks that would offer great value to their company.
Having previously spent five years as a system administrator, I have a few first-hand experiences of what it is like to be at the sharp end of “keeping the lights on” and ensuring business continuity in the face of disaster. The more I have learned, the more I came to understand that the traditional software tools that we had provided up until now were a great help but not always the whole answer. Cloud services have a pivotal role to play in reducing this infrastructure drag, and daily task churn.
Traditionally, there was a one-to-one relationship between an application and its physical server and so of course a direct correlation between the health, configuration and scalability of that server, and the smooth running of an application workload.
As Virtualization technology has improved in recent years, many organizations have been able to rationalize, moving workloads (testing workloads, for example) from many physical servers to a few physical servers running virtual machines (VMs). This introduces a layer of “abstraction” between application and the underlying hardware but it still allows the developer or DBA direct control over the configuration of everything from the operating system up. It reduces some of the burden of setting up and configuring and maintaining hardware, but of course there is still a cost to setting up and maintaining the virtualization environment. Ultimately, though, the performance and smooth operation of your database and application still rely on the health of your dedicated servers (or VMs). If one of them develops a fault, or fails, then someone will need to go in and fix it, quickly.
In a sense, the Cloud is simply a virtualization environment on a grand scale. The ‘Cloud’ works not on the level of individual machines but with container-sized units full of machines, all with water, network, and power plugged in. If one machine in the unit fails, the load automatically fails over to another machine. The Cloud hosting service does not have to rush in and fix the machine to ensure the continued smooth operation of your application. If a significant proportion of the machines in a unit fail (say 40%), then they will simply plumb in another whole unit, take the previous one offline and fix all the machines at once. Suddenly one person can manage hundreds of servers, and with such an architecture comes massive economies of scale. This is true utility computing, with computing resources (CPU, memory, I/O and so on) provisioned on-demand, and automatic failover, without any single point of failure, and theoretically at a much lower cost than would be possible with a traditional on-premises or even dedicated hosting facility.
By moving workloads to the Cloud, we can exploit, to a greater or lesser degree depending on the application, many potential advantages, including:
- Reducing design, development, administrative costs for a workload- either simply removal of the hardware burden, Infrastructure-as-a-Service, or hardware plus platform, Platform-as-service, or full Software-as-a-Service, where all you do is simply logon and use the application.
- High Availability – your application is available anywhere with an internet connection and the redundancy built into the Cloud architecture means uptime should be very high. In some cases, this can make the Cloud a convenient “failover” facility for an on-premises installation.
- Scale up – due to economies of scale, it may simply prove cheaper to achieve vertical scaling, in other words provision the horsepower and disk space you need, in the Cloud, rather than through conventional bare metal or visualized installations.
- Scale out -true “Cloud architected” applications, designed to allow “sharding” of data and workload across many worker nodes, can exploit the massive horizontal scaling possibilities that come with the Cloud.
The Disruptive Force of the Cloud
There is no doubt that with the “Cloud”, the ability to exploit reasonably low cost, efficiently managed, and distributed ‘utility computing’, is an increasingly appealing, but also challenging and potentially disruptive force for most organizations.
Some workloads are a great fit for the Cloud. Typically, they are those that need to be accessible via the Internet, those that are associated with fewer regulatory concerns regarding personally identifiable information, do not integrate tightly into other business workflows, and may be subject to ‘elastic’ variations in demand. However, for organizations and DBAs that care for large databases and their sensitive data, on which often many applications rely, are, rightfully, very conservative in their approach to the security, availability and performance of their databases. Dedicated hosting is one thing; for many, the Cloud is a leap they are hesitant to make.
Nevertheless, most organizations are aware of the likely advantages of moving workloads to the cloud and, rightly, understand that it is a mistake to ignore its potential. Recent years have seen a rapid shift in attitude to the point where most organizations store at least some of their corporate data in the cloud. Many companies have been using “Software as a Service” for years, in the form of packaged applications, like Salesforce. Such applications are now largely cloud-based so these organizations store in the cloud some of their core customer relationship data. Use of Evernote, Trello and Dropbox and for writing and sharing documents has become ubiquitous. In all such cases, it represents an organization ceding some element of control over storage of significant business-sensitive data, in return for the advantages of the Cloud, such as “data anywhere”, removal of the infrastructure, administration and provisioning burdens, potentially reduced costs, and so on.
For some types of organization, the Cloud is a perfect fit. Many companies born in the last few years simply do not own any physical infrastructure. These startups focus on their core business ideas and let others build and provision their applications and databases as cloud services. Likewise, we know software development houses who, proudly, no longer own a single physical server. One such company (10th Magnitude) writes custom software applications
“Three or four years ago my business used to be write code and turn it over to the customer, and they’d run it on their own hardware, or in Rackspace or a provider like that. With Cloud infrastructure, it’s literally happening twice as fast and the cost is coming down” – Steve Harshbarger, CTO at 10th Magnitude
They now provide Platform-as-a-Service, backed by a cloud infrastructure. They develop in Azure/.NET and plug in third-party cloud services as required (for example, they use a cloud service for database backup, another service for payments and so on). They cater mainly to these startups, who have a good idea but few IT resources or experience. They help them get started and expand, provisioning more servers and different services, on demand.
For some types of workload, the Cloud is a perfect fit. The classic example is the application that needs to cope with ‘elastic’ demand, in other words, one that might sit more or less idle for 11 months and then be ‘hammered’ for one month. Microsoft gave the example of one of their own ‘charitable donation of time programs’, where employees can donate 2 hours of their time to a charitable cause. It is available all year but only used intensively for a month. In such cases, provisioning dedicated hardware, or making space in a shared, virtualized infrastructure, to cope with such short burst of demand is a considerable administrative burden.
Such workloads are perfect candidate for moving to a cloud architecture, especially when designing from scratch. For existing applications, it is possible that, with some redesign, to move such applications wholesale to the Cloud to take advantage of the built-in scale out potential. Alternatively, one can maintain a relatively small on-premises installation, and then provision extra ‘worker nodes’ in the cloud to scale out to cope with demand bursts.
For such organizations and such applications, cloud adoption has been relatively quick, since the benefits are so clear. However, the best ‘route to the cloud’ is often unclear for complex workloads with complex usage patterns and dependencies (databases being a prime example).
Types of Cloud Service
The cloud means different things to different organizations, and different things to different departments within an organization. It might take the form of software on a web site, or a virtual machine provided by a vendor, or a database service, or a simple file storage location.
There are three broad categories of ‘cloud’, or distributed, service. I will not delve into full definitions of each here (for that, see Buck woody’s article, https://www.simple-talk.com/sysadmin/general/the-ass-of-distributed-computing/), but broadly:
- Software as a Service, SaaS – provides a complete software solution as a service
- Platform as a Service, PaaS – abstracts the hardware and operating system and runtimes, essentially making available a complete deployment environment. Having deployed the application code, others can consume it as a service (SaaS)
- Infrastructure as a Service, IaaS – abstracts the hardware (CPU, RAM, Network, I/O). The service presents to you a Virtual Computer on which to build and run your applications
If for example, you have an on premise Microsoft CRM, deciding the best approach for moving to the Cloud is difficult. This is a core business application, and there is a lot of risk associated with the change, even though the organization realizes that there might be a big upside. Many organizations are still waiting to acquire for themselves the confidence that they know how to do these change programs and where the actual costs are going to be, where the actual risks lie.
Databases to the Cloud?
Those organizations that have approached the “okay, the Cloud is interesting, how can we best make use of it for our database workloads?” debate are drawing their own, different conclusions.
For many years, organizations have also been exploiting “non-Cloud” IaaS for various types of workload, including databases, using a dedicated or shared (colo) facility to host the physical machines for their bare-metal installations, or for their virtualized environment, with the hosting service assuming full responsibility for the hardware. The hosting facility will monitor the disk array, swap disks out, and rebuild raid partitions and so on, as necessary, and will provide failover capabilities.
As such, an obvious first step for many is “IaaS in the cloud”, in other words to move their Virtual Machines (VMs), on-premises or hosted, into the Cloud, renting the VMs, complete with a licensed copy of SQL Server, on Amazon EC2, or on Microsoft’s Windows Azure infrastructure (http://www.windowsazure.com/en-us/home/features/virtual-machines/). This is a ‘safe’ first step for many organizations as it means someone else takes care of the infrastructure and licensing but they retain a large degree of control over the management of their databases. They are still working with “normal” SQL Server in much the same way they would with on-premises SQL Server instances, except they do not control or have to build or maintain the hardware. However, they can still get a whiteboard and a pen and draw a diagram that locates their database server and its data and log files.
VMs in the cloud can have advantages over on-premises including better uptime, higher bandwidth and so on. Some companies use these them a failover facility for their on-premises SQL Servers. Setting them up has lower upfront costs when building a new virtualized environment.
However, the upside of retaining control over your SQL Server architecture is also a downside in that you still need to monitor, configure, patch and generally administer your server. When the time comes to scale out the architecture, to cope with increased demand, you need to build and implement that architecture.
If you opt for a Cloud service such as Amazon Relational Database Service (RDS) for SQL Server, Amazon takes control of both SQL Server and the host Windows VM, as well as certain administrative functions such as patching and backups (see http://aws.amazon.com/rds/sqlserver/ for more details). In this situation, DBAs get to work with SQL Server, as they know and love it, but relinquish granular control over what is possible in terms of backup and restore, which of course is a worrying prospect for many.
Such concerns aside, significant as they are, these sorts of Cloud options are the least disruptive from the point of view of the DBA, in that they continue to work with SQL Server in a very familiar fashion, and gain many of the Cloud benefits. However, to exploit the real ‘scale out’ power of the cloud computing requires moving to a cloud-architected database, such as Windows Azure SQL Database, designed to “shard”, or horizontally partition, data across many small database on many worker nodes. In return, you relinquish all control over not only the hardware but also SQL Server configuration and any notion of the exact location of your data and log files. In essence, you just ‘see’ a connection string and that is it.
Porting an existing database to SQL Azure is not necessarily straightforward, even if you have no requirement for massive scale-out, via sharding. If your database is IO-bound, due to heavy activity on a poorly-design then you no longer have the same storage leavers to pull. You have to deal with performance optimization in a different way, and provisioning in a different way. For a start, this is a shared environment, with other customers using the same compute nodes in the cluster, and potentially even sharing the same database (multi-tenancy). Unless you pay for SQL DB Premium edition, the resources available for your workload will depend, on how nicely you, and others, “play” in the shared environment. Potentially, you may need to perform a lot of tuning, and application rewriting to avoid throttling issues, optimizing application-database communication to deal with increased latency between the two, and so on. You may need aggressive application caching. You will also need retry logic and to deal with expected node failure and the need to reconnect. If you wish to benefit from the massive scale-out capabilities of the Cloud, you will need to take a deep look at possible sharding patterns for your data models.
In short, you have to be mindful that there is an explicit process to port the application, but once you have done so, you get the full benefits of a Cloud architecture.
Cloud Services and the DBA
For some databases, the statutory requirements for the storage of personal and financial information and attendant security regulation, such as the need for logging and recording of every attempt at intrusion, make them unlikely candidates for Cloud migration, unless to a private Cloud (although this option is likely only open to larger organizations, who already have a large data center).
However, for any database with less stringent regulatory restrictions, I believe the DBA should actively help their organization find a route to the cloud for at least some aspects of the related workload for that database. All DBAs want to retain a high degree of control over their production instances. However, for every production instance, there are likely several other non-production instances of the same database (development, testing and so on). Moving these environments into the Cloud can help, but this is only the first step in how the Cloud can help a DBA.
Every server comes not just with associated infrastructure “drag”, in terms of ensuring the server’s availability, security and performance, but also with the need to perform a set of essential maintenance tasks. One of the big problems is that each of these maintenance tasks comes with associated infrastructure drag, and this can really tie up a DBA’s time. For example, setting up backup verification and test restores comes with associated infrastructure drag, as does setting up dedicated SQL Server monitoring, as do ETL processes that move data between different environments, and so on.
Many organizations are already starting to recognize the convenience of shifting as much as possible of certainly this sort of ‘secondary infrastructure’ into a much cheaper managed environment like Amazon or Azure. Essentially, this is the “hybrid cloud” approach, whereby the core database systems, which may contain sensitive data, remain on premise, but surrounding infrastructure and services go to the Cloud.
I’ve already mentioned briefly the idea of using Cloud infrastructure for failover and disaster recovery facilities (see, for example, http://blogs.technet.com/b/dataplatforminsider/archive/2013/08/12/alwayson-availability-groups-fully-supported-on-windows-azure-infrastructure-services.aspx). Let’s take a look at a couple of other examples.
SQL Server Monitoring As a Service
Setting up SQL Server monitoring is a good example of a sound investment that nevertheless requires significant infrastructure “churn”. Most DBAs understand proactive monitoring of their servers to be a “good thing” that over time will reduce firefighting. However, monitoring a SQL Server infrastructure is non-trivial. If you do it “by hand”, you need to write and then schedule and maintain the scripts that collect all of the required memory, CPU, I/O, performance data. You need to schedule the job and alerts. You need to set up a monitoring infrastructure, comprising a VM or a machine running IIS, having another database to look after to be the repository for the monitoring data. You need to spend time managing that new infrastructure, its availability, data growth and so on. Finally, of course, you need to establish baselines for all your metrics, respond to alerts when metric value stray substantially from the norm, for sustained periods, and analyze the data for anomalies or worrying trends. Consequently, according to our research, only a third of DBAs use a commercial SQL Server monitoring solution. A third use, at least to some degree, custom monitoring scripts and the final third do no monitoring at all.
We worked very hard with our on-premises SQL Monitor tool to reduce the “set up and configuration” barrier, but it does not remove the need to set up and manage the monitoring infrastructure. However, what if, instead, all you needed to do to monitor your SQL Server databases was to download a small relay (a window service) that would sit close to your SQL servers, collect the monitoring data and store it in a secure, hosted location for you to view and analyze. What if this were possible regardless of where your SQL Servers were located on-premises, or a dedicated hosting site, or cloud hosted on an infrastructure as a service platform like EC2, or using a full platform as a service relational database like Amazon RDS or Windows Azure SQL Database.
Our plan is just that: to provide a cloud-hosted version of SQL Monitor that is just like SQL Monitor, but via the Internet. This would remove much of the concern about having to provision extra machines, extra databases, and all other hum and hassle that comes with keeping it all up and running. We’ll store the monitoring data in a secure “Red Gate” cloud and of course, once there, we can offer services based on analysis of this data, producing concise reports on key issues, uncover underlying trend or anomalies in resource usage, and so on. This will help the DBA understand quickly and easily what happened, what is happening, what is going to happen, why, and what they need to do.
A nice fringe benefit of using a SaaS version of a tool like SQL Monitor Hosted, rather than on premises, is that makes it much easier to keep up to date with the latest release and feature. It also, I believe, encourages in the vendor an attitude of “continuous improvement”.
At Red Gate, we tend to do releases every two or three months. However, many of our on-premises tool customers, because of the work involved in updating their existing systems, making sure the new version is OK to deploy, and so on, will hold back from new releases, or skip one or two versions. It means, of course, that they are missing the benefit of the new functionality, as it’s developed. With a hosted tool, the barriers are much lower; it becomes more like logging onto Google and finding out that there is a new version of Google Maps that you can check out right away.
Database Backup and Restore as a Service
One DBA I spoke to recently was responsible for a web-based e-commerce application subject to spiky, often high-intensity, demand. The organization had a primary and secondary data center, and high availability technologies were in place for the application.
Following a few issues with restore operations the DBA started using the “Scheduled Restore and Backup verification” functionality that we introduced last year in SQL Backup Pro 7. Each night he shipped the full and log backups to a third data center, where he set up scheduled restore operations, followed by DBCC checks on the restored databases. Every morning he reviewed the DBCC output across all of his databases to ensure all was well.
On the plus side, it all worked very smoothly and he reestablished confidence in the integrity of the database backups, and the data. On the downside, the solution came at quite a high price, both financially and in terms of his time. He now had to manage another piece of infrastructure, in the third data center, as well as another contract with another co-location provider, a new set of log shipping and restore/verification processes, yet another potential source of the 2AM blackberry alert in cases of failure.
What if there was a way, as an integral part of your normal database backup process, to enable an option to back up the data to secure hosted storage, where the provider then preformed test restores, ran DBBC checks, dropped the restored databases, and simply emailed you a brief report. Suddenly, you have all of the benefits of data and backup verification with no additional infrastructure “churn”.
Of course, there may be DBAs who adopt a rather conservative attitude to storing their actual data (as opposed to, for example, performance-monitoring data) in a vendor-owned section of the public Cloud. However, as discussed earlier, I do believe attitudes are changing quickly with regard to trusting Cloud tools vendors with business data, and for those databases not subject to strict regulatory control, the advantages are considerable.
With SQL Server 2012, you can now use the normal backup functionality to back up your SQL Server Database to the Windows Azure Blob storage. Likewise, as Red Gate, we offer Red Gate Hosted Storage for all SQL Server Backups, integrated directly into the SQL Backup Pro tool. Check one checkbox in the backup wizard and the tool will copy the backup file out to the cloud, as well as to your standard backup location (see https://www.simple-talk.com/cloud/cloud-data/backup-to-the-cloud-the-silver-lining/).
Red Gate has devoted a lot of it time trying to help the DBA with strategic tools that will help manage and automate backups or monitor their servers for performance issues, to help spot potential problems before they erupt into time-consuming crises. Many DBAs use our tools and others, write scripts and schedule jobs. They automate. They buy a little of the time they need for the ‘bigger’ projects that seek to fundamentally improve business processes, make some limited progress, but then run up against various issues that slow them down, and then find themselves sucked, inexorably, back into the daily task “churn”.
More and more, organizations are looking for ways to mitigate the setup and configuration cost, and to avoid altogether the additional infrastructure management cost. In this respect, I believe Cloud-based tools and services have a crucial role to play.