Getting Started with Azure SQL Data Warehouse

Azure SQL Data Warehouse is an obvious first-step towards migrating on-premise organisational data to the cloud. So how do you get started with it? Robert Sheldon provides a simple guide that should provide you with sufficient of the the basics you need to get a SQL Data Warehouse database up and running.

In my Simple-Talk article Azure SQL Data Warehouse, I introduced you to SQL Data Warehouse and gave you an overview of the architecture and technologies that drive the service and make it all work. In this article, I go a step further and provide details about getting started with SQL Data Warehouse, demonstrating how to add a sample database and then accessing the server and database settings.

If you want to follow along with my examples and try out SQL Data Warehouse for yourself, you must have an active Azure subscription, even if it’s just the free trial. For those who have already used up their free trial, be aware that SQL Data Warehouse is a pay-as-you-go service, even though it’s still in preview, so unless you’re on an unlimited company budget or happen to have accrued MSDN credits, you’ll want to be judicious in how you try out the service. Fortunately, as you’ll see in this article, you can pause the compute resources when not in use, helping to minimize the costs associated with learning about the service.

That said, let’s get started. We’ll create a database and poke around a bit in the server and database settings. It’s actually a fairly straightforward process, thanks to Azure’s intuitive and user-friendly portal. In a follow-up article, I’ll show you how to connect to the database, but for now, we’ll focus on getting you set up.

Add a SQL Data Warehouse database

When you create a SQL Data Warehouse database, you must specify a server and resource group. The server is a logical Azure server that resides in a specific geographical region and serves as a host for the databases you create through either SQL Database or SQL Data Warehouse. The resource group is a logical container for managing a collection of Azure resources.

You can create the server and resource group when you define your SQL Data Warehouse database, or you can use ones that already exist.

In addition to designating a server and resource group, you must also assign the initial number of data warehouse units (DWUs) that your database will use. The DWUs provide a measure of the computational resources (CPUs, memory, and storage I/O) available to your database, aggregated across participating nodes.

You can scale the number of DWUs assigned to your database up or down at any time. The more DWUs you assign, the better the performance but the higher the costs. The preview rate for the compute resources currently starts at 70 cents per hour for 100 DWUs, which comes to about US $521 per month, if you keep your database running full time.

With these variables in mind, let’s look at how to create a SQL Data Warehouse database. Your first step is to log into the Azure portal and then click New. This launches the New pane, which is populated with the various categories of Azure services. Click Data + Storage and then, in the Data + Storage pane, click SQL Data Warehouse, as shown in the following figure.

2403-image1.png

After you click SQL Data Warehouse, you’re presented with the SQL Data Warehouse pane, where you provide a name, set the initial DWUs, and specify the server, database source, and resource group. The pane looks similar to the one shown in the following figure, before configuring any of the settings.

2403-image2.png

Notice that the initial DWU setting is 400. If you’re just here to try things out, move the slider down to 100 DWUs, the lowest setting available. I like to take care of this setting before all others to make sure I don’t forget. No sense spending any more than necessary just to test the waters.

This is also a good time to provide a name for your database. For this article, I used SdwDB1, which is reflected in many of the screenshots, but use whatever works for you.

Once you’ve gotten the name and DWU settings out of the way, you can specify your server. Click Server in the SQL Data Warehouse pane to launch the Server pane. Here you can either select an existing server or create one. If you want to create a server, click the Create a new server option, which launches the New server pane, shown in the following figure.

2403-image3.png

In the New server pane, type a name for the server, a name for the server’s administer account, and a password for the account (twice). You should also choose the server location, based your geographical region. For this article, I named the server sdwsrv1 and the administrator account sdwadmin. I chose West US as my location.

Notice you also have the option Create V12 server, which refers to the latest SQL Database server version. This is because the SQL Database service underlies SQL Data Warehouse. When you create a SQL Data Warehouse database, you’re actually creating one of the SQL Database variety, with some extra bells and whistles to support data warehousing.

Currently, the V12 server is the only version you can choose for a SQL Data Warehouse database. For that reason, the option is enabled by default and you cannot change it.

Another option you cannot change is Allow azure services to access server. The checkbox is selected by default and is a must for a SQL Data Warehouse server.

That’s all there is to creating a server. Just be sure to click OK after you’ve filled in the blanks. From there, you can select the database source by choosing one of the following three options:

  • Blank database: Creates a new database that contains no user tables or data.

  • Sample: Creates a database based on a sample database.

  • Backup: Recovers a database from a geo-redundant backup.

The following figure shows the Select source pane with the Sample option selected, which is what I’ve used for the examples in this article.

2403-image4.png

When you select Sample as your source type, the SQL Data Warehouse pane adds the Select sample option. Currently, the only sample available is the AdventureWorksDW database, so it is selected by default. I’m assuming that Microsoft has included this option because it will be possible at some point to choose from other sample databases.

After we select our source, we can specify a resource group. When you click the Resource group option in the SQL Data Warehouse pane, the Resource group pane appears, where you can choose an existing resource group or create a new one. To create a new one, click the Create a new resource group option. This launches the Create resource group pane, shown in the following figure.

2403-image5.png

To create the resource group, you need only type in a name and click OK. (I named my resource group SdwGrp1.) After that, you’re ready to create your database. Before you do, however, you might want to select the Pin to dashboard check box so you can easily access your database from the dashboard. Then, if all your settings look satisfactory, click Create to generate your new database.

When the process start, you should receive a notification that keeps you posted on the progress. You can tell when you have a notification because the Notifications button at the top of the portal will include a red alert that indicates the number of messages you have not read. To access your notifications, simply click the button. This also allows you to monitor the progress of your database as it’s being created, as shown in the following figure.

2403-image6.png

When the process is completed, you should receive a second alert similar to the one shown in the next figure, indicating that the database has been created in the specified resource group, in this case, SdwGrp1.

2403-image7.png

If you pinned your database to the dashboard, you can easily access it from there. Just click the box that shows your database to open a pane that contains details about the database. For example, the following figure shows part of the SdwDB1 pane for the SdwDB1 database I created in Azure.

2403-image8.png

From the database’s pane, you can access a variety of information about your database, as well as the server, resource group, and subscription associated with that database. You can also scale the database (raise or lower the DWUs), restore the database, or open the database in Visual Studio or PowerShell.

Of the available options in the database pane, I think the most valuable one is the Pause button at the top. As the name suggests, you can pause the compute services used for the database at anytime, effectively putting the database on hold until you’re ready to use it again. This cancels out running queries and the DWUs assigned to the database, thus avoiding compute fees for unnecessary services. You still pay for the data storage – which preserves your data – but at least this way you get a break on the compute rates. When you consider how many hours a data warehouse can sit idle, the ability to pause a database can result in considerable savings, assuming you remember to take this step.

Working with the server settings

As you saw in the previous figure, the database pane lists the server’s fully qualified name, in this case, sdwsrv1.database.windows.net. This is the name you use when connecting to the server from another service or application, such as Visual Studio. (As mentioned earlier, I’ll be covering database access in my next article on SQL Data Warehouse.)

The server name, as it is listed in the database pane, also serves as a link for accessing the server settings. When you click the link, the server pane appears, along with the Settings pane, which is specific to that server, as shown in the following figure.

2403-image9.png

From the server pane and Settings pane, you can access a variety of information about the server. If you’ve been using the SQL Database service, you’re probably familiar with many of these settings. If you’re not, you can review these at your leisure to better understand what each one is about. The nice part is that you can pause the database while doing so to avoid extra charges when you’re simply trying to make sense of the features.

One group of settings that you’ll likely need to know about sooner rather than later are the firewall rules. To connect to your database from a service or application, you must add the necessary firewall rules to permit access, based client IP addresses. You can access the firewall settings in one of two ways:

  • Click the Show firewall settings link in the server pane.

  • Click the Firewall option in the server’s Settings pane.

Either approach launches the Firewall settings pane, where you can add the IP addresses of those systems that should be able to access your database.

When you’re first setting up your database, you’ll likely want to access it from an application on your local system, such as Visual Studio or Power BI Desktop. By default, the Firewall settings pane lists your system’s client IP address; however, you must still add a firewall rule for that address to enable local access. To do so, simply click Add client IP at the top of the pane. This adds a rule to the pane, as shown in the following figure. In this case, I changed the rule name to ClientIP1, rather than using the cumbersome default name. (The actual IP address has been grayed out.)

2403-image10.png

You can also add rules that define ranges of IP address to better accommodate multiple systems or addresses that are likely to change. After you create a rule, be sure to click Save to make sure it gets added to the server. It can sometimes take a few minutes for the rule to be applied, so you might have to wait a bit before you can connect to the database.

When you first set up the server and database, as I’ve described so far, you can use the server administrator account to connect to your data warehouse, which has full access to everything on the server. In the next article, I’ll dive into the details about setting up your user accounts and security roles to better restrict access, but for now, just know that these options are available to help you protect your data.

While we’re on the topic of connectively, another option in the database pane worth pointing out is Show database connections strings. Clicking the link launches the Database connection strings pane, shown in the following figure.

2403-image11.png

As you can see, the pane provides a handy way to get the connection strings necessary to connect from an application or service via ADO.NET, ODBC, PHP, or JDBC. You need only click the button to the right of each one to copy the connection string to your clipboard.

Working with the database settings

From the database pane, you can also access a variety of settings associated with that database. To get to these settings, click Settings at the top of the database pane. This launches the Settings pane associated with the database, which is shown in the following figure. (You can also launch the Settings pane by clicking the All settings link in the Essentials section near the top of the database pane.)

2403-image12.png

The Settings pane includes a number of categories that provide access to information and tools beyond just the basic settings. For example, you can click the Troubleshoot option to launch the Troubleshoot pane where you can get information about selected issues, as shown in the following figure.

2403-image13.png

To get to any information, you must select one of the predefined issues from the drop-down list. Currently, the list includes only the following two issues:

  • How do I migrate?

  • My queries are slow.

When you select either one of these issues, you’re provided with a link to documentation that addresses the topic. My assumption is that more issues will eventually be added to the list. (Note that the server settings also include a Troubleshoot option and its related pane, which includes more issues than what are available at the database level. However, all issues listed at the server level are specific to the SQL Database service, although they might prove useful when working with SQL Data Warehouse.)

The next option available to the database settings is Audit logs, which provides access to the Events pane, where you can view information about database events, as shown in the following figure.

2403-image14.png

Through the Events pane, you can carry out the following tasks:

  • Filter the information based on such variables as the event category or time span

  • Select which columns to include in the table

  • Hide or display the chart of summarized audit information

  • Export the audit logs to Azure storage or Event Hub

Next on the list of database settings is the Check health option, which launches a pane that is supposed to provide information about your database’s operations, as shown in the following figure.

2403-image15.png

Unfortunately, I was never able to get this pane to return any information other than what is shown in the figure. I tried logging into and out of the database several times within the allotted 10 minutes, on several different occasions, but still no luck. I’m assuming that this is related to the service still being in preview, but some other nefarious factor might be at play.

The database settings also include the New support request option, which launches the New support request pane and the Basics pane, the first step in acquiring support, as shown in the following figure.

2403-image16.png

If you have purchased a support plan, then you will be walked through the steps to request support, providing Azure can find the plan. If no support plan is found, your only option is to advance to Step 2, which provides the following links:

  • Buy a support plan (links to the Azure Customer Support page)

  • Ask your question on a forum (links to the Azure forums)

  • Connect with @AzureSupport (links to the Azure Support Twitter page)

You’re also provided the ability to specify the information necessary to link to an existing Premier Support contract or Azure Program Benefits information.

Within the database Settings pane, you’ll also find the Properties option, which finally gets us to the actual database settings. If you click the option, the Properties pane appears, as shown in the following figure.

2403-image17.png

The page provides read-only access to the following properties:

  • Pricing tier

  • Latest SQL database update

  • Status

  • Max size

  • Collation

  • Creation date

  • Connection strings

  • Server name

  • Location

  • Server admin login

  • Active Directory admin

  • Resource group

  • Subscription ID

  • Subscription name

For some of the settings, you can copy the property value to the clipboard by clicking the button to the right of that value.

Also in the Settings pane, you’ll find the Scale option, which let’s you set the DWUs assigned to your database, as shown in the following figure:

2403-image18.png

You can adjust your database’s DWUs at any time, boosting performance to meet your needs. Currently, you can scale up to 2000 DWUs, which currently comes to $14.00/hour. You can also access this pane directly from the database pane by clicking the Scale button at the top of the pane.

Another group of settings available to the Settings pane is Auditing & Threat detection, which launches the Auditing & Threat detection pane. By default, the database settings are inherited from the server. In fact, you can access the server auditing settings by clinking the View server auditing settings link, as shown in the following figure.

2403-image19.png

If you want to configure auditing at the database level, clear the Inherit settings from server check box, and configure the settings accordingly.

Whether you configure auditing at the server level or database level, you must specify how the audit data will be stored and what data to audit. You can also configure threat detection at this time, but that feature, like SQL Data Warehouse itself, is still in preview.

In addition to auditing options, the database settings include the Transparent data encryption option, which launches the Transparent data encryption pane, where you can enable data encryption, as shown in the following figure.

2403-image20.png

Transparent data encryption provides real-time data encryption and decryption capabilities for your data at rest, which includes the database itself, its backups, and the transaction logs.

Working with SQL Data Warehouse

You should now have the basics you need to get a SQL Data Warehouse database up and running. For the most part, Microsoft makes this process a fairly painless one. Just remember to pause the database when you’re not using it. Those subscription fees can add up quickly. (Luckily, storage is remarkably cheap for the AdventureWorksDW database.)

To access the data in your database, you’ll need to turn to a tool such as Visual Studio or Power BI. In the next article, we’ll look at several tools for accessing the data. If you can’t wait until then, you’ll need the following information to connect to the server: the fully qualified server name, the administrator account name for that server, and the password for that account. Just remember to reactivate the database if you already paused it.

In the meantime, if you have any questions about what I covered or want additional specifics, you can access more information on the SQL Data Warehouse landing page. There you’ll find links to pricing details, documentation, and other SQL Data Warehouse resources.

  • 10032 views

  • Rate
    [Total: 6    Average: 4.8/5]
  • Anonymous

    Thanks
    Thanks Robert, fantastic series of articles.. Keep them coming please 🙂 Would be great to see how Azure DW can work with SSAS Multi-dimensional in the next article!

    Cheers