By: Ken Adams 

Senior BI Consultant at Key2 Consulting

 

As was mentioned in a previous blog post comparing the pros and cons of Microsoft Azure SQL Database versus SQL Data Warehouse, Microsoft announced at SQL PASS 2016 that they are offering a free, 1-month trial of SQL Data Warehouse. In this post, I wanted to take Microsoft up on their offer and show you how to get started using Azure SQL Data Warehouse (SQL DW)!

So let’s get started. First, head over to the free trial link from above –

 

SQL Data Warehouse free trial

 

You will notice you’re applying for a free trial and that there are a couple of things required. The important details are:

An Azure account is necessary – you can sign up for free if you do not already have one.

  • Please note this does require providing a valid email address, phone number, and credit card or debit card number. As stated on the Azure sign-up site, you will not be charged for anything unless you decide to extend the trial. The card number is for identification purposes only. I guess we have to trust them here.  😉

A referral code was provided during Joseph Sirosh’s keynote at SQL PASS. Joseph Sirosh is over the Data Platform group, which includes Azure. If you weren’t at PASS, the referral code can be found with a simple Google search. I will make it easy on you –the referral code is SQLPASSROCKS.

Upon submitting a request, you must wait for an email with the next steps. By the way, the free trial includes compute resources of 200 Data Warehouse Units (DWU) and 2TB of storage. This seems to be a good starting point for a test drive of SQL DW. The 200 DWU represents 2 compute nodes and 30 distributions per node. You can find a good breakdown of the DWU and what that equates to in terms of computing power here.  The crux of DWU to compute nodes is seen in the chart below:

DWU #of
compute nodes
# of
distributions per node
100 1 60
200 2 30
300 3 20
400 4 15
500 5 12
600 6 10
1000 10 6
1200 12 5
1500 15 4
2000 20 3
3000 30 2
6000 60 1

Be careful that the reply from Microsoft does not find its way into your spam inbox, which happened to me. The email is seen below:

 

Microsoft SQL Server Azure SQL Data Warehouse thank you message

 

Once I submitted my request for a free trial, it took about 24 hours to receive a response. The response I received seemed a little more personal than what I was anticipating, meaning it appeared to come from an actual person. I was expecting an automated reply.

Here is the body of that message:

Thank you for submitting your request for Azure SQL Data Warehouse free trial.

As next steps to activate the free trial, please create a logical SQL server by logging into the Azure Portal. Please note that the logical SQL Server must be created in one of the following regions to participate in the Azure SQL Data Warehouse free trial program.

Available regions for Azure SQL Data Warehouse free trial: West US, West US2, West Central US, North Europe, West Europe, East Asia, South East Asia.

Please reply to this email with the following details within the next 2 business days.

1. Logical SQL Server name

2. The region in which you created the logical SQL Server. This should be one of the above-mentioned regions for you to be eligible for the free trial

3. The Azure subscription ID under which you created the logical SQL Server

We will get back to you with the next steps after reviewing the above-mentioned details that you provide. Please note that your one-month free trial does not start until we send you a confirmation email that provide you the offer validity period. We advise you not to create the SQL data warehouse instance in this logical SQL Server until we send you the confirmation email about the offer validity period.

You will notice that the response is time sensitive, so plan accordingly. Once I received the reply, I went ahead and logged into the Azure portal.

Clicking on the Azure Portal will take you to a login screen where you will login with your Azure credentials. Remember this was a prerequisite to signing up for the free trial. Once you log in, you will see your Azure dashboard. Your dashboard will likely be a blank dashboard unless you possess other services.

 

Microsoft Azure SQL Data Warehouse dashboard

 

The email received states that you need to set up a new instance in one of the required regions and provide that instance information back to the Azure team. I did that by clicking on the + New link at the top left of the Azure page and typing “logical” in the search box:

 

Microsoft Azure SQL Data Warehouse search box

 

That brought up SQL server (logical server), which I selected.

You are then presented with a screen with the filtered selections. The logical server should be the first option in the list.

 

Azure SQL data warehouse filter

 

Click on SQL server (logical server) and then create:

 

SQL Server Iogical server

 

Note that in the description, the logical server functions as a host for the various Azure database offerings. After clicking ‘Create’ you are ready to set up the server. You are required to fill in the following:

 

SQL Server logical server

 

I had previously established a resource group, so I used that. You may need to create a new resource group. I chose to have my instance pinned to my dashboard to make it easier to access. Upon selecting Create you are taken back to the dashboard, where you can see the instance being generated:

 

deploying sql data warehouse

 

In all of 30 seconds, the instance was created and the details were displayed. The SQL host information is what you need to provide to the Azure team. I have highlighted the request information below:

 

sql host information

 

Send an email with the highlighted pieces of information and wait for the green light from the Azure team. It took about 5 days to get approval from the Azure team in the DW free trial program.

From the time I sent the email back to the Azure team it took about 5 days to get the approval in the Azure DW free trial program. I want to point out a couple of things here:

– First off, the Azure team representative was helpful and professional. I was persistent in trying to get a response and resolution to the problem mentioned in the next bullet.

– Please note that the logical SQL Server name is case-sensitive. When providing the details requested please make sure the name you give them matches the name in Azure down to the case.  This caused some issues for me as my email client “corrected” the case of my instance name – it changed “sqldwtrialhost” to “Sqldwtrialhost”.  This caused a bit of pain for the Azure team to locate my logical SQL Server and probably delayed my approval by a day or so.

The acceptance email into the trial program arrives with a “Getting Started” guide and details about what you get in the trial.  A few highlights:

– 1 month of SQL DW service

– SQL DW service at 200 DWU

– Charges will start after 1 month unless you delete your instance

The guide provided by the Azure team is helpful in getting started. It details setting up SQL DW and many of the touch points along the way. I read that and then started down the road. First, log into the Azure portal and select the + New link\Database\SQL Data Warehouse.

 

Microsoft Azure SQL Server database

 

The next screen will be the details of your SQL DW instance. See my screen below:

 

SQL data warehouse

 

A couple of important items to note here:

– Select Free Trial for the Subscription since we’re signed up as part of the free trial group.

– The resource group should be the same group created while setting up your SQL Server.

– I left the source as “Blank database.”

– Server should be the server name set up and ultimately approved by the Azure team.

– Performance, as stated in the acceptance email, should be set at 200 DWU. You will have to modify the value or slider down as the default is 400 DWU.  My assumption is that if you set it too high you could get charged.

– I also selected Pin to dashboard to make it more accessible from the Azure portal.

Now click create and wait while the SQL DW instance is formed:

 

deploying sql data warehouse

 

The process to create the SQL DW instance took 2 minutes and I was ready to move forward. The dashboard pin is updated and shows that the SQL DW instance is Online.

 

sql data warehouse trial

 

After the SQL DW was deployed, it sent me directly to a configuration page. From here, we need to work on the firewall settings. Start by clicking on the server name as seen below:

 

Azure sql data warehouse firewall settings

 

Clicking the server name link brings up another configuration page. From here, you can access the firewall settings from a couple of places – click on Firewall from the menu on the left or by clicking on the Show firewall setting link:

 

Azure sql data warehouse firewall settings

 

This brings up yet another configuration screen:

 

Azure sql data warehouse firewall settings configuration

 

Now you need to determine what this rule represents. This should be the IP address of the machine where you plan to access Azure SQL DW. I have a Windows 10/SQL Server 2016 virtual machine that I used. I added a rule name, a start and end IP address, and saved my new rule. If you log in from different networks, you will also want to add those exceptions, as the IP address will be different.

 

machine address IP

 

 At this point, we are ready to try to connect to SQL DW. I’m using SQL Server Management Studio (SSMS) 2016 from my Windows 10 VM. Initially you can log in with the administrator login, which was set up when the SQL Server was created. However, you will want to set up a new user based on the guide sent along by the Azure team. My initial login was successful.

connect to SQL server

 

As seen in the SSMS Object Explorer:

SSMS object explorer

 

According to the guide, the administrator class that comes with the server is set to a smaller resource class with SQL DW. The new user will be set-up and given access to the larger resource class. The following code can be run to perform this operation:

 

 

Now that we have gotten past the administrative portion of setting up a user, I would like to start importing some data. Data can be imported into SQL DW from Azure Blob Storage, SQL Server, and HDInsight among others. You can find a summary of these options here. For the initial population of the SQL DW database, I chose the import from an Azure Blob. The Azure Blob Storage is analogous to Amazon’s S3 storage and is defined by Microsoft as a service for storing large amounts of unstructured data that can be accessed from anywhere in the world via HTTP or HTTPS. Fortunately there is publicly available data in Azure Blob storage that can be accessed. That made it easy to begin setting up a connection and importing data.

The first step is to establish a connection to the source data. I chose the Contoso Retail DW, which is a sample retail dataset.  The code to set up the connection is below:

 

Referencing external objects in this manner was new for SQL Server 2016.  It allows you to use T-SQL statements to connect to and query data stored in Azure Blob Storage or Hadoop.  You can find more information on this concept here.  Being able to query unstructured objects is very reminiscent of HiveQL querying a Hive table except that you get to use T-SQL!

In addition to the store dimension, I set up references to the Contoso production dimension on inventory fact table.

 

The external objects can now be moved into tables inside of the SQL DW using the CREATE TABLE AS (CTAS) construct.  CTAS is available only in SQL DW and Parallel Data Warehouse (PDW).  The largest of the three tables is Fact.Inventory which loaded in under 30 seconds.  The code snippet is below:

Now you have three tables loaded into the SQL DW.  You can see them in the SSMS Object Explorer:

 

 

You can also see the three external objects defined.  After or during the load into SQL DW you can see the statistics and progress via a system view query:

You get some cool information back such as the amount of data processed, the number of files ingested, and the query executed.

 

 

I dropped the tables and re-ran the CTAS script several times hence the duplicate entries.  After loading the data into SQL DW, it is suggested that you rebuild the indexes and create statistics. Tables by default are stored as a clustered columnstore index (CCI).  In order to ensure the data is fully compressed you need to rebuild the CCI after the data load completes.

In addition, statistics are not created automatically in SQL DW.  You will need to do this once the data is loaded.  Fortunately, there is a stored procedure available to do this for all columns in the database.  Create the stored procedure in your database and execute it:

Your database should now be ready to query and optimized to do so.  In a subsequent post, I will dig into some queries, play around with the resources by scaling down the DWUs and even look at an alternative for importing data into SQL DW.  Stay tuned…

To learn more about the various methodologies for getting started with SQL Data Warehouse, we highly recommend reading James Serra’s (Data Warehouse Evangelist at Microsoft) recent post on SQL Data Warehouse Reference Architectures.