
I am not a web developer, so when I promised Cerebrata that I would write this article on accessing the Azure Blob Service through the REST API, I didn’t know anything about using REST APIs. I could write a WCF service that offers up a REST API that could be called, but I’d never actually executed a request against a REST API and examined the response. That made this article seem daunting, and while it’s the fifth article in the series about blob storage, it was the last one written.

One question you may ask is why you would want to know how to use the REST API. Well, the Azure product team frequently releases new features (this is the firehose effect). Many times, the new features are accessible through the REST interface but haven’t been surfaced through a storage client library or the UI (like the portal) yet. If you always want to use the latest and greatest, learning REST is a required skill. Also, if you want to access blob storage with a programming language that does not have an SDK or storage client library, you can use REST instead.

I find that I am rarely the only one in a room who doesn’t understand something – I’m just the only one brave enough to admit my ignorance. Because of that, I’m going to write this from the standpoint of someone who has no idea how to make a REST call. This may bore those who are all-knowing and just want the syntax for Azure blob storage. If so, feel free to skip to the end and check out the reference links at the bottom of the article.

For those of you who make REST calls but want more details about the fields used, those who want to know what “canonicalized” means, and those of you who, like me, have never actually executed a request to a REST API, this is the place for you. If you want to run amok through the REST API for blob storage, this will give you a good basis to start. I’ll link to the MSDN documentation and show how to translate it into an actual REST call. If I’m successful, then you will no longer need to go home sick to avoid someone who asks if you have time to test their REST API. (No, I’ve never done that, but I had a coworker once who did. Don’t do that – never miss an opportunity to learn something new.)

REST means “representational state transfer”. Wikipedia starts explaining REST this way: “REST is an architectural style consisting of a coordinated set of architectural constraints applied to components, connectors, and data elements, within a distributed hypermedia system. REST ignores the details of component implementation and protocol syntax in order to focus on the roles of components, the constraints upon their interaction with other components, and their interpretation of significant data elements.”  Phew. It makes me tired just reading that.

What does it mean? To me, REST is an architecture you can use when calling APIs or making APIs available to be called. It is independent of what’s happening on either side, and of what other software is being used when sending or receiving the REST calls. You can write an application for a Mac, Windows, Linux, a Nokia Lumia 1520 [free product placement because I love mine], an iPhone, a Samsung Galaxy Note, or a web site, and use the same REST API for all of them. Data can be passed in and/or out when the REST API is called. The REST API doesn’t care what it’s called from – what’s important is the information passed in the request and the data provided in the response.

Now that you know more about REST than you ever thought you wanted to, let’s talk about writing code to make a REST call, specifically using the Azure Storage Services REST API. To use this, you actually have to have a storage account in an Azure subscription. I’m going to show an example for listing the containers. Once you get the hang of this, the other calls should be easy to figure out, and in case you have trouble, I’ll give you a link at the end to a project which has tons of sample code.

If you look at the Blob Service REST API, you’ll see all of the operations you can perform with the REST API. The storage client libraries are just wrappers around the REST APIs – they make it easy for you to access storage without writing the REST calls yourself. But we’re going to do this the hard way since that’s the point of this article.

MSDN: List Containers API

Let’s look at the page in MSDN for the ListContainers operation so you understand where some of the fields come from in the request and response when we get to the next section with the code.

Request Method: GET. This is the HTTP method you will specify as a property of the web request. This could also be HEAD, PUT, or DELETE, depending on the API you are calling.

Request URI: https://myaccount.blob.core.windows.net/?comp=list. We will build this from the blob storage account namespace https://myaccount.blob.core.windows.net and the resource string /?comp=list.

URI parameters: There are additional query parameters you can use when calling ListContainers. A couple of these are timeout (in seconds) for the call and prefix which is used for filtering.

Another helpful parameter is maxresults: if more containers are available than this value, the response body will contain a NextMarker element that indicates the next container to return on the next request. To use this, you provide the NextMarker value as the marker parameter in the URI when you make the next request. This is like paging through the results.

To use additional parameters, you just append them to the resource string with the value, like this:

/?comp=list&timeout=60&maxresults=100
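Here’s a minimal sketch of building that resource string, including the paging case. It’s written in Python rather than the .NET code this article is built around, and the helper name build_resource and the marker value are made up for illustration.

```python
from urllib.parse import urlencode


def build_resource(comp="list", **params):
    """Return a resource string such as /?comp=list&timeout=60."""
    query = {"comp": comp}
    # Skip parameters the caller left unset (None) so they don't appear in the URI.
    query.update({k: v for k, v in params.items() if v is not None})
    return "/?" + urlencode(query)


# First page: at most 100 containers, with a 60-second timeout.
first_page = build_resource(timeout=60, maxresults=100)

# Next page: pass the NextMarker value from the previous response as `marker`.
# "made-up-marker" is a placeholder, not a real marker value.
next_page = build_resource(maxresults=100, marker="made-up-marker")
```

You would keep requesting pages like this until the response comes back with an empty NextMarker element.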

Request Headers: This section lists the required and optional request headers. Three of the headers are required: an Authorization header, x-ms-date (contains the UTC time for the request), and x-ms-version (specifies the version of the REST API to use). Including x-ms-client-request-id in the headers is optional – you can set this to anything you want to; it will be stored in the storage analytics logs when logging is enabled.
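To make the two non-authorization headers concrete, here’s a small Python sketch (again, not the article’s .NET code) that produces x-ms-date and x-ms-version. The function name standard_headers is my own, and the version string is simply the one used in the examples in this article.

```python
import calendar
from email.utils import formatdate


def standard_headers(version="2014-02-14", timestamp=None):
    """Return the x-ms-date and x-ms-version headers for a request.

    `timestamp` is seconds since the epoch; None means "now".
    """
    return {
        # usegmt=True produces the HTTP date format,
        # for example "Mon, 05 Jan 2015 02:49:59 GMT".
        "x-ms-date": formatdate(timestamp, usegmt=True),
        "x-ms-version": version,
    }


# Example with a fixed time (5 Jan 2015, 02:49:59 UTC) so the output is repeatable.
fixed = calendar.timegm((2015, 1, 5, 2, 49, 59))
headers = standard_headers(timestamp=fixed)
```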

Request Body: There is none for ListContainers. This is used on all of the PUT operations when uploading blobs, as well as SetContainerAccessPolicy, which allows you to send in an XML list of data access policies to apply. These are discussed in the article about Shared Access Signatures, which will be published later in this series. (Links to all of the published articles are displayed at the top of this post.)

Response Status Code: Lists any status codes you need to know. In this example, it tells us that 200 is OK. Any other value you get back can be looked up in a complete list of HTTP status codes.

Response Headers: These include Content-Type; x-ms-request-id (a unique identifier that the service assigns to the request, which helps with troubleshooting); x-ms-version (indicates the version of the Blob service used); and Date (UTC, tells what time the response was sent).

Response Body: This is an XML structure providing the data requested. In our case, this will be a list of containers and their properties.

You will see these in the request and response information provided below, and in the code when creating the request and parsing the response.

Steps for creating the REST request

A couple of notes before starting – always use HTTPS rather than HTTP for security. Also, I recommend that you download Fiddler so you can examine the request and response of the REST calls. (It’s free.)

Let’s see how to build the REST request. This returns an HttpWebRequest object. The method signature and setting of some of the variables used are as follows:

For ListContainers, the method is GET. The resource is the query portion of the URI that indicates which API is being called, so the value is /?comp=list. I didn’t pull that out of the air – as noted earlier, it is on the MSDN page that shows the information about the ListContainers API. The resource is concatenated to the endpoint to create the request URI.

In this case, the endpoint is http://testsnapshots.blob.core.windows.net. I’ve hardcoded this value into the class where this code resides, along with the StorageAccount name and key. In a real application, those would reside in some kind of configuration file. The value for request URI ends up being http://testsnapshots.blob.core.windows.net/?comp=list.

For ListContainers, requestBody is null, the headers are null, ifMatch is string.Empty, and md5 is string.Empty. These are included in the method because there are other calls to the Storage Service that use them, so this will work regardless of which API you are calling. An example of where ifMatch is used is when calling PutBlob. In that case, you set ifMatch to an eTag, and it only updates the blob if the eTag you provided matches the current eTag on the blob. The md5 value is passed when creating or updating a blob and you want Azure to make sure the upload was successful by hashing the content and comparing the MD5 hash with the one you’ve passed in.
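If you’re wondering what that md5 value looks like, it’s the Base64-encoded MD5 digest of the content. A minimal Python sketch (the function name content_md5 is hypothetical, and this is not the article’s .NET code):

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the Base64-encoded MD5 digest of the request body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


# The value you would pass as md5 when uploading these bytes.
md5_value = content_md5(b"hello blob storage")
```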

Instantiate an HttpWebRequest object (in the System.Net namespace), providing the request URI.

Set the request properties for method (GET, PUT, POST, etc.). Set the content length to 0; this is the length of the body, and it can be set later if there is a body.

Add the request headers for x-ms-date and x-ms-version. Add any additional request headers required for the call. ListContainers has no additional headers. An example of an API that will pass in extra headers is SetContainerACL. For blob storage, it will add a header called “x-ms-blob-public-access” and the value for the access level.

If there is a request body, add a request header for Accept-Charset and set the ContentLength.

Create the authorization header (this is the hard part) and add it to the list of request headers. I’ll show you how to create the authorization header later in the article.

Fill in the request body (if there is one).

Return the REST request to the caller.
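The steps above can be sketched roughly like this in Python. This is not the .NET method the article describes: it uses urllib’s Request in place of HttpWebRequest, and the sign parameter is a hypothetical callback standing in for the authorization header routine covered later.

```python
from email.utils import formatdate
from urllib.request import Request


def build_rest_request(method, endpoint, resource, sign,
                       body=None, extra_headers=None, version="2014-02-14"):
    """Build (but do not send) a request, following the steps listed above.

    `sign` is a hypothetical callback that takes (method, url, headers) and
    returns the value for the Authorization header.
    """
    url = endpoint + resource                     # the request URI
    headers = {
        "x-ms-date": formatdate(usegmt=True),     # current UTC time
        "x-ms-version": version,
    }
    headers.update(extra_headers or {})           # e.g. x-ms-blob-public-access
    if body is not None:
        headers["Accept-Charset"] = "UTF-8"
        headers["Content-Length"] = str(len(body))
    headers["Authorization"] = sign(method, url, headers)   # the hard part
    # Note: Request stores header names capitalized ("X-ms-date");
    # HTTP header names are case-insensitive, so this is harmless.
    return Request(url, data=body, headers=headers, method=method)
```

One caveat with this sketch: when a body is present, urllib adds a default Content-Type at send time, and that would have to match what goes into the signature. For ListContainers there is no body, so it doesn’t come up.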

At this point, you can call GetResponse() on that REST request. This calls the API and gets the response back. You can examine the response StatusCode (200 is okay), then parse the response. In this case, you get an XML list of containers. Here’s the code for calling the GetRESTRequest method to set the request, executing the request, and examining the response for the list of containers.
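As a rough Python equivalent of that flow (not the article’s .NET code), the sketch below sends an already-built, already-signed request and pulls the container names out of the XML. The function names are my own, and it assumes the response has the EnumerationResults/Containers/Container/Name layout shown in the response body further down.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_container_names(xml_bytes):
    """Return the container names from a List Containers response body."""
    root = ET.fromstring(xml_bytes)
    return [name.text for name in root.findall("Containers/Container/Name")]


def list_containers(request):
    """Send a prepared, signed request and return the container names."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx status codes.
    with urlopen(request) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected status {response.status}")
        return parse_container_names(response.read())
```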

If you run Fiddler when making the call to GetResponse, these are the things you will see. The name of the storage account I’m using is testsnapshots.

Request:

GET /?comp=list HTTP/1.1

Request Headers:

x-ms-date: Mon, 05 Jan 2015 02:49:59 GMT
x-ms-version: 2014-02-14
Authorization: SharedKey testsnapshots:Es3ptO/u0bzuklCYDGBOPbBarCH8vE+L/dclRLG24M8=
Host: testsnapshots.blob.core.windows.net

Status code and response headers returned after execution:

HTTP/1.1 200 OK
Content-Type: application/xml
Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: 51d62149-0001-0001-0b72-c053cb000000
x-ms-version: 2014-02-14
Date: Mon, 05 Jan 2015 02:50:01 GMT
Content-Length: 1110

Response body (XML): In our case, this shows the list of containers and their properties.

<?xml version="1.0" encoding="utf-8"?>
<EnumerationResults ServiceEndpoint="http://testsnapshots.blob.core.windows.net/">
  <Containers>
    <Container>
      <Name>a-test-sas</Name>
      <Properties>
        <Last-Modified>Fri, 02 Jan 2015 02:31:03 GMT</Last-Modified>
        <Etag>"0x8D1F44B4A1B9A30"</Etag>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
      </Properties>
    </Container>
    <Container>
      <Name>a-testblob</Name>
      <Properties>
        <Last-Modified>Fri, 02 Jan 2015 02:31:20 GMT</Last-Modified>
        <Etag>"0x8D1F44B542F0DA7"</Etag>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
      </Properties>
    </Container>
    <Container>
      <Name>myimages</Name>
      <Properties>
        <Last-Modified>Sun, 07 Dec 2014 22:24:01 GMT</Last-Modified>
        <Etag>"0x8D1E083A2B38EF4"</Etag>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
      </Properties>
    </Container>
    <Container>
      <Name>myimages-01</Name>
      <Properties>
        <Last-Modified>Sun, 07 Dec 2014 22:53:47 GMT</Last-Modified>
        <Etag>"0x8D1E087CB6552D3"</Etag>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
      </Properties>
    </Container>
  </Containers>
  <NextMarker />
</EnumerationResults>

So that’s all it takes to make a request to the REST APIs for retrieving a list of containers from blob storage. That doesn’t seem too hard, does it? Oh. What about the authorization header? Oh yeah, the hard part.

Creating the authorization header

Creating the authorization header is the difficult part. I’m going to distill Microsoft’s authentication article down to what we need here. First, we will use Shared Key authentication. The authorization header format looks like this:

Authorization="SharedKey <storage account name>:<signature>"

That looks pretty simple, right? Oh wait, what is that Signature field? It is a Hash-based Message Authentication Code (HMAC) created from the request and calculated using the SHA256 algorithm, then encoded using Base64 encoding. Got that? (Hang in there, we haven’t even used the word “canonicalized” yet.)

This is the format of the Shared Key signature string:

StringToSign = VERB + "\n" +
               Content-Encoding + "\n" +
               Content-Language + "\n" +
               Content-Length + "\n" +
               Content-MD5 + "\n" +
               Content-Type + "\n" +
               Date + "\n" +
               If-Modified-Since + "\n" +
               If-Match + "\n" +
               If-None-Match + "\n" +
               If-Unmodified-Since + "\n" +
               Range + "\n" +
               CanonicalizedHeaders +
               CanonicalizedResource;

You won’t use most of these fields. For blob storage, you will specify VERB, md5, content length, Canonicalized Headers and Canonicalized Resource. You can leave all the others blank (but put in the \n so it knows they are blank).
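To show what “leave them blank but keep the \n” means in practice, here’s a minimal Python sketch (not the article’s .NET code) that assembles the string in the order shown above. The function name and its parameters are my own; the two canonicalized strings are passed in already built.

```python
def build_string_to_sign(verb, canonicalized_headers, canonicalized_resource,
                         content_length="", content_md5="", if_match=""):
    """Assemble the Shared Key string-to-sign in the documented field order."""
    fields = [
        verb,
        "",              # Content-Encoding
        "",              # Content-Language
        content_length,  # Content-Length
        content_md5,     # Content-MD5
        "",              # Content-Type
        "",              # Date (left empty because x-ms-date is sent instead)
        "",              # If-Modified-Since
        if_match,        # If-Match
        "",              # If-None-Match
        "",              # If-Unmodified-Since
        "",              # Range
    ]
    # Every field, blank or not, is followed by a newline.
    return "\n".join(fields) + "\n" + canonicalized_headers + canonicalized_resource
```

For a GET with nothing filled in, this yields the verb followed by twelve newlines, then the canonicalized headers and resource.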

What the heck are CanonicalizedHeaders and CanonicalizedResource?? Good question. What the heck does canonicalized mean? It’s not even a recognizable word in the Word dictionary. Here’s what Wikipedia says: “In computer science, canonicalization (sometimes standardization or normalization) is a process for converting data that has more than one possible representation into a ‘standard’, ‘normal’, or canonical form.” Uh-huh. In normal-speak, this means to take the list of headers (in the case of Canonicalized Headers) and standardize them into the required format. So basically, they decided on a format, and that’s what you have to use – that’s what this means. And in case you actually go read that MSDN page about authenticating the request, I’ll save you the trouble of looking up lexicographically in a dictionary and tell you that it basically means alphabetically.

Let’s start with those two canonicalized fields, because they are required to create the Authorization header.

Canonicalized Headers

To create this, take the headers that start with “x-ms-”, put them in a list, and sort them by header name. Then format them into a string of [key:value\n] bits concatenated into one string. For our example, this will return the following:

x-ms-date:Sun, 04 Jan 2015 00:48:38 GMT\nx-ms-version:2009-09-19\n

Here’s the code:

This calls GetHeaderValues which searches through the request headers and returns the value that goes with that header.
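For readers who’d like to see the idea in a few lines, here’s a minimal Python sketch of the same canonicalization (not the .NET routine described above). It assumes the request headers are available as a plain dictionary, and the function name is my own.

```python
def canonicalized_headers(headers):
    """Build the CanonicalizedHeaders string from a dict of request headers."""
    # Keep only the x-ms-* headers, with lowercase names and trimmed values.
    ms_headers = {
        name.lower(): value.strip()
        for name, value in headers.items()
        if name.lower().startswith("x-ms-")
    }
    # Sort by header name, then emit one "name:value\n" entry per header.
    return "".join(f"{name}:{ms_headers[name]}\n" for name in sorted(ms_headers))


example = canonicalized_headers({
    "x-ms-version": "2009-09-19",
    "Host": "testsnapshots.blob.core.windows.net",
    "x-ms-date": "Sun, 04 Jan 2015 00:48:38 GMT",
})
```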

Canonicalized Resource

This part of the signature string represents the storage account targeted by the request. Remember that the Request URI is http://testsnapshots.blob.core.windows.net/?comp=list. The canonicalized resource is built from the path and query portions of that URI, plus the actual account name (testsnapshots in this case). In our example this returns the following:

/testsnapshots/\ncomp:list

If you have query parameters, this will include those as well. Here’s the code, which also handles additional query parameters and query parameters with multiple values.
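Here’s a minimal Python sketch of that logic (not the article’s .NET code; the function name is hypothetical). It starts with the account name and the URI path, then appends each query parameter on its own line, sorted by name, with multiple values sorted and joined by commas.

```python
from urllib.parse import parse_qs, urlsplit


def canonicalized_resource(url, account_name):
    """Build the CanonicalizedResource string for a request URI."""
    parts = urlsplit(url)
    # Start with "/<account name>" followed by the URI path.
    result = "/" + account_name + parts.path
    # parse_qs URL-decodes the values and groups repeated names into lists.
    query = parse_qs(parts.query, keep_blank_values=True)
    params = {}
    for name, values in query.items():
        params.setdefault(name.lower(), []).extend(values)
    # One "\nname:value1,value2" entry per parameter, sorted by name,
    # with multiple values sorted and comma-separated.
    for name in sorted(params):
        result += "\n" + name + ":" + ",".join(sorted(params[name]))
    return result


list_containers_resource = canonicalized_resource(
    "http://testsnapshots.blob.core.windows.net/?comp=list", "testsnapshots")
```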

So now that we have the canonicalized strings, let’s look at how to create the authorization header itself. We will start by creating a string of the message signature in the format of StringToSign previously displayed in this article. This is easier to explain using comments in the code, so here it is, the final routine that returns the Authorization Header:

When running this, the MessageSignature for our example looks like this:

GET\n\n\n\n\n\n\n\n\n\n\n\nx-ms-date:Wed, 07 Jan 2015 02:51:55 GMT\nx-ms-version:2014-02-14\n/testsnapshots/\ncomp:list

The final value for AuthorizationHeader is:

SharedKey testsnapshots:qrdrdEbRfvfLr/79lxozcp0lbEIn2tk5CRkIHBN/TBY=

This is placed in the request headers before sending the request.
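The signing step itself is short once the message signature string exists. Here’s a minimal Python sketch (not the article’s .NET routine; the function name is my own). It Base64-decodes the account key, computes an HMAC-SHA256 over the UTF-8 bytes of the string, and Base64-encodes the result. The key used in the example is fake, so the output won’t match the value shown above.

```python
import base64
import hashlib
import hmac


def authorization_header(account_name, account_key, string_to_sign):
    """Return the value for the Authorization header (Shared Key scheme)."""
    key = base64.b64decode(account_key)   # the account key is Base64 text
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {account_name}:{signature}"


# A made-up key for illustration only; a real one comes from your storage account.
fake_key = base64.b64encode(b"not-a-real-storage-key").decode("ascii")
message = ("GET" + "\n" * 12
           + "x-ms-date:Wed, 07 Jan 2015 02:51:55 GMT\nx-ms-version:2014-02-14\n"
           + "/testsnapshots/\ncomp:list")
header = authorization_header("testsnapshots", fake_key, message)
```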

How about another example?

How about if we call ListBlobs for the container a-testblob? If you look at the MSDN documentation for this API, you find that the method is GET and the RequestURI is:

https://myaccount.blob.core.windows.net/mycontainer?restype=container&comp=list

It also has a query parameter called include with possible values snapshots, metadata, uncommittedblobs, and copy. You can see in the Message Signature below that by default, this requests the metadata and snapshots. With those values, this doesn’t just return the blobs, but also all of the snapshots of each blob. Snapshots will be discussed in a later article in this series. (Links to all of the published articles are displayed at the top of this post.)

If you run this through the code, these are the results you get:

Canonicalized Headers:

x-ms-date:Wed, 07 Jan 2015 02:57:51 GMT\nx-ms-version:2014-02-14\n

Canonicalized Resource:

/testsnapshots/a-testblob\ncomp:list\ninclude:metadata,snapshots\nrestype:container

MessageSignature:

GET\n\n\n\n\n\n\n\n\n\n\n\nx-ms-date:Wed, 07 Jan 2015 02:57:51 GMT\nx-ms-version:2014-02-14\n/testsnapshots/a-testblob\ncomp:list\ninclude:metadata,snapshots\nrestype:container

AuthorizationHeader:

SharedKey testsnapshots:BYPOycD66nFD8BQ7fVkgkiZ9orsWuTEWpG6E3ybw4s0=

These are from Fiddler:

Request:

GET /a-testblob?restype=container&comp=list&include=snapshots&include=metadata HTTP/1.1

Request Headers:

x-ms-date: Wed, 07 Jan 2015 02:57:51 GMT
x-ms-version: 2014-02-14
Authorization: SharedKey testsnapshots:BYPOycD66nFD8BQ7fVkgkiZ9orsWuTEWpG6E3ybw4s0=
Host: testsnapshots.blob.core.windows.net

Status code and response headers returned after execution:

HTTP/1.1 200 OK
Content-Type: application/xml
Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: e116a424-0001-0044-7ad3-292d57000000
x-ms-version: 2014-02-14
Date: Wed, 07 Jan 2015 03:00:36 GMT
Content-Length: 9665

Response body (XML): This shows the list of blobs and their properties and metadata. I’ve edited the list and removed all the entries but one for clarity.

<?xml version="1.0" encoding="utf-8"?>
<EnumerationResults ServiceEndpoint="http://testsnapshots.blob.core.windows.net/" ContainerName="a-testblob">
  <Blobs>
    <Blob>
      <Name>AZCopyTestImages/BeardedSealOnIceberg_Svalbarg_Norway_EN-US1166573819.jpg</Name>
      <Properties>
        <Last-Modified>Sun, 07 Dec 2014 21:39:32 GMT</Last-Modified>
        <Etag>0x8D1E07D6C60364C</Etag>
        <Content-Length>79310</Content-Length>
        <Content-Type>image/jpeg</Content-Type>
        <Content-Encoding />
        <Content-Language />
        <Content-MD5>MADqzfCVAL262BgtJ+Gbow==</Content-MD5>
        <Cache-Control />
        <Content-Disposition />
        <BlobType>BlockBlob</BlobType>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
      </Properties>
      <Metadata />
    </Blob>
  </Blobs>
  <NextMarker />
</EnumerationResults>

Summary

In this article, I explained how to make a request to the blob storage REST API to retrieve a list of containers or a list of blobs in a container. I showed you the code for creating the authorization signature and using it in the REST request, and how to examine the response. I have to thank Neil Mackenzie for his blog article that got me started on my way. And a big shoutout to David Pallmann. He has a project on CodePlex that shows how to perform every Azure Storage operation for blobs, queues, and tables using REST or the storage client library. It’s from 2011, but the REST API code still works. I updated the version and used it as the basis for this article. It really helped me understand the documentation posted on MSDN, which has too many words and not enough code.

Please check out the next article when it’s published; it will be about Blob Properties, Blob Properties properties, and metadata, and will hopefully explain it far better than MSDN does, at least at this time.