
In the first article of this series, I talked about there being two types of Blobs – Page Blobs and Block Blobs. Page blobs are mostly used for Virtual Hard Drives (VHDs) and are optimized for random reads and writes, whereas block blobs are used for most other types of files. Each of these blob types has its own .NET representation in the Client SDK – CloudPageBlob and CloudBlockBlob, respectively. In this article, I’m going to talk about the properties of the CloudBlockBlob. The previous article talked about using the REST API with Blob Storage; in this article, I’m going to use the .NET Storage Client Library to access properties that are surfaced specifically through that interface.

I searched high and low, on Google and Bing and even the closet in my office (the land of no return – there are even some VAX/VMS shirts in the back somewhere). I found it difficult to find seriously useful information written after Storage Client Library 1.7 (and we’re on 4.something now). In fact, some information is just not out there (see comments about MSDN later in this article). Hopefully, this article will fill that gap with updated and helpful information. Thanks to Gaurav Mantri and Mike Wood for all the discussion on the Properties properties. (More on that in the next paragraph.)

Properties and Properties properties

There are two sets of properties. The CloudBlockBlob object itself has properties, and one of those properties is called Properties. The Properties property has its own properties. When referring to the CloudBlockBlob.Properties property, I am going to always spell it in mixed case, so you can tell which Properties (or properties) I’m talking about. I’m going to talk about them separately after I talk about them together. (Are you with me so far?)

CloudBlockBlob properties

Most of the CloudBlockBlob properties are really information provided by the Client SDK to make your life easier. I think of them as “derived properties”; they are not stored with the Blob, they are created using information about the Blob. When you get a reference to a Blob, these properties are filled and returned to you. For example, there are four Uri properties; they are all some variant of https://[storageaccountname].blob.core.windows.net/[container]/[blobname]. Whether it is a URI pointing to a snapshot or a URI pointing to a non-snapshot in secondary storage, it is easy for you to say, “I want the URI for the Blob in primary storage” and just use the appropriate URI property that has been created for you by the Storage Client Library.
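To make the “derived” idea concrete, here is a small Python sketch of how those URI variants can be assembled from the same three pieces of information. This is not the Storage Client Library; blob_uri is a hypothetical helper I made up for illustration, and the timestamp in the usage lines is an invented value. It assumes the standard public endpoint, where the secondary location uses the account name with “-secondary” appended and a snapshot is addressed with a snapshot query parameter.

```python
from urllib.parse import quote


def blob_uri(account, container, blob_name, secondary=False, snapshot=None):
    """Build a blob URI from its parts (hypothetical helper, not an SDK call)."""
    # The secondary endpoint uses the account name plus a "-secondary" suffix.
    host_account = account + "-secondary" if secondary else account
    # Keep "/" unescaped so pseudo-folders in the blob name survive.
    uri = "https://{0}.blob.core.windows.net/{1}/{2}".format(
        host_account, container, quote(blob_name, safe="/"))
    if snapshot is not None:
        # A snapshot is addressed by adding a "snapshot" query parameter.
        uri += "?snapshot=" + quote(snapshot, safe=":")
    return uri


# Illustrative values only; the timestamp is invented for this sketch.
print(blob_uri("mystorage", "mycontainer", "folder1/blob1.png"))
print(blob_uri("mystorage", "mycontainer", "folder1/blob1.png", secondary=True))
print(blob_uri("mystorage", "mycontainer", "folder1/blob1.png",
               snapshot="2014-09-06T18:33:53.0000000Z"))
```

The point is only that nothing here needs to be stored with the Blob: account, container, and blob name (plus the snapshot time, if any) are enough to produce every variant.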

Because most of the properties of the CloudBlockBlob class are derived, they are read-only, except for two. The read/write properties are StreamMinimumReadSizeInBytes and StreamWriteSizeInBytes. These properties are only used by the client when reading or writing Blobs. They are not retained in Blob Storage somewhere to make everyone else use the same values. You want these values to be set by the client. You wouldn’t want to say “always upload this Blob in 4MB increments” if some of your customers had awesome broadband and some had terrible broadband, unless you really didn’t like the customers with terrible broadband and were trying to drive them away.

The other important piece of information you will find in the CloudBlockBlob class is the Metadata. This is a list of key/value pairs that you can create and store. When you get a reference to the blob, the metadata is not automatically retrieved; you have to call the FetchAttributes method on the blob to retrieve the metadata. To write the metadata, call SetMetadata. Note that SetMetadata overwrites the metadata, so if you don’t call FetchAttributes first and then modify them, the original values will be lost.
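The overwrite behavior is easy to trip over, so here is a toy Python model of it. Everything in it is made up for illustration: FakeBlobService and FakeBlob are in-memory stand-ins, not SDK classes, and the keys and values are hypothetical. The fetch_attributes and set_metadata methods only mimic the replace-not-merge semantics described above.

```python
class FakeBlobService:
    """Toy in-memory stand-in for Blob Storage: holds one blob's stored metadata."""

    def __init__(self, stored_metadata):
        self.stored_metadata = dict(stored_metadata)


class FakeBlob:
    """Toy stand-in for a blob reference with a local metadata dictionary."""

    def __init__(self, service):
        self._service = service
        self.metadata = {}  # empty until fetch_attributes() is called

    def fetch_attributes(self):
        # Copy the stored metadata into the local dictionary.
        self.metadata = dict(self._service.stored_metadata)

    def set_metadata(self):
        # Replace (not merge) whatever is stored with the local dictionary.
        self._service.stored_metadata = dict(self.metadata)


def add_key_without_fetch(service, key, value):
    """Add a key on a fresh reference; existing stored keys are lost."""
    blob = FakeBlob(service)
    blob.metadata[key] = value
    blob.set_metadata()


def add_key_with_fetch(service, key, value):
    """Fetch first, then add the key; existing stored keys are kept."""
    blob = FakeBlob(service)
    blob.fetch_attributes()
    blob.metadata[key] = value
    blob.set_metadata()


lossy = FakeBlobService({"uploadedby": "alice"})
add_key_without_fetch(lossy, "source", "mobile")
print(lossy.stored_metadata)

safe = FakeBlobService({"uploadedby": "alice"})
add_key_with_fetch(safe, "source", "mobile")
print(safe.stored_metadata)
```

In the first case the stored metadata ends up with only the new key; in the second, the fetch-modify-write sequence keeps the original entry alongside it.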

What would you store in the metadata? Anything you want to. Say you have customers uploading files – you could store their e-mail address or name in the metadata so you can verify who uploaded the file. If users are archiving their documents to the cloud, you could store the path to the file on the on-premises network in the metadata, and then you can see the origin of the file. If you have customers updating files in multiple applications, you could put the name of the application in the metadata – was the file updated via mobile device, from the web application, from a desktop application, etc.?

CloudBlockBlob Properties properties

Before we dive into the details, you may ask yourself why you can’t just look this up in MSDN. Go ahead, I’ll wait here. (Actually, I figure this will take you as long as it took me, so I’ve gone to Starbucks. Back in a minute.) (Sipping hot chocolate…) So now you’ve used Bing or Google to your heart’s content, and found that there’s not much detail in the documentation on the properties (or Properties) of CloudBlockBlobs. And what there is often provides a definition in the same words as what you’re looking for, such as “CopyState” – “A CopyState object containing the copy state, or null if there is no copy state for the blob.” Because of incredibly useful definitions like this, I’ve opened up Visual Studio and written a bunch of code trying out all of the properties (and Properties) to see what they do, or what they are.

The Properties of the CloudBlockBlob are called “system properties” by Microsoft. (I haven’t found any place where they define what that actually means.) I could just say that and leave it here, but that’s just cruel, and it would mean I wasted a lot of time figuring this out. I’ll explain it this way: these are properties that give you control over the information that other systems will be paying attention to. They are stored in Blob Storage and returned when reading the associated Blob. For example, if you set CacheControl, and someone reads the Blob using a browser, CacheControl will end up being in a header that’s set when they retrieve the Blob.

Most of the Properties properties are read/write. You might infer that Microsoft thought they should put these in a section by themselves to delineate them from the “convenience” properties available to people using the Client SDK, and so they could be passed around as a group. It’s unfortunate that they decided to call it Properties because most people find it confusing, but we all know that naming a project, class, or object is sometimes the most difficult part of the project, so a little understanding is probably in order, especially now that you understand the two different types of properties/Properties. (By the way, when doing research, I asked three people about these properties, and all three said, “Oh geez, not the properties properties thing!”)

To see the values in the Properties properties, get a reference to the Blob and call FetchAttributes. To update these Properties, you can call SetProperties.

Now let’s talk about the individual properties of the CloudBlockBlob and the CloudBlockBlob.Properties.

CloudBlockBlob properties (details)

BlobType BlobType (r/o) – BlockBlob or PageBlob, tells which type of Blob it is. (BlobType is an enumeration.)

CloudBlobContainer Container (r/o) – a reference to the container object in which the Blob resides.

CopyState CopyState (r/o) – the state of the most recent or pending copy operation. If the Blob has never been copied, this will be null. If the Blob has been copied or is in the process of being copied (which is now an asynchronous operation), it will provide information regarding that operation. So you can call StartCopyFromBlob (or StartCopyFromBlobAsync) then call FetchAttributes, and examine the properties of the CopyState. (Yes, it has its own properties, which include such helpful values as BytesCopied, Status, and TotalBytes.)

bool IsSnapshot (r/o) – true if the Blob you are looking at is a snapshot, otherwise false. Oops, that sounded like an MSDN definition, so let me give you a little more background. Snapshots will be covered in a later article in this series. You can basically take a snapshot of a blob, update it, take another snapshot, update the blob again, etc. Then you can retrieve earlier snapshots if you need to restore a previous version. You can retrieve a list of blobs for a specific base blob which will return all of the snapshots and the current representation of the blob. This property (IsSnapshot) tells you whether the one you’re looking at is a snapshot or the base blob.

IDictionary<string, string> Metadata (r/o) – user-defined metadata. You can put anything you want into here; it’s just key-value pairs. If you’re using Snapshots, each Snapshot will have its own version of the Metadata. Use SetMetadata to update the value of this property.

string Name (r/o) – the name of the Blob. This is the end of the URL after the container name. If the Blob is in any “pseudo-folders”, they will be included in the name and rendered either as a flat listing or a directory listing, whichever you select. For example, for URL http://mystorage.blob.core.windows.net/mycontainer/folder1/blob1.png, the Blob name would be “folder1/blob1.png”. The “pseudo-folder” concept is explained in the second article in this series.

CloudBlobDirectory Parent (r/o) – the CloudBlobDirectory object representing the virtual parent directory for the blob. This has a bunch of properties itself, including the URI to the parent directory for the blob. For example, in the URL in the Name property, the Parent would be a CloudBlobDirectory object pointing at mycontainer/folder1.
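If it helps to see how Name and Parent relate to the URL, here is a rough Python sketch that pulls the same pieces out of the example above. split_blob_url is a hypothetical helper, not part of any SDK, and it assumes the simple case: the first path segment is the container and everything after it is the blob name. It does not try to handle special cases such as a root container or the storage emulator’s URL layout.

```python
from urllib.parse import unquote, urlparse


def split_blob_url(url):
    """Return (container, blob_name, parent_path) for a simple blob URL."""
    path = unquote(urlparse(url).path).lstrip("/")
    # The first segment is the container; the rest is the blob name.
    container, _, blob_name = path.partition("/")
    # The pseudo-folder part is everything before the last "/" in the name.
    folder = blob_name.rpartition("/")[0]
    parent_path = container + "/" + folder if folder else container
    return container, blob_name, parent_path


container, name, parent = split_blob_url(
    "http://mystorage.blob.core.windows.net/mycontainer/folder1/blob1.png")
print(container)
print(name)
print(parent)
```

For the example URL this yields the container “mycontainer”, the name “folder1/blob1.png”, and the parent path “mycontainer/folder1”, which lines up with the descriptions of Name and Parent above.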

Properties – We’ve already discussed this; the properties of this property called Properties are listed in the next section.

CloudBlobClient ServiceClient (r/o) – a reference to the CloudBlobClient object that represents the service client used to access this blob. (This is discussed in the third article in this series.)

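StorageUri SnapshotQualifiedStorageUri (r/o) – this property has two properties – PrimaryUri and SecondaryUri. These represent the URIs for the primary and secondary storage locations, including the query string if the blob is a snapshot. Note that if the blob is in a storage account that is not set up as RA-GRS, the SecondaryUri property will still be populated, but the URI will not work. (RA-GRS is Read Access Geo-Redundant Storage, which is discussed in the first article in this series.) For a snapshot, these take the form https://[storageaccountname].blob.core.windows.net/[container]/[blobname]?snapshot=[snapshot date/time] for the primary and https://[storageaccountname]-secondary.blob.core.windows.net/[container]/[blobname]?snapshot=[snapshot date/time] for the secondary.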

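Uri SnapshotQualifiedUri (r/o) – this is the primary absolute URI to the blob, including query string information if the blob is a snapshot. For a snapshot, this takes the form https://[storageaccountname].blob.core.windows.net/[container]/[blobname]?snapshot=[snapshot date/time].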

DateTimeOffset? SnapshotTime (r/o) – if the blob is a snapshot, this is the date/time that the blob snapshot was taken. Otherwise, it’s null. Example: {9/6/2014 6:33:53 PM +00:00}.

StorageUri StorageUri (r/o) – this property has two properties – PrimaryUri and SecondaryUri. These represent the URIs for the primary and secondary storage locations. Note that if the blob is in a storage account that is not set up as RA-GRS, the SecondaryUri property will still be populated, but the URI will not work. (RA-GRS is Read Access Geo-Redundant Storage, which is discussed in the first article in this series.) These take the form https://[storageaccountname].blob.core.windows.net/[container]/[blobname] for the primary and https://[storageaccountname]-secondary.blob.core.windows.net/[container]/[blobname] for the secondary.

int StreamMinimumReadSizeInBytes (r/w) – this gets or sets the minimum number of bytes to buffer when reading from a blob stream. This is like StreamWriteSizeInBytes but in the opposite direction. Like that property, you may want to adjust this depending on the bandwidth available to the process reading the blob.

int StreamWriteSizeInBytes (r/w) – this gets or sets the block size for writing to a block blob. This was used in the fourth article in this series when discussing uploading files.
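To get a feel for what the block size means for an upload, here is the arithmetic as a tiny Python sketch. The client library does this chunking for you; split_into_blocks is a hypothetical helper that only shows how a payload breaks into pieces for a given block size, and the sizes in the usage lines are deliberately tiny, made-up numbers.

```python
def split_into_blocks(data, block_size):
    """Split a bytes payload into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


payload = bytes(range(10))           # a 10-byte stand-in for a file
blocks = split_into_blocks(payload, 4)
print([len(block) for block in blocks])
```

A smaller block size means more, smaller requests, which is kinder to a slow or flaky connection; a larger one means fewer round trips for clients with plenty of bandwidth.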

Uri Uri (r/o) – this is the URI for the blob in its primary location. This is the actual blob. Snapshots always have a query parameter appended to this Uri with the date/time stamp of the snapshot. This takes the form https://[storageaccountname].blob.core.windows.net/[container]/[blobname].

CloudBlockBlob.Properties properties (details)

Let’s look at the Properties properties in detail now. Unless you want to go try Bing or Google first. I hope you don’t, because I’m still full from the last hot chocolate.

BlobType BlobType (r/o) – BlockBlob or PageBlob, tells which type of blob it is. (BlobType is an enumeration.) (Do you ever have a feeling of deja vu?)

string CacheControl (r/w) – this is for instructing user agents (browsers, CDNs, and so on) whether and for how long the content should be cached. See section 14.9 of the HTTP/1.1 Header Field Definitions. Example: “max-age=3600”

string ContentDisposition (r/w) – this property sets the Content-Disposition header that is returned with the blob, which lets you suggest a file name for dynamic content. For example, if someone was going to prompt the user to download the blob, this could be used to suggest a file name for that blob. Example: “attachment; filename="myfile.txt"”

string ContentEncoding (r/w) – this is the method used to encode the data. Example: “gzip”

string ContentLanguage (r/w) – this is the language the content is in, using standard internationalization abbreviations such as “en-US” for US English or “en-GB” for UK English.

string ContentMD5 (r/w) – this is a base64-encoded binary MD5 value. For more information, check out this article. It’s from 2011, but it has lots of really interesting information on the use of ContentMD5.
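“Base64-encoded binary MD5” trips people up, because the value is not the familiar 32-character hex string. Here is a short Python sketch of how such a value is computed. content_md5 is a name I made up for the example, and the input is just sample bytes; the key step is that the raw 16-byte digest is base64-encoded, not its hex representation.

```python
import base64
import hashlib


def content_md5(data):
    """Return the base64 encoding of the raw 16-byte MD5 digest of data."""
    digest = hashlib.md5(data).digest()   # raw bytes, not hexdigest()
    return base64.b64encode(digest).decode("ascii")


sample = b"hello world"
print(hashlib.md5(sample).hexdigest())   # hex form, 32 characters
print(content_md5(sample))               # base64 form, 24 characters
```

A 16-byte digest always encodes to 24 base64 characters, so if you see a 32-character value in this property, it was probably built from the hex string by mistake.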

string ContentType (r/w) – this is the MIME type of the content. Example: “image/jpeg”, “text/plain”, or “application/vnd.openxmlformats-officedocument.spreadsheetml.sheet”.

string ETag (r/o) – this is an identifier for a specific version of a resource. It is used for web cache validation, and allows a client to make conditional requests. ETags are also used for optimistic concurrency control. For example, if you read a blob and save its ETag, then do some other processing and come back to upload a new version of that blob, you can send the saved ETag along with the upload as a condition. If the blob’s current ETag still matches, the blob hasn’t changed since you read it and the upload goes through. If the match fails, the storage service will return a 412 error (precondition failed). For more information about managing concurrency in storage, check out this excellent article by the Microsoft Azure Storage team.
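Here is the optimistic concurrency idea as a toy Python model. None of this is the real service or SDK: FakeBlobStore, PreconditionFailed, and the “v1”, “v2” tag format are all invented for the sketch. It only mimics the rule that a conditional upload succeeds when the supplied ETag matches the current one and fails with a 412 otherwise.

```python
import itertools


class PreconditionFailed(Exception):
    """Stand-in for the service's 412 (precondition failed) response."""
    status = 412


class FakeBlobStore:
    """Toy in-memory blob whose ETag changes on every successful upload."""

    def __init__(self, content):
        self._versions = itertools.count(1)
        self.content = content
        self.etag = self._next_etag()

    def _next_etag(self):
        # Invented tag format; real ETags are opaque values.
        return '"v{0}"'.format(next(self._versions))

    def read(self):
        return self.content, self.etag

    def upload(self, content, if_match=None):
        # Conditional upload: reject if the caller's ETag is stale.
        if if_match is not None and if_match != self.etag:
            raise PreconditionFailed("412 Precondition Failed")
        self.content = content
        self.etag = self._next_etag()
        return self.etag


store = FakeBlobStore(b"original")
_, my_etag = store.read()                 # I read the blob and keep its ETag
store.upload(b"someone else's change")    # meanwhile, another writer updates it
try:
    store.upload(b"my change", if_match=my_etag)
except PreconditionFailed as error:
    print("upload rejected with status", error.status)
```

Because the other writer’s upload changed the ETag, my conditional upload is rejected and the other change is not silently overwritten. The usual next step is to re-read the blob, reapply the change, and try again with the new ETag.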

DateTimeOffset? LastModified (r/o) – this is the date/time when the blob was last updated.

long Length (r/o) – this is the size of the blob, in bytes.

You can put a lease on a blob so it cannot be modified without removing the lease. These three properties relate to that feature, which will be covered in a future article in this series. 

LeaseDuration LeaseDuration (r/o) – this tells you what kind of duration the lease has – Fixed, Infinite, or Unspecified

LeaseState LeaseState (r/o) – this is the state of the lease – Available, Breaking, Broken, Expired, Leased, or Unspecified

LeaseStatus LeaseStatus (r/o) – this is the current status of the lease – Locked, Unlocked, or Unspecified

Summary

In this article, I have solved one of the mysteries of the universe – “What is the difference between the CloudBlockBlob properties and the CloudBlockBlob Properties properties?” I also listed both sets of properties, and explained what they could be used for. I talked about user metadata, and how to read and update it. In my next article, I’m going to show you how to take snapshots of blobs without a smart phone and then examine the blob to see all of its snapshot history.