04 October 2016

Why Would I Ever Need to Partition My Big ‘Raw’ Data?

Whether you are running an RDBMS or a Big Data system, it is important to consider your data-partitioning strategy. As the volume of data grows, it becomes increasingly important to match the way you partition your data to the way it is queried, to allow 'pruning' optimisation. When you have huge imports of data to consider, it can get complicated. Bartosz explains how to get things right: not perfectly, but wisely.… Read more
16 September 2016

How to Start Big Data with Apache Spark

It is worth getting familiar with Apache Spark because it is a fast and general engine for large-scale data processing, and you can use your existing SQL skills to get going with analysis of the type and volume of semi-structured data that would be awkward for a relational database. With an IDE such as Databricks you can very quickly get hands-on experience with an interesting technology.… Read more
16 September 2016

Azure SQL Data Warehouse: Explaining the Architecture Through System Views

The architecture of Azure SQL Data Warehouse isn't easy to explain briefly, but if you have some useful queries that access the management and catalog views, and diagrams that show how they relate to one another, you can very quickly get a feel for what is going on under the hood. By using and extending these queries, you can check on a variety of waits, blocking, status, table distribution and data movement in ASDW.… Read more
08 September 2016

SQL Database: How to Configure Active Geo-Replication

Active Geo-Replication is powerful magic for ensuring the high availability of an Azure SQL database, and for disaster recovery. In choosing the best options, you need to accurately understand the value that the business places on the service you're running, how long it will take for a secondary replica to be in sync with the primary replica, the importance of spreading the location of replicas widely, and the maximum tolerable unscheduled downtime. Just clicking all the options could prove to be expensive.… Read more
06 May 2016

Taking Azure SQL Data Warehouse for a Test-Drive

Azure SQL Data Warehouse uses SQL to query the data, but there are some differences to SQL Server's dialect of TSQL, and queries don't necessarily work in the same way. DBAs are also required to use SSDT to access Azure SQL Data Warehouse. It is worth taking the time to try the preview of the product, and take it for a 'spin around the block', following Robert Sheldon's walk-through.… Read more
05 May 2016

Connecting to SQL Data Warehouse

The most frustrating thing with any new system is often just working out how to connect to it. Oddly, you can't use SSMS with SQL Data Warehouse, but it is fine with SSDT, SSIS, Power BI desktop, sqlcmd, BCP, and a range of Microsoft cloud services - there are PowerShell Cmdlets too. Rob Sheldon provides the details.… Read more
25 February 2016

In Search of the Cortana Analytics Suite

Cortana Analytics Suite is important and significant, but it is difficult to work out why or how from the existing 'information'. After more setbacks than Dr Livingstone, Bob Sheldon emerged triumphantly from the jungle of marketing hyperbole with a small diagram which explained it. Here he reveals the individual components, and finds them, in combination, to be a curiously interesting attempt to bring Big Data under control… Read more
24 February 2016

Azure SQL Data Warehouse

Azure SQL Data Warehouse is a fully-managed and scalable cloud service. It is still in preview, but solid. Not only is it compatible with several other Azure offerings, such as Machine Learning and Data Factory, but also with various existing SQL Server tools and Microsoft products. It talks Power BI. Are we now seeing the final piece of the Azure jigsaw fall into place?… Read more
22 December 2015

Microsoft Azure DocumentDB

DocumentDB is a late entrant in the document-oriented database field. However, it benefits from being designed from the start as a cloud service with a SQL-like language. It is intended for mobile and web applications. Its JSON document notation is compatible with the integrated JavaScript language that drives its multi-document transaction processing via stored procedures, triggers and UDFs.… Read more
20 November 2015

Azure Data Lakes

The Data Lake is basically a large repository of data for 'big data' analytic workloads, held in its original format. The Azure Data Lake adds Data Lake Analytics and Azure HDInsight. Although the tools are there for Big Data analysis, they will require new skills to use, and a heightened attention to data governance, if the Data Lake is to appeal to the average enterprise.… Read more
09 November 2015

The Logical Data Warehouse – Towards a Single View of All the Data

What is wrong with the Enterprise Data Warehouse? Quite a lot, it seems. If you take the narrow view that the struggle is that of accommodating and interrogating huge quantities of data, then initiatives such as the Virtual Data Warehouse and Logical Data Warehouse could make sense. But what about data quality, security, access control, archiving, retention, privacy and regulatory compliance?… Read more
02 June 2015

Microsoft Azure Stream Analytics

Azure Stream Analytics aims to extract knowledge structures from continuous ordered streams of data by real-time analysis. These streams might include computer network traffic, social network data, phone conversations, sensor readings, ATM transactions or web searches. It provides a ready-made solution to the business requirement to react very quickly to changes in data and handle large volumes of information. Robert Sheldon explains its significance.… Read more
21 April 2015

MongoDB vs. Azure DocumentDB

If you are familiar with MongoDB you may be wondering how Azure's DocumentDB service compares. In this article David Green gives us a look. Note that Azure DocumentDB is evolving very quickly, so some of the limitations mentioned in this article may have already been removed.… Read more
10 March 2015

The Internet of Things: A New World Order?

Was the marketing hook 'The Internet of Things' conjured up before the technical definition? Are we being persuaded to spend money on fending off yet another fantasy tsunami of data? Already, we have televisions that listen to, and report, your conversations; so are we facing the Science Fiction future of gadgets that report where you go, who you visit and what medications you take? As Robert Sheldon says, "It's big, almost too big to get your arms around."… Read more
03 February 2015

Cloud Storage Replication Is Not Backup

The options that you need to select when setting up an Azure Storage service account allow you to specify the durability and high availability of your data, but they don't provide for data recovery to a point in time. In fact, replication means that some of the bad things that can happen to data are efficiently replicated to all copies. Backup is quite a separate issue.… Read more
19 November 2014

Data as a Service: The Next "As a Service" Wave?

There was a time when data seemed part of the application that maintained and used it. Now, there is increasing demand to deliver data through platform-agnostic open-standard APIs so it can be consumed in a variety of ways, whether refined, aggregated, or combined with additional information. Are we heading towards a shared understanding of applications as data-providers, feeding other services such as BI, or even, in the right circumstances, publishing it?… Read more
