Click here to monitor SSC
  • Av rating:
  • Total votes: 25
  • Total comments: 1
Regular Columnists

Short Articles

27 April 2007

SQL Server Programming

How to read raw files in SSIS

1 May 2007
by Simon Sabin

SSIS has introduced the idea of Raw Files, which enables data to be saved very conveniently and effectively to disk. Great, until you want to read the files; then not so easy. Luckily, Simon Sabin is on hand to explain how one does that!

If you've ever used SQL Server 2005 Integration Services (SSIS), you will have come across raw files. Microsoft introduced them with SSIS as a mechanism for storing data on the local file system. The design of raw files enables data to be written very quickly but, unfortunately, to achieve this performance Microsoft has used a proprietary binary storage format. This means that you can't open these files and view their contents using a normal editor. What's more frustrating is that Microsoft has yet to supply a tool for reading these files.

To get around this limitation, I developed my own custom "Raw File Reader" tool, which this article briefly describes.

Why raw files?

One of the big benefits of working with raw files is the ease with which you can add them to your packages. The raw file destination requires you to specify only the name of file to use and the columns you want to store. Conversely, if you use a text file destination, then you have to build the structure of the file and ensure it makes your data flow. If you use an OLEDB destination, then you have to have a database and also a table created to store the data.

Their ease of configuration makes raw files ideal for storing errors in your data flows. In the past I've been guilty of configuring certain components of my packages to ignore failures, on the assumption that it "can't error". Of course, this isn't ideal and a much better approach is to capture all errors and handle them appropriately. This is where raw files are very useful. All you need to do is direct your error flows to a raw file destination, specify and filename and select the columns you want to store. If errors do occur, then they will be stored in the raw files.

So, you're storing any errors in a raw file, and now you want to review them – but you can't because of the proprietary format used. This is the exactly the sort of scenario for which my raw file reader is designed.

SQL Server Programming

Choosing between CLR and T-SQL stored procedures:
a simple benchmark

24 April 2007
Greg Larsen

In which Greg proves that a CLR routine will run faster than an OLE automation routine, but then wonders whether code execution is the only consideration when deciding whether to implement programming logic using a CLR or T-SQL

In my previous article (Building my First SQL Server 2005 CLR) I wrote about my experiences coding and implementing my first SQL Server 2005 CLR procedure. The idea was to:

  • Provide a "learn through my pain" tutorial for people considering adoption of CLR functionality
  • Demonstrate what I considered a realistic practical usage of CLR procedures, in replacing an unsupported extended stored procedure (SP) called xp_getfiledetails. I'd used this XP in SQL 2000 to allow me to obtain various bits of operating system information for a physical disk file.

A lot of interesting points came out of the subsequent discussion on this article. At the time, I felt that a potential drawback of my new CLR-based solution was that for the first time I would have objects in my database for which the source code was stored outside of the database. This misconception was corrected by Adam Machanic. Adam pointed out a couple of ways to get your source code stored in SQL Server, one of those being to use the ALTER ASSEMBLY T-SQL command, like so:

  ALTER ASSEMBLY xp_getfiledetails
  ADD FILE FROM 'c:\temp\xp_getfiledetails.cs'

After issuing the above command the source code for my CLR can be viewed by using the sys.assemble_files catalog view. With the CLR and source code stored in SQL Server I can move my database and the CLR and source code will move along with the database.

Additionally the waters were "muddied further" by comments from Phil Factor where he offered a T-SQL solution that accomplished exactly the same thing, thus removing the need for introducing CLR functionality in the first place – although, this alternative solution was supported using OLE Automation.

So, when faced with two apparently viable solutions for a problem – one CLR-based and one T-SQL-based – how does one choose which one to use? As a starting point, I decided to perform a simple performance benchmark.

DBA Deliberations

Troubleshooting with Dynamic Management Views

24 April 2007
Eric Brown

In which Eric describes how to use DMVs (Dynamic Management Views) to work out how and why a SQL Server has 'gone astray'.

If you work with SQL Server 2000, then you know how painful it is to triage a server that has "gone astray". In many cases, the error codes are undecipherable and the root cause of the problem is not easily found, let alone remedied. In fact, more times than not, a SQL 2000 DBA will simply do the unthinkable, and reboot the server.

During the development cycle for SQL Server 2005, the team at Microsoft received significant feedback – basically everyone said, "I need more visibility into what is happening inside the database." Ta-da! The Dynamic Management Views (DMVs) were born.

Many DMVs are scoped to the database level, which in itself is a big step up from SQL 2000, where all system views were at the server level. Furthermore, the data stored in DMVs is much more user-friendly than was the case for the system views, which were pretty much unusable for many people.

The core DMVs

SQL Server 2005 provides more than 80 new DMVs. Special views are provided for checking on .NET assemblies, SQL Service Broker, security, and much, much more. All DMVs include current data, and some, in addition, use hidden tables which store historical data. For example, the transaction related DMV, sys.dm_tran_top_version_generators contains historical data.

DMVs are organized into five general categories according to the area on which they report:

  • sys.dm_exec_* – provide information about execution of .NET CLR Modules and connections. All contained here are a number of views available to help you drill into issues related to execution of queries.
  • sys.dm_os_* – report on memory, locks, and execution scheduling.
  • sys.dm_trans_* – provide insight into transactions and isolation.
  • sys.dm_io_*. – monitoring disk I/O
  • Sys.dm_db_* – provide database-level data.
Rate this article:   Avg rating: from a total of 25 votes.





Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.

Subject: Sorry, but what is kimerikas? Jane.
Posted by: sweeta-pz,Sorry, but what is kimerikas? Jane. (not signed in)
Posted on: Thursday, March 27, 2008 at 3:15 PM
Message: Sorry, but what is kimerikas?


Simple-Talk Database Delivery

Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
Routine SQL DML Testing for the Unenthusiastic Tester

There are more exciting things in life than unit testing SQL Statements, checking the results, timings, and... Read more...

 View the blog

Top Rated

Jodie Beay and the Production Database Drift
 You make an example database, like NorthWind or WidgetDev in order to test out your deployment system... Read more...

SQL Server Data Tools (SSDT) and Database References
 SQL Server Data Tools (SSDT) provides, via the DacPac, interesting support for verifying not only... Read more...

Writing Build vNext tasks for Visual Studio Online
 Hosted TFS, now called Visual Studio Online (VSO), has a new way of writing build processes called... Read more...

Microsoft and Database Lifecycle Management (DLM): The DacPac
 The Data-Tier Application Package (DacPac), together with the Data-Tier Application Framework (DacFx),... Read more...

A Start with Automating Database Configuration Management
 For a number of reasons, it pays to have the up-to-date source of all the databases and servers that... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

Temporary Tables in SQL Server
 Temporary tables are used by every DB developer, but they're not likely to be too adventurous with... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.