SQL Server Programming
SSIS has introduced the idea of Raw Files, which enables data to be saved very conveniently and effectively to disk. Great, until you want to read the files; then not so easy. Luckily, Simon Sabin is on hand to explain how one does that!
If you’ve ever used SQL Server 2005 Integration Services (SSIS), you will have come across raw files. Microsoft introduced them with SSIS as a mechanism for storing data on the local file system. The design of raw files enables data to be written very quickly but, unfortunately, to achieve this performance Microsoft has used a proprietary binary storage format. This means that you can’t open these files and view their contents using a normal editor. What’s more frustrating is that Microsoft has yet to supply a tool for reading these files.
To get around this limitation, I developed my own custom “Raw File Reader” tool, which this article briefly describes.
Why raw files?
One of the big benefits of working with raw files is the ease with which you can add them to your packages. The raw file destination requires you to specify only the name of file to use and the columns you want to store. Conversely, if you use a text file destination, then you have to build the structure of the file and ensure it makes your data flow. If you use an OLEDB destination, then you have to have a database and also a table created to store the data.
Their ease of configuration makes raw files ideal for storing errors in your data flows. In the past I’ve been guilty of configuring certain components of my packages to ignore failures, on the assumption that it “can’t error”. Of course, this isn’t ideal and a much better approach is to capture all errors and handle them appropriately. This is where raw files are very useful. All you need to do is direct your error flows to a raw file destination, specify and filename and select the columns you want to store. If errors do occur, then they will be stored in the raw files.
So, you’re storing any errors in a raw file, and now you want to review them – but you can’t because of the proprietary format used. This is the exactly the sort of scenario for which my raw file reader is designed.
SQL Server Programming
In which Greg proves that a CLR routine will run faster than an OLE automation routine, but then wonders whether code execution is the only consideration when deciding whether to implement programming logic using a CLR or T-SQL
In my previous article (Building my First SQL Server 2005 CLR) I wrote about my experiences coding and implementing my first SQL Server 2005 CLR procedure. The idea was to:
- Provide a “learn through my pain” tutorial for people considering adoption of CLR functionality
- Demonstrate what I considered a realistic practical usage of CLR procedures, in replacing an unsupported extended stored procedure (SP) called xp_getfiledetails. I’d used this XP in SQL 2000 to allow me to obtain various bits of operating system information for a physical disk file.
A lot of interesting points came out of the subsequent discussion on this article. At the time, I felt that a potential drawback of my new CLR-based solution was that for the first time I would have objects in my database for which the source code was stored outside of the database. This misconception was corrected by Adam Machanic. Adam pointed out a couple of ways to get your source code stored in SQL Server, one of those being to use the ALTER ASSEMBLY T-SQL command, like so:
ALTER ASSEMBLY xp_getfiledetails
ADD FILE FROM 'c:\temp\xp_getfiledetails.cs'
After issuing the above command the source code for my CLR can be viewed by using the sys.assemble_files catalog view. With the CLR and source code stored in SQL Server I can move my database and the CLR and source code will move along with the database.
Additionally the waters were “muddied further” by comments from Phil Factor where he offered a T-SQL solution that accomplished exactly the same thing, thus removing the need for introducing CLR functionality in the first place – although, this alternative solution was supported using OLE Automation.
So, when faced with two apparently viable solutions for a problem – one CLR-based and one T-SQL-based – how does one choose which one to use? As a starting point, I decided to perform a simple performance benchmark.
In which Eric describes how to use DMVs (Dynamic Management Views) to work out how and why a SQL Server has ‘gone astray’.
If you work with SQL Server 2000, then you know how painful it is to triage a server that has “gone astray”. In many cases, the error codes are undecipherable and the root cause of the problem is not easily found, let alone remedied. In fact, more times than not, a SQL 2000 DBA will simply do the unthinkable, and reboot the server.
During the development cycle for SQL Server 2005, the team at Microsoft received significant feedback – basically everyone said, “I need more visibility into what is happening inside the database.” Ta-da! The Dynamic Management Views (DMVs) were born.
Many DMVs are scoped to the database level, which in itself is a big step up from SQL 2000, where all system views were at the server level. Furthermore, the data stored in DMVs is much more user-friendly than was the case for the system views, which were pretty much unusable for many people.
The core DMVs
SQL Server 2005 provides more than 80 new DMVs. Special views are provided for checking on .NET assemblies, SQL Service Broker, security, and much, much more. All DMVs include current data, and some, in addition, use hidden tables which store historical data. For example, the transaction related DMV, sys.dm_tran_top_version_generators contains historical data.
DMVs are organized into five general categories according to the area on which they report:
- sys.dm_exec_* – provide information about execution of .NET CLR Modules and connections. All contained here are a number of views available to help you drill into issues related to execution of queries.
- sys.dm_os_* – report on memory, locks, and execution scheduling.
- sys.dm_trans_* – provide insight into transactions and isolation.
- sys.dm_io_*. – monitoring disk I/O
- Sys.dm_db_* – provide database-level data.