In my previous article (Building my First SQL Server 2005 CLR) I wrote about my experiences coding and implementing my first SQL Server 2005 CLR procedure. The idea was to:
- Provide a "learn through my pain" tutorial for people considering adoption of CLR functionality
- Demonstrate what I considered a realistic practical usage of CLR procedures, in replacing an unsupported extended stored procedure (XP) called xp_getfiledetails. I'd used this XP in SQL 2000 to obtain various bits of operating system information for a physical disk file.
A lot of interesting points came out of the subsequent discussion on this article. At the time, I felt that a potential drawback of my new CLR-based solution was that for the first time I would have objects in my database for which the source code was stored outside of the database. This misconception was corrected by Adam Machanic. Adam pointed out a couple of ways to get your source code stored in SQL Server, one of those being to use the ALTER ASSEMBLY T-SQL command, like so:
ALTER ASSEMBLY xp_getfiledetails
ADD FILE FROM 'c:\temp\xp_getfiledetails.cs'
After issuing the above command, the source code for my CLR procedure can be viewed by querying the sys.assembly_files catalog view. With the CLR assembly and source code stored in SQL Server, I can move my database and the assembly and source code will move along with it.
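For example, a query along these lines (assuming the assembly is named xp_getfiledetails, as in the ALTER ASSEMBLY command above) should return the stored source:

```sql
-- List the files stored with the assembly, including any source
-- file added via ALTER ASSEMBLY ... ADD FILE. The content column
-- is varbinary(max), so cast it to view the source as text.
SELECT af.name,
       CAST(af.content AS VARCHAR(MAX)) AS source_code
FROM   sys.assembly_files af
JOIN   sys.assemblies a ON a.assembly_id = af.assembly_id
WHERE  a.name = 'xp_getfiledetails';
```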
Additionally, the waters were "muddied further" by comments from Phil Factor, who offered a T-SQL solution that accomplished exactly the same thing, thus removing the need to introduce CLR functionality in the first place – although this alternative solution was implemented using OLE Automation.
So, when faced with two apparently viable solutions for a problem – one CLR-based and one T-SQL-based – how does one choose which one to use? As a starting point, I decided to perform a simple performance benchmark.
Retrieving file information using a CLR procedure
I will not present again here my original xp_getfiledetails CLR procedure solution. However, the C# code is included in the code download for this article, and is described in full in my original article.
Retrieving file information using T-SQL and OLE Automation
The basis for my T-SQL code solution was provided by Phil Factor. I modified Phil's code slightly so that the T-SQL code performs and produces exactly the same results as my CLR-based solution. Both the CLR and the T-SQL procedures produce the same results as the deprecated xp_getfiledetails extended stored procedure provided with SQL Server 2000. If you want to follow along, grab the code for the tsql_getfiledetails_OLE stored procedure, and create it in your test database.
If you plan to execute the above stored procedure you will need to enable OLE Automation. To do that you can run the following code:
sp_configure 'show advanced options', 1;
RECONFIGURE;
sp_configure 'Ole Automation Procedures', 1;
RECONFIGURE;
Benchmarking the solutions
In my benchmark testing I ran these two processes on a standalone SQL Server 2005 machine. I recorded and compared the elapsed time for each process to return file information.
Here is the simple test harness I used to drive this performance testing:
SET NOCOUNT ON
DECLARE @x INT
DECLARE @starttime1 DATETIME
DECLARE @stoptime1 DATETIME
DECLARE @starttime2 DATETIME
DECLARE @stoptime2 DATETIME
CREATE TABLE #temp (
[Alternate Name] VARCHAR(100),
[Size] INT,
[Creation Date] CHAR(8),
[Creation Time] CHAR(6),
[Last Written Date] CHAR(8),
[Last Written Time] CHAR(6),
[Last Accessed Date] CHAR(8),
[Last Accessed Time] CHAR(6),
[Attributes] INT)
SET @x = 0
SET @starttime1 = GETDATE()
WHILE @x < 20
BEGIN
    SET @x = @x + 1
    INSERT INTO #temp
    EXEC master.dbo.tsql_getfiledetails_OLE 'C:\temp\longfilename.txt'
END
SET @stoptime1 = GETDATE()
SET @x = 0
SET @starttime2 = GETDATE()
WHILE @x < 20
BEGIN
    SET @x = @x + 1
    INSERT INTO #temp
    EXEC master.dbo.xp_getfiledetails 'C:\temp\longfilename.txt'
END
SET @stoptime2 = GETDATE()
SELECT DATEDIFF(ms, @starttime1, @stoptime1) AS ole_duration,
       DATEDIFF(ms, @starttime2, @stoptime2) AS clr_duration
DROP TABLE #temp
As you can see, this code has two WHILE loops. The first loop executes the tsql_getfiledetails_OLE T-SQL stored procedure twenty times, and the second loop does the same for the CLR stored procedure. In each case, we return the details of the same file (C:\temp\longfilename.txt).
I capture the start and end time for each WHILE loop and then execute a SELECT statement that calculates and displays the duration, in milliseconds, for each process.
When I ran this code on my standalone SQL Server, I got the following output:
Clearly, the CLR-based procedure is much faster at returning file information from the operating system than the T-SQL OLE Automation procedure. From this test we can conclude that OLE Automation carries considerably more overhead than the managed-code file APIs used by the C# implementation.
T-SQL or CLR?
At first glance it may seem obvious that you'd choose to implement the CLR-based procedure in your system, over the T-SQL procedure. However, is raw speed the only consideration?
On reflection, it probably isn't too surprising that the OLE T-SQL solution is slower. OLE automation is generally used for activities such as printing, sending email, saving stuff to files and so on – slow processes by definition. In our shop, the process that uses xp_getfiledetails only calls the routine once, and runs as a batch application. Therefore the extra overhead of using the T-SQL solution is really not an issue.
More important considerations in our case are ease of development and ease of maintenance, and from this standpoint we'd probably end up choosing the T-SQL solution regardless of the slower performance.
On the other hand, if you were building a real-time solution that traversed a directory tree and summed the size of each file in order to report the total size of the directory, then the CLR solution would probably be more appropriate.
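As a rough sketch of that scenario, assuming a table of file paths to process (the #files table and its contents here are hypothetical) and that the procedure returns a Size column, as xp_getfiledetails did in SQL 2000, the per-file calls might be driven like this:

```sql
-- Hypothetical example: sum the sizes of a set of files by
-- calling the file-details procedure once per file path.
CREATE TABLE #files (filepath VARCHAR(260))
INSERT INTO #files VALUES ('C:\temp\longfilename.txt')

CREATE TABLE #details (
    [Alternate Name] VARCHAR(100), [Size] INT,
    [Creation Date] CHAR(8), [Creation Time] CHAR(6),
    [Last Written Date] CHAR(8), [Last Written Time] CHAR(6),
    [Last Accessed Date] CHAR(8), [Last Accessed Time] CHAR(6),
    [Attributes] INT)

DECLARE @filepath VARCHAR(260)
DECLARE file_cursor CURSOR FOR SELECT filepath FROM #files
OPEN file_cursor
FETCH NEXT FROM file_cursor INTO @filepath
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #details
    EXEC master.dbo.xp_getfiledetails @filepath
    FETCH NEXT FROM file_cursor INTO @filepath
END
CLOSE file_cursor
DEALLOCATE file_cursor

-- Total size, in bytes, of all files processed
SELECT SUM([Size]) AS total_bytes FROM #details

DROP TABLE #files
DROP TABLE #details
```

With twenty calls per second or more in a loop like this, the per-call overhead difference measured in the benchmark would add up quickly, which is why raw speed matters in this kind of workload.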
From my benchmark testing, it is clear that the CLR method is much more efficient at returning operating system file information. If your goal is to optimize the performance of returning file information, then the CLR approach is the way to go.
However, while benchmark testing is an important element in deciding your approach, it is also clear that speed of code execution is not the only consideration when deciding whether to implement programming logic using the CLR or T-SQL. Most shops will probably find it easier to build, debug and maintain a T-SQL procedure than a CLR-based procedure. One needs to weigh all the benefits and drawbacks of any solution prior to implementation.
For additional information on how to weigh the strengths and weaknesses of CLR versus T-SQL, you might consider reading the following white paper from Microsoft: