|
|
Red Gate Support Engineer
-
Posted Thursday, March 27, 2008 6:03 PM |
If you had come of age before Component Object Model as I have, the concept of COM registration seemed rather odd at first glance. Why is your system limited to exactly one application extension which is forced to be backwards-compatible, when, for years, other operating systems have enjoyed the ability to have many versions of the same shared library and an established probing precedence to find the appropriate library file? This has landed us Windows users in deep doo-doo when we try to upgrade software that uses shared components (so-called DLL Hell).
Thankfully, the advent of Microsoft .NET had done a lot to solve this problem. Not only can the same dll exist in multiple locations, and a predictable probing behaviour has been established, but we can even override the assembly binding by creating an assembly config file. This feature can be leveraged for an easy way to 'patch' errant software, provided that the new assembly has the same public methods and objects available as the old one.
I had recently been in a situation where I could put this into practice. Our schema comparison keeps all of its' logic in a shared assembly: RedGate.SQLCompare.Engine.dll. This shared component had been updated to support a certain SQL Server 2005 feature that I won't name, suffice to say that it wouldn't require a change to the User Interface to support it. Because the UI and the Engine were of the same major version, it seemed a perfect time to exercise my assembly binding prowess. Here is the path to dll heaven when you need to upgrade a single dll in a piece of software:
Step 1: Identify the assembly name of the dll that you want to replace. The assembly name is normally the same as the file name without the extension, but this is not a given. Tools such as .NET Reflector will tell you the assembly name, and if you're really hard-core, you can code up something using Reflection: Assembly.LoadFile(@"c:\file.dll").FullName. When you configure the assemblyIdentity (below) you will need to know the assembly name.
Step 2: Copy the dll, and any dependent dlls, into a subfolder in the calling application's APPATH. This is very important because .NET will not allow you to bind to an assembly originating from a folder outside of the application's "home" folder, presumably for security reasons. I have created a folder called SQL Compare Engine 63 underneath the program's folder for this reason.
Step 3: Open Notepad and create a file named after the executable that is going to load your dll, appending ".config" to the end of the filename. In my case, the file will be called RedGate.SQLCompare.UI.exe.config. The contents of this file will be in XML, the bindingRedirect element is the magic that allows me to use the new dll in place of the old one.
<? xml version="1.0" encoding="utf-8"?> <configuration> <runtime> <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1"> <dependentAssembly> <assemblyIdentity name="RedGate.SQLCompare.Engine" publicKeyToken="7f465a1c156d4d57" culture="neutral" /> <bindingRedirect oldVersion="6.2.1.36" newVersion="6.3.0.123" /> <codeBase version="6.3.0.123" href="file://c:/Program files/red gate/SQL Compare 6/sql compare engine 63/RedGate.SQLCompare.Engine.dll" /> </dependentAssembly> </runtime> <startup> <supportedRuntime version="v2.0.50727"/> </startup> </configuration>
You may notice a few things about the contents of this file that I haven't mentioned. First of all, in addition to the elements that identify the assembly that I want to configure, I have added the codeBase directive. This will instruct .NET to look in the sql compare engine 63 folder that I had created to hold the new version of the dll because I feel a lot safer not overwriting the shipped SQL Compare dll. I have also added the supportedRuntime tag. This may be important if the application that you were patching was built against the .NET 1.x runtime and the replacement dll was built against 2.0. If the whole shebang loads into the 2.0 runtime, then there shouldn't really be any incompatibilities (with the requisite exceptions, of course!)
An additional complication is introduced if the replacement dll relies on updated dependant dlls. Just because we have redirected the RedGate.SQLCompare.Engine.dll doesn't mean that the runtime will probe the SQL Compare Engine 63 for dependencies. One solution could be to add a probing path inside the assemblyBinding element:
< probing privatePath="c:\program files\red gate\sql compare 6\sql compare Engine 63"/>
However, this doesn't seem to work for me and my guess is that it's because codeBase always overrides probing. To solve this problem, I got a list of all dependent assemblies from .NET Reflector for RedGate.SQLCompare.Engine and added the file locations for all dependencies using codeBase to make the complete config file:
<? xml version="1.0" encoding="utf-8"?> <configuration> <runtime> <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1"> <dependentAssembly> <assemblyIdentity name="RedGate.SQLCompare.Engine" publicKeyToken="7f465a1c156d4d57" culture="neutral" /> <bindingRedirect oldVersion="6.2.1.36" newVersion="6.3.0.123" /> <codeBase version="6.3.0.123" href="file://c:/Program files/red gate/SQL Compare 6/sql compare engine 63/RedGate.SQLCompare.Engine.dll" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="RedGate.SQL.Shared" publicKeyToken="7f465a1c156d4d57" culture="neutral" /> <codeBase version="6.2.1.30" href="file://c:/Program files/red gate/SQL Compare 6/sql compare engine 63/RedGate.SQL.Shared.dll" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="RedGate.SQLCompare.ASTParser" publicKeyToken="7f465a1c156d4d57" culture="neutral" /> <codeBase version="6.3.0.123" href="file://c:/Program files/red gate/SQL Compare 6/sql compare engine 63/RedGate.SQLCompare.ASTParser.dll" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="RedGate.SQLCompare.Rewriter" publicKeyToken="7f465a1c156d4d57" culture="neutral" /> <codeBase version="6.3.0.123" href="file://c:/Program files/red gate/SQL Compare 6/sql compare engine 63/RedGate.SQLCompare.Rewriter.dll" /> </dependentAssembly> </runtime> <startup> <supportedRuntime version="v2.0.50727"/> </startup> </configuration>
This seems to work a treat! The only drawback is that your new dll may be incompatible with the application because public methods may have been deprecated or some other undesirable behaviour.
|
-
Posted Wednesday, March 26, 2008 6:09 PM |
Ever since the dark ages of programming, certain tricks have been implemented in code compilers to eke out the maximum performance from the compiled code, the most important of which is inlining. Put simply, inlining is the process by which a child method is inserted into the parent method in its' entirety. For larger methods, this would make about as much sense as a chocolate teapot because the program would suffer from serious bloating if the same method had been invoked many times from different places. For smaller methods, however, this makes perfect sense and offers a huge performance boost, especially when methods whose compiled code is 32 bytes or less, because absolutely no space is wasted considering the method pointer occupies 32 or 64 bits anyway and performance is boosted because the CPU doesn't need to "jump" across to different memory locations as often.
The same optimizations are also available in the Microsoft .NET Runtime's Just-In-Time compiler, although this is done more or less on the fly.
With an eye towards performance analysis tools such as code profilers or coverage tools, this has some implications in the results because methods that have been inlined effectively cease to exist, so many of these tools disable inlining and other optimizations to save the end-user from confusing and nonsensical results. The disadvantage of this approach is that your code profiler doesn't exactly run your code in the same way as it would run in the "real world" due to the absence of JIT optimizations. Thankfully, ANTS Profiler allows you to profile inlined code using its fast mode profiling, and provides visual cues as to which methods had been inlined and which source code had been inlined into parent methods.
I've made a brief 5-minute presentation demonstrating the speed increase effected by inlining, and the way in which ANTS Profiler will show information related to inlined methods.
Demonstration of inlining using ANTS Profiler
|
-
Posted Wednesday, March 26, 2008 1:09 PM |
Many years ago, when switching from programming in plain old C to the managed environment of .NET Framework, I had discovered exceptions. The idea was not completely new to me because I'd already seen try/catch blocks in JavaScript and I liked that method of error handling a lot, especially when compared to the ON ERROR GOTO handling that VBScript uses, which is why I prefer using JScript whenever I can, although sometimes in the scripting environment you begrudgingly have to use VBS.
.NET had significantly enhanced the try/catch block by allowing the programmer to extend the exceptions by adding their own properties and methods to them. In addition, the exceptions can be typed, so if you are interested in one type of exception, say a file can't be opened, but not in another type of exception, for instance an out-of-memory condition, you can have that sort of granularity.
So I thought, great!, I will use exceptions everywhere. I will use them all over the place, not only to handle catastrophic and unusual errors, but also anywhere an object could not be created or a value exceeded a certain threshold or any one of 1001 completely common situations where something happened that needed some conditional branching to happen. This, as I found out, could have some particularly nasty performance implications!
Here is a simple example to demonstrate just how much slower throwing exceptions can make your .NET Program: using System; using System.Collections.Generic; using System.Collections; using System.Text; namespace ExceptionTest { class Program { static void Main(string[] args) { for (int i = 0; i < 1000000; i++) { HandleViaException(); HandleViaCondition(); } } static void HandleViaException() { string s = null; try{ string ss = s.Substring(0, 2); } catch (System.NullReferenceException) { } } static void HandleViaCondition() { string s = null; if (s != null) { string ss = s.Substring(0, 2); } } }
In the HandleViaException method, I attempt to execute a method on a string object set to a null value, which will cause a NullReferenceException and make the exception handling code run. In the HandleViaCondition method, I simply check the value of the string, and if it is null, I do not run the Substring method on the string. Although these methods perform the same function, there should be a noticable performance difference when the methods are both run a million times. I had tested this using the ANTS Profiler code profiling tool with the following results:
HandleViaCondition -- Hit Count: 1000000 Total Time: 0.571 seconds HandleViaException -- HitCount: 1000000 Total Time: 50.4 seconds
Using Exceptions to trap a null condition is roughly a hundred times slower than simply checking to see if the string is null. I knew that using exceptions would incur a performance penalty but mama mia that is slow! I've taken a program that should return in less than a second and turned it into an excuse to hang out at the water cooler and gossip for awhile.
Could I make this any worse? Oh, yes I can, by attaching a debugger (cdb.exe) to the program as well!
HandleViaCondition -- Hit Count: 1000000 Total Time: 0.554 seconds HandleViaException -- HitCount: 1000000 Total Time: 54.3 seconds
Well, that's not too much worse, but then again, cdb is pretty lightweight. Let's attach to it using Visual Studio 2005's debugger:
HandleViaCondition -- Hit Count: 1000000 Total Time: 0.678 seconds HandleViaException -- HitCount: 1000000 Total Time: 1936 seconds
See, now I have a convenient excuse to go down to the cantine and get a donut. Mmmmmm, donuts.
The Visual Studio debugger is particularly invasive when it encounters an exception in the code that you're debugging. You expect a debugger to pause your code when an exception is encountered, grab information about the stack and heap, and allow your code to continue on. With CDB.exe, the overhead is pretty minimal, but Visual Studio 2005 seems to pause my running code for much longer.
The end result is that if this code was part of a real-world application, I would probably spend all day trying to debug it, and I frankly have better things to do, like eat bacon sandwiches. On toast. With some of that nice Brown Sauce they have over here.
From now on, I use exceptions in my code very sparingly, and try to avoid using them in code loops altogether because the cumulative effect of wasting a few milliseconds in a tight code loop can turn an application into sludge if you're not careful!
|
-
Posted Tuesday, November 13, 2007 5:29 PM |
Microsoft .NET's runtime provides an execution engine for Just-In-Time compiled code, but it also has the clandestine capability to pre-compile code and cache it on disk. This at first seems a little odd, since the point of environments like .NET Framework and Java are supposedly designed to offer machine-independent, 'virtual' code. I suppose that .NET's native image support was introduced to solve some performance pitfalls of using Intermediate Language code, which needs to be compiled dynamically as it runs. Native Images would not suffer from this performance loss, since the compilation has been done in advance.
Native images can become invalid, however, and cause some very strange errors. In one case, one of our programs was crashing at random points in the usage, and in most cases, the program would actually fail to even start, throwing a scary-looking invalid program exception and offering the chance to debug, which presents some users with and obtuse error message and if a debugger is installed, a yellow arrow pointing at a machine instruction that nobody who has started programming a computer after 1987 really understands.
Theoretically, of course, invalid native images should not occur. But if we understand how these native images are created, it's possible to see some holes in Microsoft's design.
For starters, native images are typically created when a program is installed, and the process appears to be automatic -- Windows Installer knows that .NET assemblies are contained in an MSI by some hocus-pocus and ngen.exe is invoked to create a cache of native images for all assemblies in the MSI.
If the native images are never modified, there is a possibility that they could become invalid, if, say the .NET Framework environment had changed. If the Framework is patched or re-installed, there could be outdated reference in the native image that would cause the program to crash where it would not if it were properly loaded and JITted in near-real time. Microsoft have thought about this and designed the Framework to examine the .NET Runtime for changes that would cause a problem. If, for instance, mscorwks.dll, mscoree.dll, or other runtime libraries have changed, this will force a new native image generation when a cached assembly tries to load. Likewise, any changes to dependent assemblies, even if they have cached native images, will force a 'cascading' recompilation of native images for all dependent assemblies.
It sounds like Microsoft have thought of everything, but, alas, no, there are situations where an invalid native image is sitting on the hard disk, waiting to strike. Microsoft know that when they apply changes to the .NET Framework, such as the updates that may be distributed by Windows Update, that the runtime's native image cache needs to be updated. So these updates run NGEN.exe /update to refresh the native images so that they are no longer invalid. Sounds all well and good -- but there are some situations where multiple automatic updates have resulted in invalid native images that the runtime thinks are still valid, probably because the system really requires a reboot after running the first .NET update, but Automatic Updates silently suppresses it and applies a second .NET update.
If you come across strange behaviour in a managed .NET program, it may be useful to rule out the native image as the cause before accusing your program's vendor of releasing buggy code! It is possible to determine if the program you are running is the JITted version, or a native image, by using our old friend, the Fusion Log Viewer. Setting this tool up as described in the previous link will create a log entry any time a .NET assembly is loaded by your program. Because Fusion Log Viewer can discriminate between bindings to native assemblies and ones that are purely Intermediate Language, it's plain to see whether the assemblies are being loaded from cache or compiled dynamically by mscorjit.dll.
If the errant program is loading lots of native-image assemblies, running %systemroot%\Microsoft.net\framework\v2.0.50727\NGEN.exe /update may just put your program, and possibly many others, back in working order.
|
-
Posted Sunday, October 14, 2007 10:48 AM |
I was watching a medical drama the other day, you probably know the one, where the diagnostician darn near kills the patient with bizarre tests in trying to figure out his illness, given a strangely conflicting set of symptoms. I've come across some similar situations with my inorganic patients recently which forced me to step back and look at the whole patient rather than the symptoms presented.
The code profiling tool that I provide support for basically works by loading one of our DLLs into the process being profiled, and there is a possibility for the process to crash or hang with our DLL hooked in when it runs perfectly fine without it. Previously, my focus was strictly on finding bugs in our DLL, but I have come to believe that this is a pretty narrow-minded approach.
Everybody over age five has heard of the proverbial camel with the lumbar malady upon which a straw too many had been loaded, causing the camel to exhibit a lack of structural integrity. Well, that's how I would word it in my bug report because I'm a stickler for the specifics.
The metaphor is a good one, though. A programmer will notice poor performance or memory usage in his application and try to find the problem using our code profiler to gather information about slow methods or poor memory allocation, so a 'sick' application is going to have to carry the additional strain of loading our dll and all of its' requirements of memory and processing time along with it. It doesn't take a master engineer to realize that this might fail for more basic reasons than a bug in our code.
This is similar to a medical case in the respect that the patient needs to be fit enough to survive a procedure that may cure the underlying problem. If a patient needs an angioplasty and presents to surgery with nervous problems, the nervous system would need to be stabilized first so he isn't flopping around on the operating table like a fish out of water while surgeons try to fix a delicate artery wall.
I'd like to say that our code profiler is the tool to help cure to all performance problems, but sometimes other diagnostic tools need to be employed to point at the serious problems first so that the code profiler can be used to find inefficient code.
In one instance, a case came to my attention where a program appeared to hang while being profiled. Sure, profiling introduces some overhead, but this was extreme (the spash screen stayed up for an hour!). I'd suggested using performance counters, specifically the .NET CLR Memory->% Time spent in GC while profiling, and found that the application was spending 90% of its' processing time just cleaning up objects in its' own memory! Without profiling, this went down to a 'respectable' 80%. This application was in too bad a shape to be profiled because it had so many objects containing self-references. Simply adding more data to the application could reproduce the hang even without profiler, so this patient was clearly running out of time.
In rare cases, it's time to wield computing's least understood tool: the debugger. But first we will need a crash or hang dump from the customer produced using adplus.vbs from the Windows Debugging Tools suite. This produces a big, fat file containing the memory of the application, including the .NET runtime, so it's seldom to have a customer who is willing to upload a 50MB file to us. I can use windbg to analyze what happened at the time of failure. It's a dark art, but can reveal interesting information about the application. This MSDN blog is probably the most useful resource I've found on how to use the Windows Debugger.
Another common remark from customers is that although the code profiler shows minimal memory usage by their applications, the total space as reported by Task Manager is monumentally huge. In this case, you may not need to worry about optimizing your code, which is what a code profiler helps to do, but rather you want to examine the process memory space and have a look at how many modules are loaded into it. Then it's a question of perhaps ridding yourself of unused references or using alternative components in your code that are smaller. A simple, free tool for doing this is Red Gate Memory Tracker, which will give you a graphical representation of what is loaded into memory. Cleaning up this memory first may give the code profiler a bit more 'elbow room' if the problem is a shortage of memory.
Once the critical problems in an application are sorted out, then the task of enhancing the application performance is much easier!
|
-
Posted Saturday, October 06, 2007 11:00 PM |
I should really count my blessings. Technical Support, to some companies, means a bank of Rhesus monkeys picking up telephones and telling customers to 'turn it off and then on again'. Previous to this prevalent statement, 'is it plugged in?' was de rigeur until customers found this too insulting to their intelligence -- particularly the ones who did actually forget to plug it in. The point is that a lot of technical support managers live and die by the stats: ring time, hold time, and wrap-up.
It's almost conventional wisdom down in the trenches that this type of management is self-defeating and results in poor service. For instance, if you penalize staff for breaking your arbitrary ten-minute wrap-time, the support representatives are going to dole out bogus information and the customer will inevitably have to call back! But that's just fine with management, because there will be one more call to add to their stats that conforms to the ten-minute wrap-time, where inevitably a second support representative will 'accidentally' cut off the customer at 09:59.0001.
A good customer support department has three main goals: solve problems, prevent problems, and maintain a good relationship between the customer and the company. Ideally, you could call the company and an omnicient semi-deity in the computer field will pick up in one ring, be able to provide the required answer, thank you so much for being a good sport, and sing you a pacifying lullabye to make you feel completely at ease.
In the real world, though, this is not entirely possible. At a minimum, the goal of tech support is to provide you with the solution to a problem, enough background information to prevent knock-on effects of the proposed solution, and steps to prevent the problem from re-occurring. Now if that ain't good service I don't know what is!
At my place of work, we take customer care to this level; the average work time on an issue is 45 minutes. Clearly, this has some scalability concerns, so preventing issues from occurring is paramount. Two ways to accomplish this are effective feedback mechanisms to the developers so outstanding issues can be fixed and documenting known issues.
Once the customer has run into an issue, I feel the company has already let them down, and our job is merely to recover some trust and let them know that they're being taken seriously. As soon as new issues are discovered, they are duly reported to the development team, along with a complete description, and where possible, a reproduction of the bug to make it easier (and quicker!) for them to find the root cause. The end effect is that the software will be modified to correct the problem, or the help file updated, or the wording of an error message changed. Et ouala! A few less phone calls to answer.
Next, any open issues are documented in a knowledge base. This allows customers to locate the solution themselves on our website's search facility, but also allows us to quickly respond to email queries where the KB article fits the customer problem. (It has also has the effect of reduced cases of Repetative Strain Injury in Technical Services by not making them type the same paragraphs over and over again!)
There is a temptation for support representatives to gain knowledge and keep it to themselves so they will be 'important' to the company. The cure is for management to keep an eye on how much KB material individuals are producing and keep that statistic handy for performance reviews!
Finally, a one-on-one response is still the preferred communication model for technical support. Even in the age of instant messaging, chat, forums, and self-help, one-on-one synchronous communication methods still reign. According to the Service and Support Professionals Association, a whopping 54% of technical support is done via email. Slightly less by phone, and impersonal means such as forums and self-service account for only a tiny percentage. Clearly, people want to 'talk to someone' and this is unlikely to change in the near future. I feel that the ideal synergy between the customer and the service provider is that of a partnership. The customer bought our software to solve a problem. Our job is to work with them to make sure that the problem is solved, and does not become an even more formidable problem.
I suspect that supporting utility software may not be the same as supporting an Internet Service Provider or a manufacturer of novelty timepieces, but I hope that I've impressed the importance of taking the time and expending the effort to provide excellent customer service -- and that extends beyond trivia such as how long someone spends on a telephone call!
|
-
Posted Sunday, September 30, 2007 6:40 PM |
Sometimes SQL Server can pose conundrums of such complexity that even the Riddler couldn't conceive of. In my mind's eye, a SQL Server is in fact its' own operating system, running on another operating system. For instance, SQL Server has a task scheduler to manage threads running within the SQL Server Process, and this User-Mode Scheduler marches to a different drum than the one in Windows, the most significant being that the multitasking is done cooperatively. Most of the time, anyway. SQL Server also does memory management of a sort, and a fixed memory area for running third-party extensions (extended stored procedures). It has processes for committing I/O in the form of the lazy writer, periodically flushing SQL pages from memory onto disk. This makes SQL a bit hard to troubleshoot in that you don't know whether to look for configuration problems in SQL or a misbehaving installation of Windows.
Riddle me this, Batpersons: when does a disk with 100 gigabytes of free space tell you it's full?
When the context of the question involves a failing SQL Server backup job, I'd like to contort myself into a shipping crate, have someone address it to Brazil, cover it with postage stamps, and stuff it into the nearest mailbox. Once the initial panic passes, though, I can grab a chair in front of the bat-computer and get to work.
To rule Windows out as the culprit, we can copy a file of a similar size to the disk, perhaps by grabbing a previous backup of the database and copying it to the target disk. I can think offhand of a few situations where this would fail:
The disk may be formatted as FAT32, imposing a 4GB limit on the size of file. Not very likely these days. Second, there could be a disk quota exceeded for the user copying the file. Even if the copy succeeds, you may want to open a console and use RunAs /username:x (where x is the user who normally runs the backup) and copy the file from the command prompt.
These are relatively easy to fix. Then, there are the weird problems that require some deeper investigation. Locked files are one possibility, and we could use OH.EXE (open handles, from the Windows resource kit) to see what process is locking the backup file. The problem here is that we need to be sitting in front of the box as the job fails: as backup problems tend to be sporadic and happen when we are in bed counting sheep, so this isn't always practical.
In one case, a remote backup to a share failed because the Windows Remote File Cache got corrupted somehow, which is impossible to diagnose but straightforward to repair (read REINSTALL WINDOWS!).
Third-party SQL Server backup solutions (Veritas, LiteSpeed, Legato, SQL Backup) that rely on Virtual Device Interface (VDI) introduce potential memory allocation problems that masquerade as disk I/O problems. If the backup tool cannot allocate the requested buffer space from VDI, then SQL Error 112 (out of disk space) errors occur. SQL Server reserves 128MB of memory for these buffers, which is shared with extended stored procedures and COM objects created using sp_OACreate. If extended stored procedures are leaking memory or COM objects are not freed with sp_OADestroy, this memory will continue to shrink. Running
DBCC MEMORYSTATUS
on the server may be revealing in this case.
Memory fragmentation is also a potential issue if DBCC still reports oodles of free memory. If the VDI-based backup job succeeds after a SQL Server restart, then one of these issues could be the smoking gun. An improperly configured SQL Server OS could also be to blame. Certain startup options for Windows and SQL Server may be worth looking into depending on the size of physical memory:
- If you have 3-4 GB memory, include the /3GB switch in the Windows startup. - If you have 4-8GB memory, use /3GB and /PAE - If you have 16 GB or more, use only /PAE, as /3GB will cripple memory over 8GB.
-When you have set /PAE, go into SQL Server's configuration and set the option to use AWE to ON. If you do this, however, you need to also specify a maximum memory value in SQL Server, if you do not, then SQL Server will take all but 128MB of the computer's memory if the automatic memory management is used in SQL Server.
-The user who runs the SQL Server needs to have the 'lock pages in memory' user right in the local security policy, or it will have problems allocating in the large memory.
-There is a -g startup option in SQL Server to control how much memory SQL Server will leave free for extended stored procedure code. This is important to do when you have more than 500 databases on a server: each database that's online will use 64Kb of the free memory. Microsoft recommends:
500 databases: -g288 1000 databases -g372
After setting -g, you need to restart SQL Server. Note that AWE is not available on all editions of SQL Server. If you have hundreds of databases on a SQL Server Standard, think about upgrading to Enterprise or get another instance of SQL to hold the databases.
If the cause of failure cannot be determined, most third-party backup programs (as well as SQL Server) include MaxTransferSize and BufferCount parameters to make it possible to control the size of the buffers manually.
Hopefully, this information will help sort out various native SQL Server and third-party database backup problems caused by resource issues, and ensure a successful backup tomorrow, same bat-time, same bat-server.
|
-
Posted Thursday, September 20, 2007 10:11 AM |
I work for a company that sells software. And SOME people, because of their untrusing nature, need to be convinced that the software that we want to sell them actually works. This process, in some large, bureaucratic institutions, involves salesmen and conference calls and schmoozing and all of those things, that I, as a technical guy, like to leave to the more sociable ranks in the company.
Part of the process of demonstrating that the software that we produce meets certain functional criteria may involve webcasts or meetings where we show the product, in all it's glory, working flawlessly. This means having a server always available for this purpose and in tip-top working order.
The fly in the ointment is this: how to allow non-technical people to run software on this machine and potentially modify (a polite way of saying BREAK) the SQL Server, its' databases, and operating environment, and then later in the day allow someone else to run the same demonstartions on a server that's restored to its' original environment. I know what you're thinking and this was my first though as well: virtual server! You could create a virtual server and roll back the entire environment to an earlier state without a lot of trouble.
The virtual server idea was shot dead like a wounded racehorse because we need to demonstrate the increased performance gains of our SQL Backup software: we needed a high-performance environment with the quickest disks, more RAM than a virtual environment can provide, and ultra-quick GUI response time. What we needed was SQL Server with a rewind button so any atrocities committed against the server environment could be rolled back at will, or better yet on a schedule so I don't have to tinker around with the server ever again.
Ideally, I thought, leveraging backup technology would work. Assuming that all of the SQL Server configuration is kept in the system databases, it would 'simply' be a matter of taking a one-time, incorruptable backup of these databases and restoring it back to the platform.
So I backed up all of the databases on the demo server and saved the backups to an out of the way path (NOT the default SQL Server backup path!!!). The backup file names for all demo databases also correspond to the database names, making it easier to automate the restore job.
But since you need to restart SQL multiple times and restore system databases, a SQL Script is impractical for this task. I need to work outside the SQL Server and for that reason I implemented rewind as a VBScript. The first thing a coder needs is a plan, so I wrote out my strategy as pseudo-code:
- If SQL Server is started, stop it and all dependent services
- Start SQL Server in single-user mode
- Restore the master database
- Shut down the server and bring it up in multi-user mode
- Restore model and msdb
- Restore all user databases (the list of databases is in our newly-restored master database)
- Start SQL Server and all dependent services
- Delete all backup files from the default backup location to prevent demonstrators from filling up the disk with SQL Backup demo data
This script has been running as a Windows Scheduled Task at midnight for about six months, and has saved a lot of time maintaining a SQL Server 2005 demonstration environment for sales and marketing concerns. There is only one basic flaw and that is that if the SQL Server software is updated, for instance service packs and Microsoft hotfixes, the system databases need to be backed up again, or the system database restores will fail. 'recoversql.vbs
'given a sql server and its' backups, restore the server and all dbs.
'Assumptions:
' 1. Backups are taken from the same server (the logical and psysical filenames match)
' 2. Backups are named after the database that they came from (ie MASTER is restored from master.bak)
'
' NOTE WELL! Part of this script will delete files from the default backup folder!
' This is a bit outside the realm of recovering sql, so remove these bits
' if you want to re-use this script for something else!
'
'------------- Things you can set -------------------
Dim strBackupFolder 'the folder with the master backups in it w/trailing backslash
strBackupFolder="d:\demo databases\"
Dim strServerCommand 'Path to sqlservr command for single user startup
strServerCommand="""C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\sqlservr"""
Dim strInstance 'The name of the SQL Server instance, MSSQLSERVER for the default
strInstance="MSSQLSERVER"
Dim strDefaultBackupFolder 'The name of the folder where backups go by default (trailing backslash)
strDefaultBackupFolder="d:\MSSQL.1\MSSQL\Backup\" 'we will delete all of these as part of maint.
' ------------------ GLOBALS -------------------------
Dim obShell
Set obShell=WScript.CreateObject("WScript.Shell")
Dim obFSO
Set obFSO=WScript.CreateObject("Scripting.FileSystemObject")
Dim iResult
Dim serverName
serverName=obShell.ExpandEnvironmentStrings("%COMPUTERNAME%")
if strInstance="MSSQLSERVER" Then strInstance=serverName
WScript.Echo("Recovering " &strInstance &" using backups from " &strBackupFolder)
'If SQL Server running, stop it
StopSqlServer strInstance
'Start SQL Server (single-user)
StartSqlServer strInstance, True
'restore system databases
RestoreSystemDatabases strInstance
'read database list from master, restore each database (exc. tempdb)
RestoreAllDatabases strInstance
'Start SQL Server (multi-user)
StartSqlServer strInstance, False
'Start SQL Server support services
obShell.Run "NET START SQLSERVERAGENT", 1, True
obShell.Run "NET START MSSQLServerOLAPService", 1, True
obShell.Run "NET START SQLBROWSER", 1, True
obShell.Run "NET START MSDTSSRVR", 1, True
obShell.Run "NET START MSFTESQL", 1, True
'Delete old backups
WScript.Echo("Deleting backup files from " &strDefaultBackupFolder)
obShell.Run "cmd /c ""del /q /f /s "&strDefaultBackupFolder &"*.*""",1,True
'-- Functions --
'-----------------------------------RestoreAllDatabases----------------------------------
'Restore all databases but master, model, msdb
Function RestoreAllDatabases(InstanceName)
Dim stdin, strBackupType, obDatabasesRS, obMasterConnection, strRestoreCommand, DatabaseName, iResultCode
'Get the list of databases
Set obMasterConnection=WScript.CreateObject("ADODB.Connection")
obMasterConnection.CommandTimeout=9000 'command timeout 15 minutes, hopefully enough?
obMasterConnection.Open="Provider=SQLOLEDB;Data Source="&InstanceName&_";Initial Catalog=master;Integrated Security=SSPI"
Set obDatabasesRS=WScript.CreateObject("ADODB.Recordset")
Set obDatabasesRS.ActiveConnection=obMasterConnection
obDatabasesRS.Open "Select [name] FROM sysdatabases WHERE [name] NOT IN ('master','tempdb', 'msdb', 'model')"
obDatabasesRS.MoveFirst
Do Until obDatabasesRS.EOF
DatabaseName=obDatabasesRS.Fields(0)
obMasterConnection.Execute "RESTORE DATABASE ["&DatabaseName&"] FROM DISK='"&
_strBackupFolder &DatabaseName&".bak' WITH REPLACE"
WScript.Echo("Restored " &DatabaseName &" from " &strBackupFolder &DatabaseName &".bak")
obDatabasesRS.MoveNext
Loop 'databases
obDatabasesRS.Close
obMasterConnection.Close
Set obDatabasesRS=Nothing
Set obMasterConnection=Nothing
End Function
'End Restore database
'-------------------------------- RestoreSystemDatabases --------------------------------
' Ask for location of last good master, model, and msdb SQL Backups
' Restore these and restart the server in multi-user mode
Function RestoreSystemDatabases(InstanceName)
'Attempt restore of master (exit script on fail)
RestoreSystemDatabase "master", InstanceName, strBackupFolder &"master.bak", True
WScript.Echo("Restored Master from " &strBackupFolder &"master.bak")
'Start the SQL Server; restoring MASTER will stop the server.
StartSqlServer InstanceName, False
'Attempt restore of msdb (exit script on fail)
RestoreSystemDatabase "msdb", InstanceName, strBackupFolder &"msdb.bak", True
WScript.Echo("Restored MSDB from " &strBackupFolder &"msdb.bak")
'Attempt restore of model (allow fail?)
RestoreSystemDatabase "model", InstanceName, strBackupFolder &"model.bak", False
WScript.Echo("Restored MODEL from " &strBackupFolder &"model.bak")
END FUNCTION
'------------------------------------ End restoresystemdatabases-----------------------------
'-------------------------------------RestoreSystemDatabase-------------------------------
'Support function for RestoreSystemDatabases
Function RestoreSystemDatabase(DatabaseName, InstanceName, BackupFileName, ExitOnFail)
Dim retCode, stdin, obDBConn
Dim strBackupCommand
strBackupCommand="RESTORE DATABASE ["&DatabaseName&"] FROM DISK='"&BackupFileName&"' WITH REPLACE"
Set obDBConn=WScript.CreateObject("ADODB.Connection")
obDBConn.Open="Provider=SQLOLEDB;Data Source="&InstanceName&";Integrated Security=SSPI"
obDBConn.Execute strBackupCommand
obDBConn.Close
SET obDBConn=Nothing
End Function
'----------------------------- StartSqlServer --------------------------------
' Modify the registry to start sql server in single-user if needed.
' Net Start the SQL Server Service
Function StartSqlServer(InstanceName, SingleUser)
Dim retVal
IF InstanceName=serverName Then
If Not SingleUser Then
retVal=obShell.Run("NET START MSSQLSERVER", 1, True)
End If
If SingleUser Then
obShell.Run strServerCommand &" -m", 1, False
WScript.Sleep(20000) 'Wait for server to start
End If
ELSE: retVal=obShell.Run("NET START MSSQL$"&InstanceName, 1, True)
End If
If retVal > 2 Then
WSCript.Echo("Could not start SQL Server """&InstanceName&
_""". Command Returned "&retVal)
WScript.Quit(1)
End If
End Function
'End modify startup parameters
'----------------------------- StopSQLServer ---------------------------
'Run NET STOP for an instance of SQL Server.
Function StopSQLServer(InstanceName)
obShell.Run "NET STOP SQLSERVERAGENT", 1, True
obShell.Run "NET STOP MSSQLServerOLAPService", 1, True
obShell.Run "NET STOP SQLBROWSER", 1, True
obShell.Run "NET STOP MSDTSSRVR", 1, True
obShell.Run "NET STOP MSFTESQL", 1, True
IF InstanceName=serverName Then
obShell.Run "NET STOP MSSQLSERVER", 1, True
ELSE: obShell.Run "NET STOP MSSQL$"&InstanceName, 1, True
End If
End Function
'-----------------------------End StopSqlServer --------------------------------------
|
-
Posted Saturday, September 15, 2007 11:31 AM |
Some months ago, I took my car in for a safety inspection, and the garage found a small hole on the inside of the wheel-well. In order for the car to pass inspection, they said, that would need to be welded shut.
Great, I said, can we do that now? No, they said. They do not do bodywork. My eye was drawn to the corner of the garage, where I could quite distinctly make out a bottle of acetylene and a welder's mask. Couldn't you make an exception, just this once? I'm a busy fellow and I'd rather not make an appointment at a body shop and come back here for re-inspection. Nope. Company policy and all that... sorry... and so forth.
Even though the equipment was there, and the expertise was there, I was inconvenienced by an arbitrary policy that says the garage does not perform body work on cars, even though the hole was somewhere that nobody could even see. I was certainly surprised it was there!
Unsurprisingly, the same paradigm exists in software support, only users who are frustrated to their wit's end by a software failure refuse to be turned away and sent to Microsoft Product Support Services if the cause of failure is not determined to be our little piece of software!
The complexity of the application doesn't matter. You could, for example, produce a program that prints 'Hello World' to the console. The skill level of the developer writing the application can be minimal or even non-existing, but the support engineer's required skill level is still astronomical! What if you run helloworld.exe across a network share? What if the user wants to redirect the output to a file? What if the domain administrator has set a policy that prevents the user from opening a command prompt?
Luckily, all software vendors have support departments like ours who can help us find the flaw in the underlying platform our software uses. The downside is that some of these vendors aren't as easy-going as we are and our enquiries may cost additional time or money. They may require us to buy a support contract or support us for free after some administrative work to set us up with access to their resources. This makes it all too easy for us to say 'Look, this is not our problem, contact PSS!'. Then the user is standing in the middle of the garage, much like I was, wondering why the welder can't patch my perferated car body and save me so much time.
For the last few months, Red Gate has been intently looking at improving our knowledge base so that the information is more timely, more appropriate, and easier to locate. I had noticed that there are quite a few KB articles that are drawn from other sources such as SQL Server Books Online and Windows product documentation. We distill this information and make it relevant to a particular product. Companies like Microsoft publish a wealth of really good, freely available resources like Technet, Books Online and insider information from developer blogs, so kudos to them! If only some of the third-party vendors we use would be more forthcoming with their knowledge...
For our part, we do not place any restrictions on the product knowledge that we collect, and if we do encounter a problem that is caused by a third-party, we do our best to solve it for you in the context of our own product. This level of obligation is not easy or cheap, and reflects a high level of commitment to customer satisfaction.
|
-
Posted Saturday, September 08, 2007 8:25 PM |
If you are ever tasked to figure out what's happening inside an application, one of the essential bits of information is the Program Database (.PDB) file. The PDB contains symbols, that is to say that it contains a list of methods, properties, and variables in an application and lists exactly where in the application these objects appear in the compiled code. Once an application has been compiled into binary or Intermediate Language, all of the 'friendly' method and variable names are replaced with machine-friendly numbers. The PDB is a cross-reference between the human-friendly names and the machine-friendly addresses as well as a reference as to where these names appear in the source code file.
In addition to debugging, a PDB is also useful for code analysis tasks such as code coverage, performance profiling, and memory usage. ANTS Profiler is a good example of an analysis application that benefits from the knowledge in the PDB file, so it's important to know how the software uses the PDB and what can happen when the PDB is missing or incorrect.
In ANTS Profiler, the PDB is used for two main functions: to find the source code for the application being profiled and to filter out methods that you're not interested in or are powerless to fix, such as the assemblies distributed by Microsoft as part of the .Net Framework or third-party assemblies.
Quite a few ANTS Profiler incidents come through support because of a missing or damaged PDB, to the point where I almost pick up the phone and answer 'Red Gate Software, this is Brian, have you got a PDB?'
The most common solution is simply to ensure that the PDB is in the same folder as the assembly you're profiling. If your program loads dll assemblies from different folder locations, then the PDB relating to that dll should be in the dll's folder. This gets more complicated if the assembly loads from the Global Assembly Cache at %SYSTEMROOT%\Assembly because of the shfusion extension that changes the normal Explorer view of this folder to a special view of the assemblies. The GAC is not a physically single folder, but a complex web of subfolders that make it possible to register multiple versions of the same assembly into the GAC by simply dragging and dropping the file in the GAC folder, which is a shade easier than running GACUTIL.exe from the command prompt!
To copy the PDB into the GAC, you open a command prompt and change directories to %SYSTEMROOT%\Assembly\GAC_MSIL and find the subdirectory for the version of the assembly that you're going to load, and copy the PDB file there. Alternatively, you can use regsvr32 /u shfusion.dll to disable the shell extension and navigate the GAC folder normally using Windows Explorer. You have my sympathy -- this is not easy! What I'd like to see in a future version of Profiler is the ability to have the PDBs copied into the right places automatically, if that's practical.
If your application's assemblies exist in the GAC, it's important to remember that they will be loaded from there first, even if another copy of the assembly exists in the same folder as your application's executable! If you use ANTS Profiler for performance profiling in the 'show only methods that have source code' mode and you don't get any method hits or source code for your assembly, check the GAC -- your application may be loading the assembly from there rather than from the working folder.
It's also worth noting that you can generate a PDB for a RELEASE build the same as you can for a DEBUG build. In some versions of Visual Studio, the PDB is produced only when you have specified the DEBUG configuration by default, but even in RELEASE configuration the PDB can be generated using the 'generate debugging information' setting. Because DEBUG builds are less optimized, the PDB for a debug assembly may not be valid for a release version. Case in point:
'Hello, Red Gate Software, this is Brian, have you got a PDB?'
'Yes, you silly man, of course I do. However, the hit counts in the source code view of my application are all wrong!'
'Wrong? Wrong how?'
'Well, I have a blank line that got hit five times. Explain that!'
'Uhhhhhh...'
I guided him through rebuilding the application in DEBUG mode and redeploying the assembly and the PDB, which solved the problem. I suspect that the PDB was either corrupt or it was built for the release rather than the debug version of the assembly. Since the offsets for method information are potentially different between an optimised release build and a non-optimised debug build, I see this as a potential problem.
In another case, a programmer had everything set up correctly, but ANTS Profiler continually prompted him for the source code file. What went wrong? Absolutely nothing; it was working as-advertised. Since the PDB stores the absolute path to the source code file, it must be present in the folder that the application had originally been built from. This causes some potential inconvenience when moving the application to the testing environment and profiling it there, because the original source code tree needs to be replicated on that computer.
ASP .NET 2.0 uses a dynamic compilation model where the 'assembly' is compiled and cached on-the-fly in the 'Temporary ASP.NET Files' folder when the page is requested, so how do you get a PDB? Basically, the same way as from Visual Studio. Simply edit the web application's web.config file and set debug="True". The PDB is copied into the temporary folder and the web application can be profiled as normal.
Managed code can be extended by dynamically generating assemblies using CodeDom. A program may do this, for example, if it allows users to write their own 'macros' or if some information needs to be gathered first from the user before generating an assembly. In this case, you decide whether or not to generate symbols by setting CompilerParameters.IncludeDebugInformation to true. If the source code for the assembly is a string rather than a code file, though, you may be prompted for the location of the source code file (which of course, doesn't exist!). If the assembly is generated in-memory, then your only options are to profile all methods or specify the filter manually.
Once you understand what debugging symbols files are for and how they work, you can get more information from performance profilers like ANTS.
|
-
Posted Wednesday, September 05, 2007 9:13 AM |
When Fibonacci introduced the number zero to Europe, it must have blown their minds. People in the Fertile Crescent had been using zeros, probably for centuries, but for the people of this continent the concept of a symbol that meant 'nothing' was probably hard to grasp.
Enter the modern computer, which has the capacity to bewilder me on a daily basis. It can not only calculate numbers on both sides of 'nothing', but it also has many wholly separate ways of expressing 'nothing', none of which are compatible with each other. So for me, a hobbyist C programmer in my distant past, grappling with .NET's multiple ways of expressing 'nothing' was akin to the feeling that ancient people had when a mathematician attempted to express the idea of zero to civilizations used to counting in Roman numerals.
For instance, thanks to implicit type conversions in the C compiler (at least the one I was using), you could do this:
int i=0; if (!i) printf("true"); if (i==0) printf("true"); if (i==NULL) printf("true");
This logic does not compute in .NET, however. I first came across this when trying to integrate an ActiveX web browser into a Windows Forms application a few years ago. All that I wanted to do was a simple navigation:
axWebBrowser1.Navigate("about:blank",null, null, null,null);
The last four arguments to the Navigate method were not important to me. I only wanted to make a GET request for "about:blank", but the line above caused a compiler error: Cannot convert from '<null>' to 'ref object'. I need to pass a ref object, which is clearly 'something', but it must have a value of 'nothing'! After pondering this for a bit, it was necessary to lie on the sofa with a damp washcloth on my forehead and turn out all the lights to soothe my aching head.
After a recovery period and a beer, I beset Google to try to find the answer and uncovered some information about the trinity of 'nothings' in .NET: Missing, Nothing (null), and Zero.
Nothing seems to be imbued with magickal powers that can cause real objects to be disposed of by the runtime. For instance, a static object will never be garbage collected unless its' value is set to null, or Nothing in VB .NET. Missing is a special type of object in .NET that surprisingly doesn't cause a universe-crushing paradox by simultaneously being 'nothing' and 'something' at the same time. Zero is our old, comforting friend: although just a number, one that signifies an absence of quantity.
My determination was that a 'Missing' object was the right choice for signifying to the Navigate method that I didn't want to argue unnecessarily:
public object empty = System.Reflection.Missing.Value; axWebBrowser1.Navigate("about:blank",ref empty, ref empty, ref empty,ref empty);
Coincidentally, I had come across a use for null just yesterday, when creating a sample project to demonstrate finding a memory leak using ANTS Profiler. Needless to say, the demonstration was a success, although not in the way that I had planned.
using System; using System.Collections.Generic; using System.Windows.Forms; using System.Threading; namespace EventsExample
{
static class Program
{
static Form1 theForm; [STAThread] static void Main() {
Control.CheckForIllegalCrossThreadCalls = false; Application.EnableVisualStyles(); Application.SetCompatibleTextRenderingDefault(false);
theForm = new Form1(); Thread workThread = new Thread(new ThreadStart(DoWork)); workThread.Start(); Application.Run(theForm); theForm.Dispose();
}
public static void DoWork()
{
CacheManager c = new CacheManager(); c.CacheClear += new CacheEventHandler(theForm.c_onCacheClear); c.CheckCache(); } }
The demonstration was meant to prove that if the DoWork method was running in a background thread, then it was holding a reference to an instance of Form1's c_onCacheClear method and therefore theForm would not be garbage collected after the window is closed, but I got more than I bargained for. Have you spotted it yet? Yes, the 'theForm 'object is declared as static and ANTS Profiler reports that the object is live after closing the window. Adding 'theForm=null;' after Application.Run(theForm); will get the garbage collector to dispose theForm.
Understanding the meanings behind .NET's three nothings is another way to help improve the code that you write.
|
|
|