Theo Spears

  • Learnings from trying to write better software: Loud errors from the very start

    Posted Tuesday, October 11, 2011 10:50 AM | 0 Comments

    Microsoft made very few backwards-incompatible changes between .NET 1.1 and 2.0, because they wanted to make it as easy and safe as possible to port applications to the new runtime. (Here’s a list.) However, one thing they did change was what happens when a background thread fails with an unhandled exception. In .NET 1.1 nothing happened: the thread terminated and the application continued, oblivious. Try the same trick in .NET 2.0 and the entire application, including all threads, will rudely terminate.

    There are three reasons for this. Firstly, if a background thread has crashed, it may have left the entire application in an inconsistent state, in a way that will affect other threads. It’s better to terminate the entire application than to continue and have it act on broken state, for example by taking customer orders or writing corrupt files to disk.

    Secondly, during software development, it is far better for errors to be loud and obtrusive. Even if you have unit tests and integration tests (and you should), a key part of ensuring software works properly is to actually try using it, both through systematic testing and through the casual use all software gets from its developers while they work on it. Subtle errors are easy to miss if you are not doing real work with the application; loud errors are obvious.

    Thirdly, and most importantly, even if catching and swallowing exceptions indiscriminately doesn't cause any problems in your application, the presence of unexpected exceptions shows you do not fully understand the behavior of your code. The currently released version of your application may be absolutely correct. However, because your mental model of the behavior is wrong, any future change you make to the program could and probably will introduce critical errors.

     This applies to more than just exceptions causing threads to exit: any unexpected state should make the application blow up in an un-ignorable way. The worst thing you can do is silently swallow errors and continue. And let's be clear, writing to a log file does not count as blowing up in an un-ignorable way.
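
     To make the background-thread case concrete, here is a minimal sketch of the "loud" behavior. It is written in Python rather than .NET, purely to illustrate the principle, and the names LoudThread and flaky_worker are hypothetical. By default a Python thread that raises behaves much like the .NET 1.1 case: the thread dies, a traceback goes to stderr, and the rest of the program carries on. The sketch instead remembers the failure and re-raises it in the thread that calls join().

```python
import threading


class LoudThread(threading.Thread):
    """Thread that remembers an unhandled exception and re-raises it in join()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.failure = None  # set if the thread body raised

    def run(self):
        try:
            super().run()  # runs the target passed to the constructor
        except BaseException as exc:  # deliberately broad: nothing may vanish
            self.failure = exc

    def join(self, timeout=None):
        super().join(timeout)
        if self.failure is not None:
            # Surface the failure in the thread that is waiting for the result.
            raise RuntimeError("background thread failed") from self.failure


def flaky_worker():
    # Hypothetical worker that hits an unexpected state.
    raise ValueError("order total is negative")
```

     Calling LoudThread(target=flaky_worker).start() followed by join() raises a RuntimeError whose cause is the original ValueError, so the caller cannot carry on as if the work had succeeded. This is softer than terminating the process, which is what .NET 2.0 does; it is only meant to show the difference between a failure that vanishes and one that cannot be ignored.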

     This is all simple as long as the call stack only contains your code, but when your functions start to be called by third-party or .NET Framework code, it's surprisingly easy for exceptions to start vanishing. Let's look at two examples.

     

    1. Windows Forms drag and drop events

     Usually, if you throw an exception from a WinForms event handler, it will bring up the "application has crashed" dialog with abort and continue options. This is a good default behavior: the error is big and loud, but the user can ignore it and hopefully save their data if the bug somehow makes it past testing. However, drag and drop events are different: throw an exception from one of these and it will be silently swallowed with no explanation.

     By the way, it's not just drag and drop events. Timer events do it too.

     You can research how exceptions are treated in different handlers and code appropriately, but the safest and most user-friendly approach is to always catch exceptions in your event handlers and show your own error message. I'll talk about one good approach to handling these exceptions at the end of this post.
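
     The "always catch in the handler and show your own message" approach can be sketched as a small wrapper applied to every event handler. This is a language-neutral illustration in Python, not WinForms code; loud_handler and print_error are hypothetical names, and print_error stands in for whatever error dialog your application shows.

```python
import functools
import sys
import traceback


def loud_handler(show_error):
    """Wrap an event handler so any exception is reported via show_error."""

    def decorate(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                # Report here, before the framework calling us can swallow it.
                show_error(handler.__name__, exc, traceback.format_exc())
                return None

        return wrapper

    return decorate


def print_error(handler_name, exc, details):
    # Stand-in for a real error dialog (an assumption for this sketch).
    print(f"Error in {handler_name}: {exc!r}\n{details}", file=sys.stderr)
```

     Decorating a handler with @loud_handler(print_error) means the failure is reported the same way no matter which framework code invoked the handler, so it no longer matters whether that particular event type swallows exceptions.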

     

    2. SSMS integration for SQL Tab Magic

     A while back I wrote an SSMS add-in called SQL Tab Magic (learn more about the process here). It works by listening to certain SSMS events and remembering what documents are opened and closed. I deployed it internally and it was used for a few months by a number of people without problems, so I was reasonably confident in its quality. Before releasing it I made a few cleanups, including introducing error reporting.

    Bam. A few days later I was looking at over 1,000 error reports in my inbox. It turns out I wasn't handling table designers properly. The exceptions were there, but again SSMS was helpfully swallowing them all for me, so I was blissfully unaware. Had I made my errors loud from the start, I would have noticed these issues long before and fixed them.

     

    Handling exceptions

     Now that you are systematically catching exceptions throughout your application, you need to do something with them. I've tried three options: log them, alert the user, and automatically send them home.

     There are a few good options for logging in .NET. The most widespread is Apache log4net, which provides a very capable and configurable logging framework. There is also NLog, which has a similar interface but puts a greater emphasis on fluent rather than XML configuration.

     Alerting the user serves two purposes. Firstly, it means they understand that their action has failed, so they don't just assume it worked (a silent file copy failure is a problem if you then delete the originals) or keep waiting for a background task that will never complete. Secondly, it means users can report the bug to your support team, so you can fix it. The message you show should therefore contain the information you need as a developer to identify and fix the problem. The user will probably just send you a screenshot of the dialog, so that information shouldn't be hidden behind scroll bars.
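
     As a rough illustration of a message that serves both the user and the developer, the sketch below builds a short text containing what the user was trying to do, the exception type and message, and the last few stack frames. It is Python, not .NET; format_error_dialog and the max_frames limit are hypothetical choices made so the text fits in a dialog without scrolling.

```python
import traceback


def format_error_dialog(action, exc, max_frames=5):
    """Build a short, screenshot-friendly error text for a caught exception."""
    # Keep only the innermost frames so the text stays visible without scrolling.
    frames = traceback.extract_tb(exc.__traceback__)[-max_frames:]
    lines = [
        f"Could not {action}.",
        f"{type(exc).__name__}: {exc}",
        "Where it happened (most recent call last):",
    ]
    for frame in frames:
        lines.append(f"  {frame.name} ({frame.filename}:{frame.lineno})")
    return "\n".join(lines)
```

     The first line tells the user which action failed; the remaining lines are what a developer needs from the screenshot.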

     This leads us to the third option, automatically sending error reports home. By automatic I mean with minimal effort on the part of the user, rather than doing it silently behind their backs. The advantage of this is you can send back far more detailed and precise information than you can expect a user to include in an email, and by making it easier to report errors, you make it more likely users will do so.

     We do this using a great tool called SmartAssembly (full disclosure: this is a product made by Red Gate). It captures complete stack traces, including the values of all local variables, and then allows the user to send all this information back with a single click. We also capture log files to help understand what led up to the error. We then use the free SmartAssembly Sync for Jira to dedupe these reports and raise them as bugs in our bug tracking system.
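
     How SmartAssembly Sync for Jira actually matches reports is not described here, so the following is only an assumed, simplified sketch of the general idea of de-duplicating error reports: reduce each report to a fingerprint of the exception type plus the function names on the stack, then count reports per fingerprint. It is written in Python, and fingerprint and group_reports are hypothetical names.

```python
import hashlib
import traceback
from collections import Counter


def fingerprint(exc):
    """Short hash of the exception type and the function names on its stack."""
    frames = traceback.extract_tb(exc.__traceback__)
    # Ignore messages and line numbers so the same bug with different data
    # (or from a slightly different build) lands in the same group.
    parts = [type(exc).__name__] + [frame.name for frame in frames]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:12]


def group_reports(exceptions):
    """Count how many reports share each fingerprint."""
    return Counter(fingerprint(exc) for exc in exceptions)
```

     With grouping like this, a thousand reports caused by one bug become a single issue with a count of a thousand, which is far easier to triage than an inbox full of individual emails.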

     The combined effect of loud errors during development and automatic error reporting once software is deployed allows us to find and fix more bugs and correct misunderstandings about how our software works, and overall is a key piece in delivering higher quality software. However, it is no substitute for having motivated, cunning testers in the building - and we're looking to hire more of those too.

     

    If you found this post interesting you should follow me on twitter.

     

  • Introducing: SQL Tab Magic

    Posted Thursday, August 11, 2011 10:53 AM | 1 Comments

    Yesterday I wrote about Down Tools Week and trying to build a working product in 5 days. I also released the first version of the tool to a group of people in our early access program, and they have spent the last 24 hours trying it out, reporting bugs, and giving me lots of feedback. I've spent the last 6 hours frantically fixing some of the bugs, getting ready for a public release, and trying to remember to breathe.

    So, with a big fat not-even-beta-yet label slapped on, here is SQL Tab Magic:

    Tabs are automatically restored when you reopen SSMS

    Reopen tabs that you have closed manually

    Search open tabs and jump directly to the one you want

    Download SQL Tab Magic from the Red Gate website.

  • Slaying Man-Months: Building SQL Tab Magic in 5 days

    Posted Wednesday, August 10, 2011 10:18 AM | 1 Comments

    Every three months, all work at Red Gate stops. Projects are suspended, and the developers, testers, user experience specialists, and technical authors here spend a week working on projects of their choice. It's a great way to get back energy and enthusiasm for our jobs, but is it really possible to build a useful new product in a week?

    I had no idea, but with a slightly naive optimism towards time estimates I set out to try. My vision was to build an add-in for SQL Server Management Studio that remembered what tabs I had open when I restarted my computer. I often have many ad-hoc queries open, and could never be bothered to save all 20 of them before restarting to apply the omnipresent Windows critical update patches, so inevitably hours of work got lost. But Firefox just restores all my tabs when I open it back up, so why couldn't SSMS do the same?

    The first step towards making this happen was some blatant, shameless theft. I was working on SQL Source Control, so I stole its SSMS add-in code. SQL Prompt has code to interact with currently open query windows, so I stole that. SQL Search has code to open up new query windows, so I stole that too. I learnt a lot about writing maintainable code in the process - maintainable code is code which is easy to steal.

    After 2 days of hacking I had the product basically done. It remembered tabs when you closed SSMS. It opened them up when you started it up again. Things were looking up. I had even implemented some additional bonus features, like an option to reopen the last closed tab.

    Except it was a million miles away from Red Gate quality. Tabs when reopened didn't get the correct names. I wanted to show a list of recently closed tabs as a drop down from the toolbar, but the SSMS APIs couldn't support this. Connection information wasn't correctly saved and restored. Just minor tweaks to make the product that much better.

    Minor tweaks aren't so minor. Finding a hack to get the drop down working took the best part of a day. Saving connection information took a day, and I still didn't get it working. Getting tab names right took half a day and some of the most ugly code I've ever written. If you're keeping track you'll have noticed there isn't any time left. Two and a half days for no new features, just a few tiny tweaks. Head, meet desk.

    However, it was done. Finished. It had to be; the demo was starting in 5 minutes. And it really was ready, it worked, and lots of people asked me for copies.

    It wasn't really done though. It was some code that ran, but it wasn't a product. There was no installer. There was no facility for updating to newer versions. It wasn't obfuscated. There were no licensing restrictions. There was no web site explaining how to use it. That's what I've been doing this week, and tomorrow I'll be releasing it into the wild for you to try.

    Check back here for the link, or follow me on twitter.

    Update: Download SQL Tab Magic from the Red Gate website

    So can you make and release a product in 5 days? No, it takes at least 8.

  • Sharing configuration settings between Windows Azure roles

    Posted Wednesday, March 16, 2011 4:45 PM | 0 Comments

    If you are working on a medium-large Windows Azure project, it's likely to involve more than one role, for example separate web and worker roles. Unfortunately, although all the Windows Azure configuration settings are stored in a single cscfg file, there is no way to share configuration settings between multiple roles. This means you have to duplicate common settings like connection strings across all your roles. There is an open Connect issue about this topic, but Microsoft have not said when they will fix it.

    In the meantime, I've put together a dirty, dirty hack (sorry, a cunning workaround) that creates a fake role containing your shared configuration settings, and copies them to all roles as part of the build process. Here's how you set it up:

    1. Download the zip file attached to this post, and unzip it into the folder containing your Azure project (not your solution folder).

    2. Edit your csdef and cscfg files to include the placeholder role, which is called GLOBAL in the listings below.

    ServiceDefinition.csdef

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceDefinition
        name="AzureSpendNotifier"
        xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
      <!-- Placeholder role: declares the settings shared by all other roles -->
      <WorkerRole name="GLOBAL">
        <ConfigurationSettings>
          <Setting name="ExampleSetting" />
        </ConfigurationSettings>
      </WorkerRole>
      <WorkerRole name="MyWorker">
        <!-- Must be present, even if empty -->
        <ConfigurationSettings>
        </ConfigurationSettings>
      </WorkerRole>
      <WebRole name="MyWeb">
        <Sites>
          <Site name="Web">
            <Bindings>
              <Binding name="WebEndpoint" endpointName="WebEndpoint" />
            </Bindings>
          </Site>
        </Sites>
        <!-- Must be present, even if empty -->
        <ConfigurationSettings>
        </ConfigurationSettings>
      </WebRole>
    </ServiceDefinition>

    ServiceConfiguration.cscfg

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceConfiguration
        serviceName="AzureSpendNotifier"
        xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
        osFamily="1" osVersion="*">
      <!-- Placeholder role: holds the values of the shared settings -->
      <Role name="GLOBAL">
        <ConfigurationSettings>
          <Setting name="ExampleSetting" value="Hello World" />
        </ConfigurationSettings>
        <Instances count="1" />
      </Role>
      <Role name="MyWorker">
        <Instances count="1" />
        <!-- Must be present, even if empty -->
        <ConfigurationSettings>
        </ConfigurationSettings>
      </Role>
      <Role name="MyWeb">
        <Instances count="1" />
        <!-- Must be present, even if empty -->
        <ConfigurationSettings>
        </ConfigurationSettings>
      </Role>
    </ServiceConfiguration>

    It is important that all your roles contain a ConfigurationSettings entry in both the cscfg and csdef files, even if it's empty; otherwise the shared configuration settings will not be inserted.

    3. Open your Azure deployment project (.ccproj) in Notepad, and add the globalsettings.targets import line shown below:

      ... 
    <Import Project="$(CloudExtensionsDir)Microsoft.CloudService.targets" />
    <Import Project="globalsettings/globalsettings.targets" />
    </Project>

    It is important you add this below the Microsoft.CloudService.targets import line, as it replaces some of the rules defined in that file.

    Visual Studio will prompt you to reload the project; say yes. At this point you will have a new Azure role called 'GLOBAL' with settings you can edit through the Visual Studio properties panel as normal. This role will never be deployed, but any settings you add to it will be copied to all your other roles when deployed or tested locally within Visual Studio.
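
    The contents of globalsettings.targets are not shown in this post, so the following is not that file. It is only a rough Python sketch of the transformation described above, applied to a cscfg document: copy every setting from the GLOBAL role into each other role that has a ConfigurationSettings element, then drop the GLOBAL role. The function name copy_shared_settings is hypothetical, and the choices to leave existing role-specific settings untouched and to remove the GLOBAL role from the output are assumptions of this sketch. The csdef file would need the matching declarations copied in the same way.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
ET.register_namespace("", NS)  # keep the default namespace when serializing


def _q(tag):
    """Qualify a tag name with the cscfg namespace."""
    return f"{{{NS}}}{tag}"


def copy_shared_settings(cscfg_text, shared_role="GLOBAL"):
    """Return cscfg XML with the shared role's settings copied into every other role."""
    root = ET.fromstring(cscfg_text)
    roles = root.findall(_q("Role"))
    shared = next((r for r in roles if r.get("name") == shared_role), None)
    if shared is None:
        raise ValueError(f"no role named {shared_role!r} in configuration")
    block = shared.find(_q("ConfigurationSettings"))
    shared_settings = [] if block is None else block.findall(_q("Setting"))

    for role in roles:
        if role is shared:
            continue
        target = role.find(_q("ConfigurationSettings"))
        if target is None:
            continue  # as the post warns: no ConfigurationSettings, nothing inserted
        existing = {s.get("name") for s in target.findall(_q("Setting"))}
        for setting in shared_settings:
            if setting.get("name") not in existing:  # assumption: role settings win
                target.append(ET.Element(_q("Setting"), dict(setting.attrib)))

    root.remove(shared)  # assumption: the placeholder role is dropped from the output
    return ET.tostring(root, encoding="unicode")
```

    Running this over the ServiceConfiguration.cscfg above would give MyWorker and MyWeb their own copy of ExampleSetting, which is the effect the build step is meant to achieve.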
