28 January 2014

Collect Your SQL Server Auditing and Troubleshooting Information Automatically

If you have a number of SQL Server instances with versions ranging from 2005 upwards, with a whole host of databases, and you want to be alerted about a number of diverse events that are useful for first-line problem-diagnosis and auditing, then Feodor's homebrew solution, using SSIS and Robocopy, is likely to be what you're looking for.


After many years, the Default Trace still remains the simplest way of auditing SQL Server. It gives you a great deal of useful information about significant events, when they happened and, where relevant, the login associated with the event. The Default Trace is, however, now deprecated in favor of Extended Events and so has not evolved much over the years. The biggest problem with it is that it consists of only five files of 20 MB each, and they get overwritten often, especially in a busy SQL Server environment.

This article shows how you can get around this difficulty in order to maintain an unbroken record of trace events. This is only the start.

We then tackle the problems of maintaining a record of these default trace events for a whole group of servers and their databases, and of using this archive for reporting and alerting on potential problems. We will do this by automating the process of extracting the default trace data from several SQL Server instances to a centralized location, persisting the data to a single database and preparing it for further analysis. The information this can provide about the way these servers and databases are being used is extremely valuable and difficult to get any other way. We can, for example, get a complete record of every change to every database object, when it happened and who did it.

For this purpose, we will be using a Robocopy script that offloads the default trace files from the remote servers, and then an SSIS package that imports the data into a database and deletes the imported files.

The steps are as follows:

  • Configure Robocopy to access the remote server and store the default trace files locally
  • Configure the SSIS package to look for the default trace files copied by Robocopy

We’ll use Robocopy because the tool can be used to:

  1. monitor the folder on a remote server that contains the default trace files
  2. detect any changes and periodically copy over any changed file

We choose Robocopy over SSIS for this job because an SSIS package would have to be scheduled to run quite often, and its copying process is not as lightweight.

Setting up Robocopy

The purpose of the Robocopy script in this case is to maintain a copy of the Default Trace files in a centralized location, since the default trace log files on each SQL Server instance are overwritten after a certain time.

Scheduling this is a bit tricky, and depends on each individual SQL Server instance. On a very busy production server, for example, all five default trace files might be overwritten every 10 minutes, whereas on another SQL Server instance it may take five days for the files to be overwritten. How quickly the files are overwritten depends on the volume of traced events occurring on the system and also on instance restarts.

This is why it will take some investigation to understand and schedule the Robocopy script in individual cases.

For the purpose of this article I will use a setting for Robocopy to check for changes in the default trace files every 10 minutes, on the assumption that this interval would be geared to the number of events being recorded in the trace for the individual server.

The script executes Robocopy, which looks at the default trace folder of the SQL Server instance and copies any changes over to a local folder. The script itself is included in the download at the head of the article.
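What follows is only a minimal sketch of the kind of command it runs, with placeholder server and folder names:

    rem Copy new or changed default trace files (log_*.trc) from the source
    rem instance's Log folder to that server's sub-folder on the central machine.
    rem Paths below are placeholders.
    robocopy "\\SQLPROD1\F$\MSSQL\Log" "\\AuditServer\Traces\SQLPROD1" log_*.trc /XO /NP /R:3 /W:10

The /XO switch skips files that are older than the copy already at the destination, so only new or changed trace files travel over the network; scheduling the command every 10 minutes (with Windows Task Scheduler, for example) gives the polling behavior described above.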

Note that the script uses UNC paths for the file storage locations. This means that it is up to you whether the Robocopy script is scheduled to run on the source machine or on the destination machine. (From my personal experience, it is better to have all Robocopy scripts run from the same destination machine; it is easier to monitor and maintain.)

Also note that the Destination folder contains a sub-folder for each monitored server. This is used later on in the configuration of the SSIS package.

Setting up the database

The database script, which creates the necessary objects, is included in the download.
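As a rough sketch of what it creates, the config table described below might look something like this (the table name comes from the article; the column types are my assumption):

    CREATE TABLE dbo.ProcessingTrace_Config
    (
        ServerName nvarchar(128) NOT NULL, -- the instance being audited
        TracePath  nvarchar(512) NOT NULL, -- local folder holding its copied trace files
        isActive   bit           NOT NULL  -- whether this server's files should be processed
    );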

After creating the objects, we have to populate the config table.
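For example, here is a hypothetical row for one monitored server (the path must match the per-server sub-folder that Robocopy writes to):

    INSERT INTO dbo.ProcessingTrace_Config (ServerName, TracePath, isActive)
    VALUES (N'SQLPROD1', N'\\AuditServer\Traces\SQLPROD1', 1);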

The table contains three columns:

  1. Server name – the name of the server that is audited
  2. Trace path – the local folder where the default trace files for that server are stored
  3. isActive – a flag indicating whether the files should be processed

Importing the default trace files

The SSIS package takes its configuration from the dbo.ProcessingTrace_Config table.
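Presumably via an Execute SQL Task, it first builds the list of servers and folders to process; a sketch of that query might be:

    SELECT ServerName, TracePath
    FROM dbo.ProcessingTrace_Config
    WHERE isActive = 1;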

Then a Foreach Loop container executes once for every record in the config table, importing each trace file into a scrubbing table called dbo.temp_trc.
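The loading mechanism itself lives in the package; one plausible sketch, using the built-in sys.fn_trace_gettable function to read a single copied trace file into the scrubbing table, is:

    -- Read one copied trace file into the scrubbing table; the file path
    -- would come from the loop variable, and 1 means 'just this file'.
    INSERT INTO dbo.temp_trc
    SELECT *
    FROM sys.fn_trace_gettable(N'\\AuditServer\Traces\SQLPROD1\log_123.trc', 1);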

From there the default trace data is queried by event groups and merged into separate tables.

The idea is that, since we do not know how often the default trace files change for each server, and since the files have a maximum size of 20 MB each (but may be much smaller), it is actually more efficient to import them all and merge them than to write custom logic to track which files have and have not been imported. (The performance overhead of importing 20 MB trace files and using the MERGE script is minimal. I tested this by populating 1 million rows in each table with Redgate's SQL Data Generator, and even in that case the import was fast.)

In short, the Robocopy script makes sure that the files are stored and kept up to date on our local storage, and we can then schedule the SSIS package to import them whenever we like.

The events are split in the following categories, and each category is represented by a database table:

  • FileGrowAndShrink
  • LogFileAutoGrowAndShrink
  • ErrorLog
  • SortAndHashWarnings
  • MissingStatsAndPredicates
  • FTSearch
  • AlteredObjects
  • CreatedUsersAndLogins
  • DroppedUsersAndLogins
  • LoginFailed
  • ServerStarts
  • MemoryChangesEvents

A typical merge operation is the one sketched below, for sort and hash warnings. (The rest are in the SSIS package that you can download from the link at the head of the article; the scripts can also be viewed here.)
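Since the real scripts are in the download, the following is only an illustrative sketch for the SortAndHashWarnings table; the column list and the match key are my assumptions:

    -- In the trace event class list, 55 = Hash Warning and 69 = Sort Warnings.
    MERGE dbo.SortAndHashWarnings AS tgt
    USING (SELECT EventClass, StartTime, SPID, DatabaseName, ApplicationName, LoginName
           FROM dbo.temp_trc
           WHERE EventClass IN (55, 69)) AS src
    ON  tgt.EventClass = src.EventClass
    AND tgt.StartTime  = src.StartTime
    AND tgt.SPID       = src.SPID
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (EventClass, StartTime, SPID, DatabaseName, ApplicationName, LoginName)
        VALUES (src.EventClass, src.StartTime, src.SPID,
                src.DatabaseName, src.ApplicationName, src.LoginName);

Because the same file may be imported more than once, it is the WHEN NOT MATCHED clause that keeps the permanent table free of duplicates.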

After extracting and merging the data, the last step is to delete all the files from the filesystem that are older than 1 day.

Note that the scheduling of the Robocopy script and the SSIS package is an individual matter, and depends on the systems being audited. If the default trace files are overwritten often on the source system, then we might want to run the Robocopy task and the SSIS package more often.

For the purpose of this article I have set up the SSIS Script Component to delete files older than 1 day.

The C# script for the component is included in the package download.
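As a stand-in, here is a minimal sketch of the cleanup logic (the root folder is a placeholder, and a real Script Task would wrap this in its ScriptMain class and set Dts.TaskResult on completion):

    using System;
    using System.IO;

    public void Main()
    {
        // Assumption: the same root folder that Robocopy copies the traces into.
        string root = @"\\AuditServer\Traces";
        DateTime cutoff = DateTime.Now.AddDays(-1);

        // Delete any copied trace file that has not changed in the last day.
        foreach (string file in Directory.GetFiles(root, "*.trc", SearchOption.AllDirectories))
        {
            if (File.GetLastWriteTime(file) < cutoff)
                File.Delete(file);
        }
    }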

Conclusions

This article shows how the default trace logs of a number of SQL Servers can be aggregated and preserved on a centralized auditing server, and then imported into a central auditing database via an SSIS task that filters and merges the results into a number of tables. These tables give a central record of diverse events that are useful for first-line problem diagnosis, such as database and log file growth and shrinkage, error log information, a variety of warnings, notice of created, altered or dropped database objects, users or logins, failed logins, server starts, and memory change events.

Now that we have all this information in one place for all our servers, we have the opportunity for first-line alerting on a number of signs that things are going wrong, telling us when we need to reach for our monitoring system to find out more about what is going on within that server, and maybe also a particular database.

With this database in place, a number of data-mining possibilities open up for this data. We'll go into more detail about this in a subsequent article.

The SSIS package is downloadable from the link at the head of the article, as is the SQL source of the scripts. You can view all the SQL merge scripts via the browser by clicking here.

This article is part of our database delivery patterns & practices series on Simple Talk.

Find more articles on version control, automated testing, continuous integration and deployment.


Feodor Georgiev


  • Jeff_yao

    Nice idea but …
    With a few hundred instances (lots of named instances), and 10% of instances being replaced with new ones or consolidated in a year, it will be a nightmare to maintain the RoboCopy script itself.

    For trace collection, we actually only need to know one piece of information, i.e. the sql instance name.

    So a better solution is to use PowerShell to do the following:
    1. Build a sql instance inventory collection system (can be combined with AD info for physical server names)
    2. Execute the local trace collection script on each target sql instance
    3. Write the info collected back to a central repository

    Nevertheless, for a small number of sql instances (that are not changing frequently), I think your solution is solid.

  • Abdul Majeed

    DTS Package is not working
    Hi,
    Can you send me the DTS package, as the attached package is not working?

    Thanks, Majeed

  • luismarinaray@gmail.com

    Availability merge command
    Hi,

    Excellent article, but if you are using MS SQL Server 2005, how can I use the MERGE command? It is not available in 2005.

    Thanks
