Tony Davis

Simple-Talk Editor
News, views and good brews

The DBA Daily Checklist

Published Tuesday, May 13, 2008 3:56 PM

The average DBA has to perform many routine checks on his or her servers. There will be daily checks, weekly checks and, probably, monthly checks. A short while ago, we were having quite a debate about what items would be on the DBA Daily checklist. Because we all like lists with ten items in them, we compiled a list that goes something like this:

 

  1. Connectivity: Make sure each database is available and accessible both by logging in via user applications, as well as running test scripts.
  2. Backups: Check database and log backups, archiving and offsite storage.
  3. Events: check all database logs, application logs and system logs, Agent history, device logs, NIC logs etc. Investigate any job failures
  4. Processes: Check that all required processes on the server are running, including Replication
  5. Integrity: Perform all database and server integrity checks; look for objects that break rules
  6. Indexes: Check on indexes to see if they are being used, need re-creating, or if any are missing
  7. Volumetrics: Check resources on the server such as files sizes and disk space, and monitor growth
  8. Performance: Check application performance, and performance statistics, using the Perfmon tool; research and resolve any issues.
  9. Procedures: Check all Disaster Recovery Plans
  10. Security: Look for security policy violations

 

We thought we had a fairly exhaustive list, but then a Simple-Talk forum entry made us think again. Randyvol had posted in response to our DDL Trigger workbench:

 

"…last week some idiot turned a host of triggers off in our ERP system, causing a cascade of posting problems on dozens of orders before we caught the root cause…"

 

He wanted to know if there was some way of checking whether Triggers have been disabled. It may be unusual and obscure, but it makes perfect sense that on some production servers you will need to check to see if triggers have somehow been disabled. The consequences could be dire.

 

We added this check to our list but then began thinking: what else are we missing? There are so many different ways that SQL Server is used that there must be a host of other checks that could make all the difference.

 

As always, we'd like to hear what you think. Post your suggestions as a comment to this blog, and you'll go into the draw to receive one of five Simple-Talk gift bags!

 

Cheers,

Tony.

Comments

 

ChssAddct said:

You mention checking for missing Indexes.  How about checking for missing Tables!

For Instance:  A curious type sees a table named Numbers.  They look inside and see it has a single column, filled only with the numbers from 1 to 10,000.  What a ridiculous, useless table, right?  Clearly a dust-bunny from somebody experimenting with some SQL!  What use could a table with all the numbers from 1 to 10,00 possibly have?  I think I'll drop this table right now.  And anyway, if it's really needed, how hard can it be to write a loop and create a new one?

Of course, they've now just dropped the "Swiss Army Knife" of SQL developers (see
http://www.simple-talk.com/sql/t-sql-programming/the-helper-table-workbench/).  It's used in UDFs, Stored Procs, Scripts, Scheduled Jobs, and apps.  A seemingly useless looking table, yet can't do without it!

Note, Robin & Phil, a 'cleverer' way in this article of creating the table instead of using a loop:  http://www.sqlservercentral.com/articles/TSQL/62867/

Of course, checking for missing tables would also entail those tables that contain crucial data, but this is the first table that comes to mind that would simply start to wreak havoc were it to go missing.  Easily created -- sure.  Unrecoverable loss of business data? -- nope.  Crucial to have around?  Absolutely.  

Although, as much as I use it, I wouldn't ever have to perform a check for it having gone missing.  The cell phone would be blaring "help! things just stopped working".

Oh, and I
May 14, 2008 4:20 AM
 

Phil Factor said:

Agreed that there are loads of clever ways of doing the numbers table, but we thought that introducing them would distract from the main point.

Having spent a while doing a daily policing job on a database-driven website, my daily checks always included a lot of checks for intrusion. I was always amazed by  the number of such attempts, some of them of great ingenuity. In the end, having done the obvious checks, I was looking for any unusual activity or changes in the pattern of usage.  Snort (http://www.snort.org/) is great, but I'd love a way of doing it specifically for SQL Server that is less propellor-head.
May 14, 2008 4:55 AM
 

RLaubert-MCITP said:

One of the things that is becoming almost daily are patches.  I currently have patches for the OS, SQL Server, Assorted OS programs (IE, .NET, ASP etc), the applications, IIS, Visual Studio and the list goes on and on.

Each of these patches has to be tested on development, staging and then production.  So it is an almost daily process that somewhere, one or more of these are taking up some of my time.
May 14, 2008 7:34 AM
 

MVV said:

- Find what it is there and shouldn't (Unused indexes/tables/SP) left behind by previous incarnations of the apps. Check as well for unused or must delete access (fired workers or quitters.) I was amazed by the resilience of logins. Some of the logins i found were so ancients , nobody remembered the guys/gals.

May 14, 2008 9:23 AM
 

Rodney said:

These may fall somewhere within the listed categories, but these are some I thought of as I read through; ones that I check regularly.

* Long running or "hung" processes, especially those tied to a SQL Agent job.
* Make sure that SQL Mail, SQL Agent Mail, Database Mail, SQL Backup or xp_SMTP_Sendmail is working correctly, especially if you have jobs or backup code notifications sewed to these. Issues with mail queues with MAPI clients are common (not sql 2005 or smpt thankfully)
* SQL Agent Job modifications. In organizations that do not have a centralized scheduling system and rely on SQL Agent, any schedule modificaiton can have negative impact. I use a report that shows which jobs were modified and when, unfortuantely it does not reveal "who", but with SSRS execution snapsots, I can see what was changed.
May 14, 2008 10:42 AM
 

johnc said:

SQL Agent up and running.

SQL Agent Jobs that fail.

Databases that automatically grew.

-jfc-

May 14, 2008 11:01 AM
 

jimbow said:

Clearly, depending on your enterprise scope, how many hours are there in the business day to deal with all the 10 or more points - on a daily basis?   Just checking the complete status of 100 database backups is not a trivial task. Most times you would rely on the engine  to report success for the whole process in a single notification.    Is that good enough to get on with the next item in the day's chores ?

Similarly if it can be automated and all the health checks can be trusted to be done that way - daily checks are then to ensure that the automation itself is working and the notifications to be dealt with etc are as reliable and minimal  as possible.

The DBA is not necessarily an individual.
May 14, 2008 11:26 AM
 

randyvol said:

Randyvol here again (in the process of creating DDLtriggers for my databases!)

Referring to the post about missing tables, (I think I've posted this elsewhere, but I will do so here again), what about a way to list the last time a table had a SELECT, INSERT, UPDATE  or DELETE operation against it?

Why would I want to know that?  I have a legacy SQL Server 2000 system with literally hundreds, maybe thousands of tables across a dozen or so databases that I suspect are in the 'dustbunny' category.  But we still have so many jobs running against this system that it would take more time than I have to parse through all the scripts, sprocs, triggers and what have you to collect a list of 'keepers'.

If I could just figure a way to know the above and run a query to find tables that say, have had no activity in the last qtr, 6 months, or year; I'd say it is a candidate to get blown away and give me back some disk!

Regards.
May 15, 2008 8:57 AM
 

imassi said:

Checking to make sure there is a pot of coffee brewing actually has proven an effective check.  There've been a couple of times this week that someone asked me to look into something, but have it turn out that they weren't paying attention and didn't click the right thing.  They apologized and said that they hadn't had their coffee yet.  If only I could write a script to automate this process...
May 15, 2008 2:20 PM
 

Ron Dameron said:

We use log shipping extensively.  

So, checking that all the different log shipping plans are synchronized is on my top ten list.



May 16, 2008 12:18 PM
 

sagreene said:

There used to be a good page here with daily/weekly/monthly tasks. Maybe it will be back up:
It's the sql 2000 Operations Guide.
http://www.microsoft.com/err/technet/prodtechnol/sql/2000/maintain/sqlops2

The trigger issue is easy to fix - take away DDL rights from people.

It's best to have competent people at the controls.  Trying to plug up every possible mistake with monitoring systems is probably a poor idea.

Monitoring a few critical functions and restricting access is a nice balance.

Here is how I did a few monitoring tasks.  Props to Rodney for the SSIS package which was the basis for all of this.

Connectivity:  

Set up a SSIS which polls all servers 'select @@servername' every 10 minutes.  Pages and emails on failures.  www.zabbix.com - this monitoring software can monitor applications and post data to forms - then you can use regex to verify you get the correct results.  Also gives you nice graphs of response times and can alert if they fall out of bounds.  I was hoping to use this to monitor sql reports as a start, but it cannot authenticate to windows...and we don't have kerberos working.

Backups: Check database and log backups, archiving and offsite storage.

I've been working on a SSIS package that retrieves the most recent backup file location and restores it on a test server.  I've been told the only way to really test a backup is to restore and use it.  This will let me restore a chunk of databases - say 50 or so and notify analysts to change their connections and test.  Sweet!

Volumetrics/Performance:

We've been using software called Zabbix for monitoring this type of stuff, free, template features and nice graphing.  I monitor, drive space, cpu usage and memory - handles have been useful and have alerted us to problems on a few servers.

Security: Look for security policy violations

You can do the same thing with an SSIS package, gathering security facts and putting them in a repository then generating sql reports that can be scheduled.

I also do a weekly pager test from our main clusters by raising a high severity alert and making sure that sends an email and page.  

This article was great for getting that going:  http://www.simple-talk.com/sql/database-administration/sql-server-alerts-soup-to-nuts/




May 21, 2008 3:53 PM
 

Eric Russell said:

Regarding the previously described issues of disabled triggers and dropped tables in production, it sounds like enforcing a change management policy would make the DBA's life a lot easier. Ideally, only the DBAs and the application or services have a login to production.
One thing I like to do occasionally (especially when troubleshooting) is use SQL Compare to generate a report of object schema differences between a daily snapshot of production and the pre-prod QA server, just to confirm that nothing has been ad-hocly deployed since the last official release. In addition to just schemas, changes to meta data and configuration settings that can cause subtle changes in the applications behaviour that are hard to diagnose, and using SQL Data Compare on specific user and system tables can cover that base as well.
May 22, 2008 8:59 AM
You need to sign in to comment on this blog

















<May 2008>
SuMoTuWeThFrSa
27282930123
45678910
11121314151617
18192021222324
25262728293031
1234567
Go With the Flow
 Knowing enough about the routes that messages take is vital to being an effective Exchange admin,... Read more...

When Email Collaboration Could Have Changed History
 In our mission to make history relevant to the busy IT executive, we speculate how Email might have... Read more...

Bunnikins!
 When an IT manager is selected as a victim of office politics of a large corporate, it is time for him... Read more...

Exchange Database Technologies
 One of the most misunderstood technologies in Exchange Server, regardless of its version, is the... Read more...

Top Tips for Exchange Admins
 Michael Francis hands out imaginary Olympic medals to the winner of the August 'Top Tips for Exchange... Read more...