How many indexes is the right amount?Published 14 December 2011 3:00 am
I would be deafened by the chorus of DBA’s shouting “It depends” at me after a question like that in person, I somehow feel safer from the far side of Live Writer.
I have recently been looking at the number of indexes in one of my databases and have found a curiously high number of indexes in comparison to the number of columns.
Right, all the code now and some discussion underneath it
To locate tables that may be over populated with indexes and
list them in order of concern.
IF OBJECT_ID('tempdb..#IndexStats') > 0
DROP TABLE #IndexStats ;
AS ( SELECT [ddips].[object_id] ,
COUNT([ddips].[index_id]) AS [Index_Count]
FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL,
NULL, NULL) AS ddips
WHERE [ddips].[index_id] <> 0
GROUP BY [ddips].[object_id]
AS ( SELECT [c].[object_id] ,
COUNT([c].[object_id]) AS [Column Count]
FROM [sys].[columns] AS c
GROUP BY [c].[object_id]
AS ( SELECT OBJECT_SCHEMA_NAME([C_Count].[object_id]) + '.'
+ OBJECT_NAME([C_Count].[object_id]) AS [TableName] ,
[C_Count].[Column Count] ,
CONVERT(DECIMAL(4, 2), ( [I_Count].[Index_Count] * 1.0
/ [C_Count].[Column Count]
* 1.0 )) AS [I/C ratio]
INNER JOIN [I_Count]
ON [C_Count].[object_id] = [I_Count].[object_id]
SELECT [m].[TableName] AS [Schema/Table name] ,
[m].[Column Count] AS [Column count] ,
[m].[Index_Count] AS [Index count] ,
[m].[I/C ratio] AS [Index:Column ratio] ,
CASE WHEN [m].[I/C ratio] < 0.1
THEN '1 - Low. Having fewer indexes on a table allows Insert,
Update and Delete performance to remain high but be sure to check for missing
indexes.' -- 1 in 10 indexes to columns
WHEN [m].[I/C ratio] < 0.25
THEN '2 - Moderate. There are plenty of indexes, in relation
to the number of columns, on the table. Consider altering existing indexes rather
than creating new ones if you need to index data to improve query
performance.' -- 1 in 4 indexes to columns
WHEN [m].[I/C ratio] < 0.5
THEN '3 - High. Having lots of indexes will make everything
except Select queries, run more slowly. Consider removing/merging indexes that
are similar or get used very little.' -- 1 in 2 indexes to columns
WHEN [m].[I/C ratio] BETWEEN 0.5 AND 10 -- 1 to 1 indexes to columns
THEN CASE WHEN [column count] < 4
THEN '3 - High. Many people recommend having a
clustered index in every table and then creating a 2nd index is not unrealistic.
You should be concerned though if there are more than
this.' -- in a table with < 3 columns it may not be too big an issue
ELSE '7 - V High. Having a high ratio of indexes
to columns could mean you are slowing down an awful lot of your OLTP transactions
while the indexes and statistics are updated every time the base data is
altered.' -- You have an average of 1 index per two columns.
ELSE '9 - Really ? ! ' -- Really? more than 10 indexes to columns?!?!?
END AS [Concern]
FROM merged AS m
ORDER BY concern DESC ,
[m].[I/C ratio] DESC ,
[m].[Column Count] DESC
DECLARE @sum INT
SELECT @sum = COUNT(*)
-- Summary of index:column ratio stats, by % of tables affected
SELECT [is].[Concern] ,
COUNT([Concern]) [Table count],
COUNT([Concern]) * 1.0 / @sum * 1.0) * 100 AS [Pct_Split]
FROM [#IndexStats] AS [is]
GROUP BY [is].[Concern]
ORDER BY pct_split DESC
-- All index:column details by concern
FROM [#IndexStats] AS [is]
So this query collects data regarding column counts per table and then counts indexes per table and then merges (JOINS) them together to compare the details. When I am querying data like this and want to analyse it in a variety of ways I prefer to insert it into a temporary table. A personal preference.
The first extract from the temporary table is purely a summary of the data – how many tables are there in each category of concern? The last query lists the whole data set.
Running this against my (somewhat battered) copy of AdventureWorks we get the results as below
In the summary section we can see that 21 tables (29% of the tables in the database) have an index count that is more than half of the column count. In the detail in the second grid we can see that one table has 5 indexes on a table with 4 columns.
Now I am not promoting any particular ratio of indexes to columns, all I am suggesting is that keeping an eye on this might give you a pointer that you are adding too many indexes and possibly altering existing ones would be a better solution for the performance of your server. Do you have any thoughts on whether this is a good way, or a worthwhile way to review your indexes?
Feel free to alter the percentages in the TSQL to something that you feel more appropriate.
Usual conditions apply. Don’t trust this code on your production servers without reading and understanding how it works. You will not get any support if this code does bad things.