Wednesday, June 6, 2012

Unused/Duplicate Indexes in SQL Server

One of the most common tasks when maintaining a database still under development is cleanup. 

Cleaning up unused/duplicate indexes is important, say you created a great index in the past for a particularly slow query, two months after that, there was a redesign for that section of the code making your old index obsolete. 

Unused/duplicate indexes waste space, more so, they slow down updates and can even slow down queries if the query engine chooses a very bad index as its datasource, we don't want that in our pristine database.

So now what? in a large database going after every table, every query and every index is sometimes impractical and narrowing the number of indexes you need to verify could be the only way to go.

There are two almost exact ways of doing it, one is with Microsoft SQL Server Management Studio and the other is via query.

In the Management Studio, go to Object Explorer, right click the desired database, Reports, Standard Reports, Index Usage Statistics.

You'll get something similar to this:


So what are all these numbers? What should you look at?

The most important information to you are the number of seeks, scans and updates. The last user seek, scan, lookup times.

The number of user Seeks, Scans and Updates are important to know which index is most/least active, but what if the most active index is also a one deprecated a month ago and nobody noticed it? that's why the dates are there.

Note that Index Usage Statistics only shows indexes that were actually touched by the system, either by updates or by queries.

To find really dead indexes, you should also take a look at Index Operational Statistics, look for indexes without any Range Scans or Singleton Lookups.

In the table below I've added an index but didn't bother to query by it, so the last time time that index was touched was on creation, the table was only touched by queries not requiring it so it doesn't show up in the Index Usage Statistics.

Another very important thing to notice are indexes with excessive scans and no seeks, indexes are there to reduce the load on the database, its cool that the database can use them also by their includes to shrink down the subset of the data to work with, but a lot of the time its an indication there's a rogue query on the loose, catch it before it becomes an issue.


In my opinion the information piece missing most from this report is the size of the index, as much as its bad practice to keep an unused index, for example, if its 1k in size and it doesn't slow down any updates its a waste of your time and resources to even look at it (unless its your task to clean up and not to enhance performance).

Use common sense, if the index is on a table you don't know what it does, check with your team mates who last handled that table before you make ANY decisions, could be a report table for the bookkeeping department, executed once a year and when they need it all of the sudden they are doing over night because the reports didn't came out in time... measure twice, cut once.


Martin Thwaites has a nice article in CodeProject, he wrote a query (I Just added the last write time) that shows you all this info and the size of the index, thanks Martin!


SELECT 
    sch.name + '.' + t.name AS [Table Name], 
    i.name AS [Index Name], 
    i.type_desc,  
    ISNULL(user_updates,0) AS [Total Writes], 
    ISNULL(user_seeks + user_scans + user_lookups,0) AS [Total Reads],
    s.last_user_seek, 
    s.last_user_scan ,
    s.last_user_lookup,
    s.last_user_update,
    ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0) AS [Difference],  
    p.reserved_page_count * 8.0 / 1024 as SpaceInMB
FROM sys.indexes AS i WITH (NOLOCK)  
    LEFT OUTER JOIN sys.dm_db_index_usage_stats AS s    WITH (NOLOCK) ON s.object_id = i.object_id  AND i.index_id = s.index_id  AND s.database_id=db_id()  AND objectproperty(s.object_id,'IsUserTable') = 1  
    INNER JOIN        sys.tables                    AS t    WITH (NOLOCK) ON i.object_id = t.object_id  
    INNER JOIN        sys.schemas                    AS sch    WITH (NOLOCK) ON t.schema_id = sch.schema_id  
    LEFT OUTER JOIN sys.dm_db_partition_stats    AS p    WITH (NOLOCK) ON i.index_id = p.index_id and i.object_id = p.object_id
WHERE (1=1)
    --AND ISNULL(user_updates,0) >= ISNULL((user_seeks + user_scans + user_lookups),0) --shows all indexes including those that have not been used  
    --AND ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0)>0 --only shows those indexes which have been used  
    --AND i.index_id > 1            -- Only non-first indexes (I.E. non-primary key)
    --AND i.is_primary_key<>1        -- Only those that are not defined as a Primary Key)
    --AND i.is_unique_constraint<>1 -- Only those that are not classed as "UniqueConstraints".  
ORDER BY [Table Name], [index name]




No comments:

Post a Comment