Author: Austin Dolezal

Many businesses are wasting time and money cleaning up their data when small fixes and tweaks could optimize their processing time and storage space. Database indexes and partitions are often created and forgotten about until there is an issue or results are returning slower than expected. In this blog post, we are going to cover a quick summary of what indexes and partitions are in SQL Server, when they are most beneficial, and also go over a few scripts that can be used to identify specific problem areas and correct only those areas that need to be addressed.
What are Indexes and Partitions?
In SQL Server, indexes are partitions are a key part of maintaining your data and increasing the speed that data is accessed. An index is a database object configured on a table that acts as a reference point for the table. Querying a table with an index allows the query to receive directions on where to find particular data, increasing the speed the execution finds the specific information it is searching for. If a table does not have an index, the query instead is forced to do a full table scan to find any filters that are being applied in the WHERE clause.
The two main types of indexes are clustered and non-clustered indexes. Both have their own pros and cons, and serve particular purposes in an environment.
Clustered indexes sort the table itself based on the specified index value. This allows the system to save some storage space compared to non-clustered indexes, but it can be slower in some queries than non-clustered indexes if you are filtering based on values outside of the Primary Key. Only one clustered index can exist per table. It does allow for faster querying if you are pulling ranges of data based on the Primary Key. A clustered index is created automatically when a Primary Key is created for a table.
Non-clustered indexes are stored separately from the data table based on the columns that are specified at creation and then create a reference to the row in the actual table. This does lead to more storage space being used since another object is saved, but it can lead to increased performance on a table that is frequently being queried against the specified columns of the non-clustered index. Multiple non-clustered indexes can be created on a table allowing more flexibility and performance based on user needs.
Partitions are used to further break down indexes into smaller, more manageable chunks. These are particularly helpful for tables that contain a larger volume of data. For example, a partition can be created for a non-clustered index on a date column to divide the data by year. As new data is inserted in the table and the indexes are rebuilt, data will be split into their designated partitions within the index based on the date. These partitions being defined allow queries to further narrow down the search for data during execution
Fragmentation of Partitions
One problem that is frequently seen with partitioned tables and indexes is that they become fragmented. Fragmentation occurs when the data inside of a partition that was once organized based on the definition of the index, becomes disorganized due to data inserts or changes. Inserting and altering data does not cause indexes and partitions to be automatically rebuilt. Rebuilding an index is a simple process. This can be done by executing the SQL below:
ALTER INDEX [index_name] ON [database_name].[schema_name].[object_name]
By running the line above, it will rebuild all partitions attached to that index. While this does address the fragmented partitions, it is also spending time processing other partitions that might not be fragemented much. This leads to additional execution time and resource usage going over all partitions instead of targeting the specific partitions that are causing issues. The ALTER INDEX statement does contain an optional parameter that allows users to target specific partitions:
ALTER INDEX [index_name] ON [database_name].[schema_name].[object_name] REBUILD PARTITION = [Partition_ID]
Rebuilding the index based on a particular index reduces your execution time and resource usage by allowing you only run the partitions that are most fragmented.
Finding Fragmented Partitions
At this point, you are probably wondering how you can identify which partitions are the problem. For this, we can use the script below to find the fragmentation of all indexes created for a database object.
SELECT SELECT DISTINCT ips.object_id
,s.[name] AS SchemaName
,o.[name] AS ObjectName
,i.[name] AS IndexName
,ips.index_id
,ips.partition_number
,index_type_desc
,avg_fragmentation_in_percent
,avg_page_space_used_in_percent
,page_count
,ips.record_count
FROM sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,’SAMPLED’) ips
INNER JOIN sys.indexes i WITH(NOLOCK) ON ips.object_id = i.object_id AND ips.index_id = i.index_id
INNER JOIN sys.objects o WITH(NOLOCK) ON o.object_id = i.object_id
INNER JOIN sys.schemas s WITH(NOLOCK) ON s.schema_id = o.schema_id
INNER JOIN (SELECT Object_ID,SUM(row_count) TotalRowCount
FROM sys.dm_db_partition_stats
GROUP BY Object_id
HAVING SUM(row_count) > 10000000
) sp ON o.Object_ID = sp.Object_ID
WHERE ips.avg_fragmentation_in_percent > 60
AND o.Type = ‘U’
This script is going to comb through the index statistics and show indexes that are over the assigned fragmentation percent, which is set in the WHERE clause. This number can be changed to the value that is deemed ‘acceptable’ fragmentation. While the number can be reduced to identify a larger number of fragmented partitions, be aware that this will increase the number of objects that get rebuilt and thus increase the amount of time and resources the system will take up.
The join with sys.dm_db_partition_stats adds the additional filter of insuring that you are only looking through tables that contain a minimum number of records. This further helps to reduce the number of objects that are being looked at by the query. The value inserted above is set to 10 million to ensure that the size of the partition is significant enough to provide the best results.
Rebuilding Fragmented Partitions
In the below SQL script, the fragmented partitions that were previously identified will be rebuilt. The script utilizes the temporary table being populated in the dynamic SQL statement to build individual ALTER statements targeting a particular partition. A WHILE loop is used to loop through the table in order to create the ALTER statements before they are then executed sequentially.
The top section labeled “Configuration Parameters” is the only area where changes are needed. It defines the database name that should be checked, the fragmentation minimum of what needs rebuilt, and there is a debug setting to prints a list of partitions that would be updated and their associated ALTER statements.
Indexes and index partitions are powerful tools to increase performance of SQL Server. When properly maintained, execution times are reduced allowing users to get to their data faster. However, when not maintained, those same tools can lead to headaches and poor performance. While many users know to rebuild indexes and setup regular jobs to rebuild database objects on a schedule, they might not be doing everything they can to do so efficiently. Instead of simply rebuilding larger indexes and all of their partitions, try to limit the amount of work your system is doing to only those partitions that need to be rebuilt. The script provided will simplify your workload and ensure that problem partitions are being corrected.
