Optimizations with Indexing
✅ DO implement proper indexing
An index is a data structure that speeds up data retrieval from a database tables.
Indexes are also used to define primary keys with unique indexes to avoid duplicate key values.
Indexes occupy disk space. The more indexes you have, the greater the space used on the disk. In SQL Server, a clustered index requires no additional disk space, but any non-clustered index needs additional disk space as it is stored separately from the list.
Using proper indexing allows for very fast SELECT and ORDER BY operations.
Indexing may slow down INSERT, UPDATE, and DELETE operations.
When inserting a batch of a million records, it may be best to drop the index and then re-create it afterwards.
This diagram outlines the logic flow of adding indexes to a table.
An indexing operation can be a scan or a seek. An index scan is traversing the entire index for matching criteria whereas index seek is filtering rows on a matching filter.
For example,
SELECT
p.Name,
Color,
ListPrice
FROM
SalesLT.Product p
INNER JOIN SalesLT.ProductCategory pc ON P.ProductCategoryID = pc.ProductCategoryID
INNER JOIN SalesLT.SalesOrderDetail sod ON p.ProductID = sod.ProductID
WHERE
p.ProductID > 1
In the above query, we can see that a total of 99% of the query execution time goes in index seek operation. Therefore, it is an important part of the optimization process.
Guidelines for choosing index:
- Indexes should be made on keys that frequently occur in WHERE clause and join statements.
- Indexes should not be made on columns that are frequently modified i.e UPDATE command is applied on these columns frequently.
- Indexes should be made on Foreign keys where INSERT, UPDATE, and DELETE are concurrently performed. This allows UPDATES on the master table without shared locking on the weak entity.
- Indexes should be made on attributes that occur together commonly in WHERE clause using AND operator.
- Indexes should be made on ordering key values.
✅ DO check missing indexes
When we run a SQL query and look for the actual execution plan in SQL Server Management Studio (SSMS) , we occasionally get a suggestion for an index that could help us enhance our SQL query.
You can also inspect the details of missing indexes in your environment using the dynamic management views.
- sys.dm_db_missing_index_details
- sys.dm_db_missing_index_group_stats
- sys.dm_db_missing_index_groups
- sys.dm_db_missing_index_columns
DBAs usually design indexes based on SSMS’s recommendations. For the time being, it may help SQL query optimization. However, you should not make the index solely based on those suggestions. It may have an impact on the performance of other queries and slow down your INSERT and UPDATE operations.
Execution Plans in SQL Server
The Execution Plan tool in SQL Server can be useful for creating indexes.
Its main function is to graphically display the data retrieval methods chosen by the SQL Server query optimizer.
To retrieve the execution plan (in SQL Server Management Studio), just click “Include Actual Execution Plan” (Ctrl+M) before running your query.
Afterwards, a third tab named “Execution Plan” will appear. You might see a detected missing index. To create it, just right click in the execution plan and choose the “Missing Index Details…”.
✅ DO check for unused indexes
You may encounter a situation where indexes exist but are not being used. One of the reasons for that might be implicit data type conversion. Let’s consider the following query:
SELECT Column1, Column2 FROM TestTable WHERE IntColumn = '1';
When executing this query, SQL Server will perform implicit data type conversion, i.e. convert int data to varchar and run the comparison only after that. In this case, indexes won’t be used. How can you avoid this? We recommend using the CAST()
function that converts a value of any type into a specified datatype. Look at the query below.
SELECT Column1, Column2 FROM TestTable WHERE IntColumn = CAST(@char AS INT);
Let’s study one more example.
SELECT Column1, Column2 FROM TestTable WHERE DATEPART(YEAR, SomeMyDate) = '2021';
In this case, implicit data type conversion will take place too, and the indexes won’t be used. To avoid this, we can optimize the query in the following way:
SELECT Column1, Column2 FROM TestTable WHERE SomeDate >= '20210101' AND SomeDate < '20220101'
Filtered indexes can affect performance too. Suppose, we have an index on the Customer table.
CREATE UNIQUE NONCLUSTERED INDEX IX ON Customer (MembershipCode)
WHERE MembershipCode IS NOT NULL;
The index won’t work for the following query:
SELECT Name, Address FROM Customer WHERE MembershipCode = '258410';
To get use of the index, you’ll need to optimize the query in the following way:
SELECT Name, Address FROM Customer
WHERE MembershipCode = '258410' AND MembershipCode IS NOT NULL;
Note that if your tables are constantly hammered by INSERT
, UPDATE
, and DELETE
, you should be careful when indexing—you could end up decreasing performance
as all indexes need to be modified after these operations.
Learn More
A Detailed Guide on SQL Query Optimization
SQL Query Optimization: How to Tune Performance of SQL Queries