Selection
✅ DO use SELECT
fields, rather than using SELECT *
Instead of SELECT * use SELECT column_1, column_2, ….
SELECT *
is highly inefficient as it scans the entire database.SELECT *
may return unused columns which negatively affects bandwidth and database resources.- When table changes are made,
SELECT *
may automatically return columns that are undesirable.
Inefficient
SET STATISTICS TIME ON
SELECT * FROM SalesLT.Product
Efficient
SET STATISTICS TIME ON
SELECT ProductNumber, Name, Color,Weight FROM SalesLT.Product
As we can see from the above two outputs, the time is reduced to one-fourth when we use the SELECT
statement for selecting only those columns that are required.
✅ DO retrieve limited and accurate results
The lesser data is retrieved, the faster the query will execute.
Rather than applying many filters or conditions to the actual query, you can do this on the server or application side first which will limit the number or results the database has to retrieve.
You can also use the TOP statement to limit the number of results to return.
Having pagination as well on the application can help increase the database performance.
TOP
Command
The TOP
command is used to control the number of rows to be displayed from the result set. The result set needs to display only those rows that are required. Therefore, one must use limit with the production dataset and provide an on-demand computation of rows for the production purpose.
SET STATISTICS IO ON
SELECT TOP 10 Name, Color, ListPrice
FROM SalesLT.Product
The above query prints the top 10 rows of the resultset. This drastically improves the performance of the query.
✅ DO fully qualify database object names
The use of fully qualified database object names is intended to eliminate ambiguity. The following is an example of a fully qualified object name:
DATABASE.SCHEMA.OBJECTNAME
It’s critical to declare what you want to access when you have access to numerous databases, schemas, and tables. This isn’t required until you’re working with huge databases with various users and schemas, but it’s a good practice to follow.
As a result, rather of utilizing a statement like:
SELECT Column FROM Employee
Use these words:
SELECT column FROM dbo.Employee
❌ AVOID using SELECT DISTINCT
The DISTINCT
keyword removes duplicates from a query.
To ensure there are no duplicates requires extensive processing power.
You can select more fields from the database table to create more unique results instead of using DISTINCT
.
Inefficient
SELECT DISTINCT FirstName, LastName, Country
FROM Customer
This query will group duplicate first name, last name, and country customers causing inaccurate results.
Imagine having John Smith or Jane Smith multiple times with exactly the same country origin but different customers.
For larger sets of data, grouping these duplicates will cause the query to be executed slowly.
Efficient
SELECT FirstName, LastName, City, Country
FROM Customer
By omitting the DISTINCT
and adding more fields to compare, the records will be easily identified as duplicated or not. The database will not group the records as well so same names and address won’t be an issue.
❌ AVOID using correlated subqueries
- Correlated subqueries is dependent on the parent or outer query.
- Since this type of query executes row by row, it decreases overall database performance.
Inefficient
SELECT
b.Name,
b.Phone,
b.Address,
b.Zip,
(
SELECT
CompanyName
FROM
Company
WHERE
ID = b.CompanyID
) AS CompanyName
FROM
Business b
The query above may have performance issues as it will run the inner query for each Business
row. This can easily be solved by using a JOIN
instead.
Efficient
SELECT
b.Name,
b.Phone,
b.Address,
b.Zip,
c.CompanyName
FROM
Business b
Join Company c ON b.CompanyID = c.ID
Learn More
A Detailed Guide on SQL Query Optimization
SQL Query Optimization: How to Tune Performance of SQL Queries
Top 10 SQL Query Optimization Techniques