Filter Predicate
✅ DO select the appropriate data type
In SQL, each table column has an associated data type. Integers, dates, varchars, boolean, text, and other data types are available. It is critical to select the appropriate data type when building. Numeric numbers should be used, dates should be dates, and so forth. This is critical for indexing purposes.
Take a look at the following example:
SELECT empName, empRole FROM emp WHERE empID = 98422;
The above query retrieves the employee ID and name for the 98422.
What if the empID data type is a string? When employing indexing, you may run into problems because it will take ages to scan what should be a simple document.
✅ DO use EXISTS
to check if record is found
- Commonly, the
COUNT()
function is used to check if a record exists from the database. - This function scans the entire table even if the record is found on the first row which makes it efficient.
- Instead of using
COUNT()
, theEXISTS()
function can be used so when a record is found, the execution stops as well and return the result right away.
✅ DO use EXISTS
instead of IN
whenever possible.
IN
operator is more costly than EXISTS
in terms of scans especially when the result of the subquery is a large dataset. So we should try to use EXISTS rather than using IN for fetching results with a subquery.
Let us see this with an example,
Inefficient
SET
STATISTICS TIME ON
SELECT
ProductNumber,
Name,
Color
FROM
SalesLT.Product
WHERE
ProductID IN (
SELECT
ProductID
FROM
SalesLT.ProductDescription
)
Efficient
SET
STATISTICS TIME ON
SELECT
ProductNumber,
Name,
Color
FROM
SalesLT.Product
WHERE
EXISTS (
SELECT
ProductID
FROM
SalesLT.ProductDescription
)
We have executed the same query having a subquery with IN command and EXISTS commands and we observe that the EXISTS command takes half of the time as compared to IN command and the number of physical and logical scans is very low.
✅ DO use WHERE
Instead of HAVING
HAVING
statements are all calculated after theWHERE
statements. Therefore, it is quicker to execute theWHERE
query.- If the goal of the
HAVING
statement is just to filter result, it is recommended to use theWHERE
statement instead for higher efficiency. - For instance, if you wish to see a date range, the
HAVING
statement will retrieve all records then filter them whileWHERE
statement instantly limits the number of records pull based on the condition.
Inefficient
SELECT
c.ID,
c.CompanyName,
b.CreatedDate
FROM
Business b
JOIN Company c ON b.CompanyID = c.ID
GROUP BY
c.ID,
c.CompanyName,
b.CreatedDate
HAVING
b.CreatedDate BETWEEN ‘2020 - 01 - 01’
AND ‘2020 - 12 - 31’
Efficient
SELECT
c.ID,
c.CompanyName,
b.CreatedDate
FROM
Business b
JOIN Company c ON b.CompanyID = c.ID
WHERE
b.CreatedDate BETWEEN ‘2020 - 01 - 01’
AND ‘2020 - 12 - 31’
GROUP BY
c.ID,
c.CompanyName,
b.CreatedDate
✅ DO use wildcards at the end of a phrase only
- Wildcards create a widest search possible which is also considered as the most inefficient type of search.
- When using a lot of wildcards to search for a string, the database is tasked to search records for a match anywhere.
- As much as possible, try to limit using wildcard searching.
Wildcards serve as a placeholder for words and phrases and can be added at the beginning/end of them. To make data retrieval faster and more efficient, you can use wildcards in the SELECT statement at the end of a phrase. For example:
SELECT
p.BusinessEntityID,
p.FirstName,
p.LastName,
p.Title
FROM
Person.Person p
WHERE
p.FirstName LIKE 'And%';
As a result, the query will retrieve a list of customers whose First Name matches the specified condition, i.e. their First Name starts with ‘And’.
However, you might encounter situations where you regularly need to search by the last symbols of a word, number, or phrase—for example, by the last digits of a telephone number. In this case, we recommend creating a persisted computed column and running the REVERSE() function on it for easier back-searching.
CREATE TABLE dbo.Customer (
id INT IDENTITY PRIMARY KEY,
CardNo VARCHAR(128),
ReversedCardNo AS REVERSE(CardNo) PERSISTED
)
GO
CREATE INDEX ByReversedCardNo ON dbo.Customer (ReversedCardNo)
GO
CREATE INDEX ByCardNo ON dbo.Customer (CardNo)
GO
INSERT INTO dbo.Customer (CardNo)
SELECT
NEWID()
FROM
master.dbo.spt_values sv
--searching for CardNo that end in 510c
SELECT
id,
CardNo
FROM
dbo.Customer
WHERE
CardNo LIKE '%510c'
SELECT
id,
CardNo
FROM
dbo.Customer
WHERE
ReversedCardNo LIKE REVERSE('%510c')
❌ AVOID using multiple OR in the FILTER predicate
When you need to combine two or more conditions, it is recommended to eliminate the usage of the OR operator or split the query into parts separating search expressions. SQL Server can not process OR within one operation. Instead, it evaluates each component of the OR which, in turn, may lead to poor performance.
Let’s consider the following query.
SELECT Name, login FROM USER WHERE Name = @P OR login = @P;
If we split this query into two SELECT
queries and combine them by using the UNION
operator, SQL Server will be able to make use of the indexes, and the query will be optimized.
SELECT Name, login FROM USER WHERE Name = @P
UNION
SELECT Name, login FROM USER WHERE login = @P;
Learn More
A Detailed Guide on SQL Query Optimization
SQL Query Optimization: How to Tune Performance of SQL Queries
Top 10 SQL Query Optimization Techniques