Filter Predicate

DO select the appropriate data type

In SQL, each table column has an associated data type. Integers, dates, varchars, boolean, text, and other data types are available. It is critical to select the appropriate data type when building. Numeric numbers should be used, dates should be dates, and so forth. This is critical for indexing purposes.

Take a look at the following example:

SELECT empName, empRole FROM emp WHERE empID = 98422;

The above query retrieves the employee ID and name for the 98422.

What if the empID data type is a string? When employing indexing, you may run into problems because it will take ages to scan what should be a simple document.

DO use EXISTS to check if record is found

  • Commonly, the COUNT() function is used to check if a record exists from the database.
  • This function scans the entire table even if the record is found on the first row which makes it efficient.
  • Instead of using COUNT(), the EXISTS() function can be used so when a record is found, the execution stops as well and return the result right away.

DO use EXISTS instead of IN whenever possible.

IN operator is more costly than EXISTS in terms of scans especially when the result of the subquery is a large dataset. So we should try to use EXISTS rather than using IN for fetching results with a subquery.

Let us see this with an example,

Inefficient

SET 
  STATISTICS TIME ON 
  
SELECT 
  ProductNumber, 
  Name, 
  Color 
FROM 
  SalesLT.Product 
WHERE 
  ProductID IN (
    SELECT 
      ProductID 
    FROM 
      SalesLT.ProductDescription
  )

Efficient

SET 
  STATISTICS TIME ON 
SELECT 
  ProductNumber, 
  Name, 
  Color 
FROM 
  SalesLT.Product 
WHERE 
  EXISTS (
    SELECT 
      ProductID 
    FROM 
      SalesLT.ProductDescription
  )

We have executed the same query having a subquery with IN command and EXISTS commands and we observe that the EXISTS command takes half of the time as compared to IN command and the number of physical and logical scans is very low.

DO use WHERE Instead of HAVING

  • HAVING statements are all calculated after the WHERE statements. Therefore, it is quicker to execute the WHERE query.
  • If the goal of the HAVING statement is just to filter result, it is recommended to use the WHERE statement instead for higher efficiency.
  • For instance, if you wish to see a date range, the HAVING statement will retrieve all records then filter them while WHERE statement instantly limits the number of records pull based on the condition.

Inefficient

SELECT 
  c.ID, 
  c.CompanyName, 
  b.CreatedDate 
FROM 
  Business b 
  JOIN Company c ON b.CompanyID = c.ID 
GROUP BY 
  c.ID, 
  c.CompanyName, 
  b.CreatedDate 
HAVING 
  b.CreatedDate BETWEEN 2020 - 01 - 01 
  AND 2020 - 12 - 31

Efficient

SELECT 
  c.ID, 
  c.CompanyName, 
  b.CreatedDate 
FROM 
  Business b 
  JOIN Company c ON b.CompanyID = c.ID 
WHERE 
  b.CreatedDate BETWEEN 2020 - 01 - 01 
  AND 2020 - 12 - 31 
GROUP BY 
  c.ID, 
  c.CompanyName, 
  b.CreatedDate

DO use wildcards at the end of a phrase only

  • Wildcards create a widest search possible which is also considered as the most inefficient type of search.
  • When using a lot of wildcards to search for a string, the database is tasked to search records for a match anywhere.
  • As much as possible, try to limit using wildcard searching.

Wildcards serve as a placeholder for words and phrases and can be added at the beginning/end of them. To make data retrieval faster and more efficient, you can use wildcards in the SELECT statement at the end of a phrase. For example:

SELECT 
  p.BusinessEntityID, 
  p.FirstName, 
  p.LastName, 
  p.Title 
FROM 
  Person.Person p 
WHERE 
  p.FirstName LIKE 'And%';

As a result, the query will retrieve a list of customers whose First Name matches the specified condition, i.e. their First Name starts with ‘And’.

However, you might encounter situations where you regularly need to search by the last symbols of a word, number, or phrase—for example, by the last digits of a telephone number. In this case, we recommend creating a persisted computed column and running the REVERSE() function on it for easier back-searching.

CREATE TABLE dbo.Customer (
  id INT IDENTITY PRIMARY KEY, 
  CardNo VARCHAR(128), 
  ReversedCardNo AS REVERSE(CardNo) PERSISTED
) 
GO 

CREATE INDEX ByReversedCardNo ON dbo.Customer (ReversedCardNo) 
GO 

CREATE INDEX ByCardNo ON dbo.Customer (CardNo) 
GO 

INSERT INTO dbo.Customer (CardNo) 
SELECT 
  NEWID() 
FROM 
  master.dbo.spt_values sv 
  
--searching for CardNo that end in 510c  
SELECT 
  id,
  CardNo
FROM 
  dbo.Customer 
WHERE 
  CardNo LIKE '%510c' 
  
SELECT 
  id,
  CardNo
FROM 
  dbo.Customer 
WHERE 
  ReversedCardNo LIKE REVERSE('%510c')

AVOID using multiple OR in the FILTER predicate

When you need to combine two or more conditions, it is recommended to eliminate the usage of the OR operator or split the query into parts separating search expressions. SQL Server can not process OR within one operation. Instead, it evaluates each component of the OR which, in turn, may lead to poor performance.

Let’s consider the following query.

SELECT Name, login FROM USER WHERE Name = @P OR login = @P;

If we split this query into two SELECT queries and combine them by using the UNION operator, SQL Server will be able to make use of the indexes, and the query will be optimized.

SELECT Name, login FROM USER WHERE Name = @P
UNION
SELECT Name, login FROM USER WHERE login = @P;

Learn More

A Detailed Guide on SQL Query Optimization

SQL Query Optimization: How to Tune Performance of SQL Queries

SQL Query Optimization

Top 10 SQL Query Optimization Techniques

11 Tips to Boost SQL Query Optimization

SQL Database Performance Tuning for Developers