LINQ

  • DO pull only the needed columns.

    When working with LINQ, only pull the needed columns in the Select clause instead of loading all the columns in the table.

    Consider the following LINQ query.

    using(var context = new LINQEntities()) {
      var fileCollection = context.FileRepository.Where(a => a.IsDeleted == false).ToList();
    }
    

    This query will be compiled into SQL that selects every column in the table.

    Though we might need only a few columns, we have loaded them all. This consumes more memory than necessary and slows down query execution. We can change the query as follows for better performance.

    using (var context = new LINQEntities()) {
      var fileCollection = context.FileRepository
        .Where(a => a.IsDeleted == false)
        .Select(a => new {
          FilePath = a.FilePath
        }).ToList();
    }
    

    This query will be compiled into an optimized SQL statement that selects only the FilePath column.

  • DO filter the data.

    Using the Where clause, you can filter your data before projecting it, reducing the amount of data that needs to be retrieved. This can further improve query performance.
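    As a sketch, assuming the same FileRepository table used above (the "A" revision value is illustrative):

```csharp
using (var context = new LINQEntities()) {
  // Filter first, then project: only the matching rows and the
  // selected column travel back from the database.
  var paths = context.FileRepository
    .Where(a => a.IsDeleted == false && a.FileRevision == "A")
    .Select(a => new { a.FilePath })
    .ToList();
}
```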

  • DO use Join instead of writing multiple queries. 

    When you join tables with LINQ, you can retrieve all the required data in a single query instead of making multiple trips to the database. This can improve query performance and reduce the number of queries executed.
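    As a minimal sketch (the Folders table, its columns, and the FolderId foreign key are hypothetical):

```csharp
using (var context = new LINQEntities()) {
  // One round trip instead of a FileRepository query followed by
  // a second query per folder.
  var files = (from f in context.FileRepository
               join fo in context.Folders on f.FolderId equals fo.Id
               select new { f.FilePath, fo.FolderName }).ToList();
}
```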

  • DO use IQueryable and Skip/Take.

    When working with a large collection of data and binding it to a table or grid control, we should not load all the records at once, because doing so takes a long time.

    Instead, we can load a certain number of records initially, say 10 or 20. When the user wants to see the next set of records, we can load the next 10 or 20 records on demand.

    The IQueryable type in C# holds an unevaluated query. The query is translated to SQL and run against the database only when the results are enumerated, after Skip and Take have been applied.

    private IEnumerable<object> LoadAllFiles(int skip, int take, string fileRevision, string fileNumber) {
      using (var context = new LINQEntities()) {
        //Select the needed table and build an IQueryable collection
        IQueryable<FileRepository> fileCollection = context.FileRepository;

        //Build queries dynamically over the IQueryable collection
        if (!string.IsNullOrEmpty(fileRevision))
          fileCollection = fileCollection.Where(a => a.FileRevision == fileRevision && a.IsDeleted == false);

        if (!string.IsNullOrEmpty(fileNumber))
          fileCollection = fileCollection.Where(a => a.FileNumber == fileNumber && a.IsDeleted == false);

        //Apply skip and take and load records
        return fileCollection.OrderBy(a => a.Id).Skip(() => skip).Take(() => take).Select(a => new {
          FileIssuedBy = a.FileIssuedBy
        }).ToList();
      }
    }


    SQL

    exec sp_executesql N'SELECT
    [Project1].[C1] AS [C1],
    [Project1].[FileIssuedBy] AS [FileIssuedBy]
    FROM (SELECT
         [Extent1].[Id] AS [Id],
         [Extent1].[FileIssuedBy] AS [FileIssuedBy],
         1 AS [C1]
         FROM [dbo].[FileRepository] AS [Extent1]
         WHERE ([Extent1].[FileRevision] = @p_linq_0) AND (0 = [Extent1].[IsDeleted])
         AND ([Extent1].[FileNumber] = @p_linq_1) AND (0 = [Extent1].[IsDeleted])) AS [Project1]
         ORDER BY row_number() OVER (ORDER BY [Project1].[Id] ASC)
         OFFSET @p_linq_2 ROWS FETCH NEXT @p_linq_3 ROWS ONLY',
    N'@p_linq_0 nvarchar(4000),@p_linq_1 nvarchar(4000),@p_linq_2 int,@p_linq_3 int',
    @p_linq_0=N'A',@p_linq_1=N'A',@p_linq_2=0,@p_linq_3=10
    

Tips: When writing skip and take in LINQ queries, please consider the following for better performance:

AVOID fileCollection.OrderBy(a => a.Id).Skip(skip).Take(take); the generated SQL will embed hard-coded values, so the query plan will not be cached.

USE

fileCollection.OrderBy(a => a.Id).Skip(() => skip).Take(() => take); the generated SQL will declare parameters to hold the values. The parameters are used during query execution, and the query plan will be cached.

  • DO look for parameter mismatch.

    Data types may not match when querying, and this mismatch often costs significant time in LINQ to Entities. Consider a scenario where a table has a FileNumber column that holds up to 10 characters, so it is declared with the varchar(10) data type.

    We need to load the records whose FileNumber value matches "File1".

    using(var context = new LINQEntities()) {
      string fileNumber = "File1";
      var fileCollection = context.FileRepository.Where(a => a.FileNumber == fileNumber).ToList();
    }
    

    In the generated SQL, the variable we passed is declared as nvarchar(4000), whereas the table column type is varchar(10). Because of this parameter type mismatch, SQL Server performs an internal conversion, which adds overhead and can prevent an index on the column from being used.

    To overcome this parameter mismatch, we need to mention the type of column with the property name, as in the following code.

    public string FilePath { get; set; }
    
    [Column(TypeName = "varchar")]
    public string FileNumber { get; set; }
    

    Now the SQL parameter type will be generated as varchar.

  • DO use AddRange for bulk data insert.

    Another situation to consider is when working with bulk data inserts, such as adding hundreds or thousands of records to a SQL table.

    using (var context = new LINQEntities()) {
      for (int i = 1; i <= 1000; i++) {
        var file = new FileRepository {
          FilePath = i.ToString(), FileDescription = i.ToString()
        };
        context.FileRepository.Add(file);
      }
      context.SaveChanges();
    }
    

    Each time we add a new entity to FileRepository in the previous code sample, DetectChanges() is triggered, and execution becomes slower as the number of tracked entities grows.

    To overcome this, use AddRange, which best suits bulk inserts. AddRange was introduced in EF 6 and detects changes only once for the whole collection, reducing the performance overhead. Take a look at the following modified code.

    using (var context = new LINQEntities()) {
      var fileCollection = new List<FileRepository>();
      for (int i = 1; i <= 1000; i++) {
        var file = new FileRepository {
          FilePath = i.ToString(), FileDescription = i.ToString()
        };
        fileCollection.Add(file);
      }
      context.FileRepository.AddRange(fileCollection);
      context.SaveChanges();
    }
    
  • DO bulk processing in chunks.

    When querying and updating, perform bulk updates in smaller chunks.

    A single large bulk insert/update can take much longer than a series of smaller batches.
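    A minimal sketch of chunked saving, reusing the bulk-insert scenario from the previous section (the chunk size of 100 is an illustrative choice):

```csharp
using (var context = new LINQEntities()) {
  const int chunkSize = 100;
  for (int i = 1; i <= 1000; i++) {
    context.FileRepository.Add(new FileRepository { FilePath = i.ToString() });
    // Flush to the database every 100 records instead of all at once.
    if (i % chunkSize == 0) {
      context.SaveChanges();
    }
  }
  context.SaveChanges(); // save any remainder
}
```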

    • Sometimes raw SQL is good. Avoid it as much as possible, because you lose useful features such as IntelliSense, and if you later migrate the model (for example, by adding a new property) a raw query will not be updated along with it.

      Still, I found it useful when updating or deleting, if you must update all the records.

      To do this in EF Core we would otherwise need to fetch all the rows, update or delete them, and then save the changes, which is not efficient; use raw SQL in this case.
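      A sketch of a raw-SQL bulk update, assuming EF Core 3 or later (ExecuteSqlRaw lives on context.Database; the table and revision value are illustrative):

```csharp
using (var context = new LINQEntities()) {
  // One round trip; no entities are materialized or tracked.
  int rows = context.Database.ExecuteSqlRaw(
    "UPDATE dbo.FileRepository SET IsDeleted = 1 WHERE FileRevision = {0}", "A");
}
```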

  • DO check the SQL query submitted to the database.

    Checking the SQL query submitted to the database is the most important thing you can do when trying to improve the performance of a LINQ-to-Entities query. We all know that a LINQ-to-Entities query is converted to a SQL query and executed against the database, so how efficient that generated SQL is determines how well the query performs.

    Let’s see an example. Consider the query that follows, which uses navigational LINQ between Room, RoomProducts, and Brands.

    Assume the following:

    • The Room table will hold a list of rooms for a hotel.
    • The RoomProducts table will hold a list of products in Room, and refers to Room.Id as a foreign key.
    • The Brands table will hold the RoomProducts brands and will refer to RoomProducts.Id as a foreign key.
    • The records in all three tables will definitely have a relational record in them.

    Let me write a LINQ query that maps all tables with join and pulls the matching records.

    using (var context = new LINQEntities()) {
      var roomCollection = context.Rooms
        .Include(x => x.RoomProducts)
        .Select(x => x.RoomProducts.Select(a => a.Brands))
        .ToList();
    }
    

    We are getting Room, RoomProducts, and Brands collections. The query’s equivalent SQL is presented in the following code.

    SELECT
      [Project1].[Id] AS [Id],
      [Project1].[C2] AS [C1],
      [Project1].[Id1] AS [Id1],
      [Project1].[C1] AS [C2],
      [Project1].[Id2] AS [Id2],
      [Project1].[Brand] AS [Brand],
      [Project1].[RoomProductsParentId] AS [RoomProductsParentId],
      [Project1].[IsDeleted] AS [IsDeleted],
      [Project1].[ModifiedDate] AS [ModifiedDate],
      [Project1].[ModifiedBy] AS [ModifiedBy]
    FROM
      (
        SELECT
          [Extent1].[Id] AS [Id],
          [Join1].[Id1] AS [Id1],
          [Join1].[Id2] AS [Id2],
          [Join1].[Brand] AS [Brand],
          [Join1].[RoomProductsParentId] AS [RoomProductsParentId],
          [Join1].[IsDeleted1] AS [IsDeleted],
          [Join1].[ModifiedDate1] AS [ModifiedDate],
          [Join1].[ModifiedBy1] AS [ModifiedBy],
          CASE WHEN ([Join1].[Id1] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[Id2] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1],
          CASE WHEN ([Join1].[Id1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
        FROM
          [dbo].[Rooms] AS [Extent1]
          LEFT OUTER JOIN (
            SELECT
              [Extent2].[Id] AS [Id1],
              [Extent2].[RoomParentId] AS [RoomParentId],
              [Extent3].[Id] AS [Id2],
              [Extent3].[Brand] AS [Brand],
              [Extent3].[RoomProductsParentId] AS [RoomProductsParentId],
              [Extent3].[IsDeleted] AS [IsDeleted1],
              [Extent3].[ModifiedDate] AS [ModifiedDate1],
              [Extent3].[ModifiedBy] AS [ModifiedBy1]
            FROM
              [dbo].[RoomProducts] AS [Extent2]
              LEFT OUTER JOIN [dbo].[Brands] AS [Extent3] ON [Extent2].[Id] = [Extent3].[RoomProductsParentId]
          ) AS [Join1] ON [Extent1].[Id] = [Join1].[RoomParentId]
      ) AS [Project1]
    ORDER BY
      [Project1].[Id] ASC,
      [Project1].[C2] ASC,
      [Project1].[Id1] ASC,
      [Project1].[C1] ASC
    

    We can see that the query generated as part of the navigation property uses a left-outer join for establishing a relation between the tables and loads all columns. The queries formed by the left-outer join run slower when compared with inner-join queries.

    Since we know that all three tables will definitely have a relation in them, let us modify this query slightly.

    using(var context = new LINQEntities()) {
      var roomCollection = (from room in context.Rooms
                          join products in context.RoomProducts on room.Id equals products.RoomParentId
                          join brands in context.Brands on products.Id equals brands.RoomProductsParentId
                          select new {
                            Room = room.Room,
                              Product = products.RoomProduct,
                              Brand = brands.Brand
                          }).ToList();
    }
    

    This produces the following:

    SELECT
      [Extent1].[Id] AS [Id],
      [Extent1].[Room] AS [Room],
      [Extent2].[RoomProduct] AS [RoomProduct],
      [Extent3].[Brand] AS [Brand]
    FROM
      [dbo].[Rooms] AS [Extent1]
      INNER JOIN [dbo].[RoomProducts] AS [Extent2] ON [Extent1].[Id] = [Extent2].[RoomParentId]
      INNER JOIN [dbo].[Brands] AS [Extent3] ON [Extent2].[Id] = [Extent3].[RoomProductsParentId]
    

    Now the code looks cleaner and executes faster than our previous attempt.

  • DO use AsNoTracking().

    When we load records from a database via LINQ-to-Entities queries, we often process them and then update them back to the database. For this purpose, Entity Framework tracks the entities.

    When we are performing only read operations, we won't make any updates back to the database, but Entity Framework will still assume that we might and will track the entities accordingly. We can use AsNoTracking() to turn this tracking off, reducing the memory and processing overhead.

    using(var context = new LINQEntities()) {
      var fileCollection = context.FileRepository.AsNoTracking().Where(a => a.IsDeleted == false).
      Select(a => new {
        FilePath = a.FilePath
      }).ToList();
    }
    
  • DO turn ObjectTrackingEnabled property off if not required.

    Note that if don’t need to change data but just read it, it is always advisable to turn this property off so as to turn off the unnecessary identity management of objects. The ObjectTrackingEnabled property is set to true by default. This implies that LINQ to SQL would keep track of every change that you make to your data so that it can remember those changes when you need to persist those changes to the underlying database at a later point in time. This will help boost the application’s performance to a considerable extent. The following code snippet illustrates how you can turn this property off (set it to false):

    using(TestDataContext dataContext = new TestDataContext()) {
      dataContext.ObjectTrackingEnabled = false;
    }
    
  • DO turn optimistic concurrency off if it is not needed.

    Concurrency handling enables you to detect and resolve conflicts that arise from concurrent requests to the same resource. There are two types of concurrency control, optimistic and pessimistic, and LINQ to SQL follows an optimistic concurrency model by default. Each column's UpdateCheck is set to Always by default, which means that whenever an update is submitted, LINQ to SQL checks the value contained in that property against the value stored in the column of the database table that is mapped to that property.

    You can use the UpdateCheck property to turn this check off. The following code snippet illustrates how you can set this property in the attribute of your entity class:

    [Column(Storage = "_Address", DbType = "NText",
      UpdateCheck = UpdateCheck.Never)]
    
  • DO use Compiled Queries judiciously

    You can take advantage of Compiled Query to boost query performance in your application. But, remember that compiled query could be costly when used for the first time. So, do ensure you use compiled queries only in situations where you need them, i.e., when you need a query to be used repeatedly.

    Each time a query is executed by the LINQ engine, LINQ to SQL translates the LINQ query to SQL, and this translation involves recursively traversing the expression tree. Repeating it on every execution is a performance overhead, and the CompiledQuery class exists to eliminate it.

    You can leverage CompiledQuery to eliminate this performance overhead for queries that need to be executed again and again. A word of caution though: Use CompiledQuery judiciously and only when it is needed.
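    A minimal sketch of a compiled query, assuming a LINQ to SQL DataContext; the TestDataContext and Files names are illustrative:

```csharp
// Declared at class level: compiled once, reusable across executions.
// The translation to SQL happens only on the first call.
static readonly Func<TestDataContext, string, IQueryable<File>> filesByRevision =
  CompiledQuery.Compile((TestDataContext ctx, string revision) =>
    ctx.Files.Where(f => f.FileRevision == revision));

using (var ctx = new TestDataContext()) {
  var files = filesByRevision(ctx, "A").ToList();
}
```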

  • DO filter down your required data appropriately using DataLoadOptions.AssociateWith

    You should also filter down your required data appropriately using DataLoadOptions.AssociateWith so that only the data that is required is returned. Here is an example that shows how you can use DataLoadOptions.AssociateWith to retrieve selective data in LINQ:

    using (TestDataContext dataContext = new TestDataContext()) {
      DataLoadOptions dataLoadOptions = new DataLoadOptions();

      dataLoadOptions.AssociateWith<Employee>(emp => emp.Department.Where<Department>(dept => dept.DeptCode == 1));

      dataContext.LoadOptions = dataLoadOptions;
    }
    
  • DO use Caching.

    Caching is a technique that can improve the performance of LINQ queries by reducing the number of database round trips. By caching the results of a query, subsequent requests for the same data can be served from the cache, avoiding the need to execute the query against the database. You can use caching frameworks like Redis or Memcached to implement caching in your application.
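    A sketch using System.Runtime.Caching from the .NET Framework (the cache key and the five-minute lifetime are illustrative choices):

```csharp
var cache = MemoryCache.Default;
var files = cache.Get("activeFilePaths") as List<string>;
if (files == null) {
  // Cache miss: hit the database once, then serve from memory.
  using (var context = new LINQEntities()) {
    files = context.FileRepository
      .Where(a => a.IsDeleted == false)
      .Select(a => a.FilePath)
      .ToList();
  }
  cache.Set("activeFilePaths", files, DateTimeOffset.Now.AddMinutes(5));
}
```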

  • DO use Lazy Loading.

    Lazy loading is a feature of Entity Framework that can help optimize your LINQ queries by loading related entities only when they are actually needed. This can reduce the data retrieved from the database and improve query performance. The LazyLoadingEnabled flag is true by default, meaning lazy loading is enabled for all entities; once you set LazyLoadingEnabled to false, lazy loading is disabled for every entity defined in that context class.
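    For example, in EF 6 the flag lives on the context's Configuration object (a sketch; explicit Include calls then control exactly what related data is loaded):

```csharp
using (var context = new LINQEntities()) {
  // Read-only scenario: turn lazy loading off for this context instance.
  context.Configuration.LazyLoadingEnabled = false;

  // Related RoomProducts are now loaded only when explicitly requested.
  var rooms = context.Rooms.Include(x => x.RoomProducts).ToList();
}
```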

  • DO debug and optimize your queries.

    There are times when the query itself is the problem, and the best way to speed it up is to re-write it or apply other standard SQL optimization techniques (like adding indexes). Of course, to know which approach to take, you need to analyze your query and find out where the “slow points” are.

    If you’ve optimized SQL queries in SQL Server before, you’ve probably used SQL Server Management Studio to view the execution plan for a query and then used the plan (combined with suggestions by SQL Server) to optimize your query. The process when using LINQ to SQL is basically the same except for two extra steps. First, you must get the SQL code LINQ to SQL is generating for your query so you can analyze it. Second, if re-writing the query is required, you must coax LINQ to SQL into generating the SQL syntax you’re after.

    For the first step — getting the SQL code that LINQ to SQL is generating — you have two options. The first is to hook into LINQ to SQL’s log writer (there are numerous articles on the Internet explaining how to do this) and look at the SQL code it writes to the log. The second option — and the one I generally prefer — is to use a free tool called LINQPad available at www.linqpad.net .

    LINQPad can be thought of as a “scratch pad” for writing and testing LINQ to SQL queries (and for that matter, any type of LINQ query — LINQ to Objects, LINQ to XML, etc.). In the simplest use case for LINQ to SQL, you create a database connection for your database and start writing LINQ to SQL expressions against it in the editor. LINQPad will automatically create the DataContext and entity classes for you behind the scenes, so none of that work is required. However, you can configure a database connection to use a custom DataContext (like one from your application) if you wish.

    LINQPad also comes with tons of readily executable sample queries and some helpful utility methods — like the Dump() method — that are immensely helpful when debugging and optimizing queries.

    As a simple example, I could write the following LINQ to SQL code in LINQPad:

    var ordersByCountry =
    	from o in Orders
    	group o by o.ShipCountry;
    
    var query =
    	from o in ordersByCountry.Dump()
    	select new
    	{
    		Country = o.Key,
    		Orders = o
    	};
    
    query.Dump();
    

    The Dump() method outputs the contents of whatever it’s given to the Results window in LINQPad and then returns the same object, allowing me to view the output of intermediate steps in my expressions.

    Also, any time you execute a LINQ to SQL query in LINQPad, the SQL code generated by LINQ is written to the SQL window and is ready to be copied and pasted into SQL Server for analysis, parameter declarations and all. Then you can analyze the query in SQL Server like you normally would and figure out what optimizations are required.

    Once you know what needs to be done, then comes the next step: convincing LINQ to SQL to see things your way and generate the code you want. This process could go pretty quickly, or it might be more like convincing your wife it’s a good idea for you to go to the pub with your buddies on Friday night rather than spending time at home. The good news is that regardless of how it goes, you can test variations of your LINQ code in LINQPad and look at the SQL that’s generated until you’re getting the exact results you want. LINQPad even includes the query execution time in the status bar so you can evaluate how your queries perform.

    I can’t stress enough what a wonderful tool LINQPad is for working with LINQ to SQL. Do yourself a favor and start using it to analyze and improve your queries.

  • AVOID executing queries inside loops. Retrieving all required data in a single query and performing any necessary processing outside the query can improve query performance by reducing the number of queries executed. Avoid executing queries inside loops whenever possible.
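    A sketch using the Rooms and RoomProducts tables from the earlier example:

```csharp
// Anti-pattern: one query per room (N+1 round trips).
// foreach (var room in rooms)
//   products.AddRange(context.RoomProducts.Where(p => p.RoomParentId == room.Id).ToList());

// Better: retrieve everything in a single query.
using (var context = new LINQEntities()) {
  var roomProducts = (from room in context.Rooms
                      join p in context.RoomProducts on room.Id equals p.RoomParentId
                      select new { RoomId = room.Id, p.RoomProduct }).ToList();
}
```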

  • AVOID putting all the DB objects into one single entity model.

    An entity model represents a single unit of work, not the whole database. If it contains many database objects that are not connected to one another or are not used at all (log tables, objects used by batch processes, etc.), those objects still consume memory and degrade performance. So try to make separate entity models for related groups of database objects.

  • AVOID using Contains.

    In LINQ, we use the Contains method for checking existence. It is converted to a WHERE IN clause in SQL, which can degrade performance for large lists. Instead, use a join or Any where appropriate to optimize your queries.

  • AVOID using views.

    Complex views can make LINQ query performance costly; they are slow and can impact performance greatly. So avoid using views in LINQ to Entities where possible.
