Performance optimization in Entity Framework

Posted on April 2, 2023
databaseperformanceEntity FrameworkAsNoTrackingLazy loadingEager loadingIndexingQuery execution planEntity Framework performance optimization

Writing efficient queries with EF Core is sometimes critical as entity frameworks hide internal details such as the SQL being executed. In this article, I will attempt to provide a few tricks to achieve good performance with EF Core. We split this article into two sections i.e. "Tip" and "Code to Seek"

Common Tips and Codes to Seeks

Tip - #01: Use Index Properly

The large factor for using the indexes for query optimization but do you know the deciding factor, when or when not to use many indexes? A query that traverses the entire table can give serious performance issues.

Indexes speed up queries, but they also slow down the "add, update, or delete" as they need to be kept up to date, so avoid indexes when not necessary so the decision to use many indexes depends on the size of the table, the types of queries, the frequency of updates, and the available resources. In general, it is best to only use the necessary indexes to optimize performance and avoid unnecessary overhead.

:white_check_mark: Do's

// Matches on start, so uses an index (on SQL Server)
var posts1 = context.Posts.Where(p => p.Title.StartsWith("A")).ToList();
// Matches on end, so does not use the index
var posts2 = context.Posts.Where(p => p.Title.EndsWith("A")).ToList();

Tip - #02: Project only properties you need

To maximize performance you need to include the properties that are necessary only. By doing this, you can reduce the amount of data that needs to be processed, which can improve the speed and efficiency

:white_check_mark: Do's

using (var db = new YourDbContext())
{
    var products = db.Products.Select(p => new { p.Name, p.Price }).ToList();
}

Tip - #03: Avoid cartesian explosion when loading related entities

:x: Don'ts :

Large Result set

var blogs = ctx.Blogs
    .Include(b => b.Posts)
    .Include(b => b.Contributors)
    .ToList();

It will produce the following SQL:

SQL Without Split

This means that if a given blog has 10 posts and 10 contributors, the database returns 100 rows for that single blog. This phenomenon - sometimes called cartesian explosion - can cause huge amounts of data to unintentionally get transferred to the client

:white_check_mark: Do's

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
        .AsSplitQuery()
        .ToList();
}

It will produce the following SQL:

Split SQL Output


Tip - #04: Load related entities eagerly

:x: Don'ts

Lazy loading fetch data in extra roundtrip

foreach (var blog in context.Blogs.ToList())
{
   foreach (var post in blog.Posts.OrderByDescending(post => post.Title).Take(5))
   {
      if (post.BlogId == 1)
      {
         Console.WriteLine($"Blog {blog.Url}, Post: {post.Title}");
      }
   }
}

:white_check_mark: Do's

EF can fetch all the required data in one roundtrip with eagerly loading

using (var context = new BloggingContext())
{
    var filteredBlogs = context.Blogs
        .Include(
            blog => blog.Posts
                .Where(post => post.BlogId == 1)
                .OrderByDescending(post => post.Title)
                .Take(5))
        .ToList();
}

Tip - #05: Loading all the data into memory at once. (Applied to case when a lot of data)

:x: Don'ts

ToList() or ToArray() method in Entity Framework to load data can be appropriate in certain situations, but it is not always the best approach.

ToList - Use Case

:white_check_mark: Do's

use .AsEnumerable() is also a streams, allowing you to execute LINQ operators on the client-side

AsEnumerable-Use


Tip - #06: Use SQL Queries

:x: Don'ts

Do not handle complex database operation using Entity Framework

:white_check_mark: Do's

use FromSql, FromSqlRaw, SqlQuery, and ExecuteSql to compose the SQL with regular LINQ queries

FromSql

var user = "johndoe";

var blogs = context.Blogs
    .FromSql($"EXECUTE dbo.GetMostPopularBlogsForUser {user}")
    .ToList();

FromSqlRaw

var columnName = "Url";
var columnValue = new SqlParameter("columnValue", "http://SomeURL");

var blogs = context.Blogs
    .FromSqlRaw($"SELECT * FROM [Blogs] WHERE {columnName} = @columnValue", columnValue)
    .ToList();

SqlQuery

var ids = context.Database
    .SqlQuery<int>($"SELECT [BlogId] FROM [Blogs]")
    .ToList();

ExecuteSql

using (var context = new BloggingContext())
{
    var rowsModified = context.Database.ExecuteSql($"UPDATE [Blogs] SET [Url] = NULL");
}

Tip - #07: Asynchronous programming (It is really not required but it can provide significant benefits in terms of performance, scalability, and resource usage)

:x: Don'ts Avoid SaveChanges()

Do not block the thread for the duration of database I/O which means instead of blocking the main thread while waiting for a query to complete, an asynchronous query can be executed in the background, allowing the main thread to continue executing other code.

:white_check_mark: Do's Use SaveChangesAsync()


Tip - #08: Batch together your changes in single round trip

:x: Don'ts

Calling SaveChanges multiple time

var blog = context.Blogs.Single(b => b.Url == "http://someblog.codehacks.com");
blog.Url = "http://someotherblog.codehacks.com";
context.SaveChanges();

context.Add(new Blog { Url = "http://newblog1.codehacks.com" });
context.SaveChanges();

context.Add(new Blog { Url = "http://newblog2.codehacks.com" });
context.SaveChanges();

:white_check_mark: Do's

Batch together all operation, like here in this example update of url and inserting two new blog records.

var blog = context.Blogs.Single(b => b.Url == "http://someblog.codehacks.com");
blog.Url = "http://someotherblog.codehacks.com";
context.Add(new Blog { Url = "http://newblog1.codehacks.com" });
context.Add(new Blog { Url = "http://newblog2.codehacks.com" });
context.SaveChanges();

Tip - #09: Disable change tracking

:x: Don'ts

Change tracking is an important feature of EF that allows it to efficiently manage changes to entities and their relationships. Disabling change tracking can result in decreased performance and increased complexity in managing changes to entities.

:white_check_mark: Do's

No tracking queries are useful when the results are used in a read-only scenario.

var blogs = context.Blogs
    .AsNoTracking()
    .ToList();

Tip - #10: Use Pagination and do not use multiple ToList() in single query

:x: Don'ts

Do not fetch all the result set in once


var blogs = context.Posts
    .Where(p => p.Title.StartsWith("A"))
    .ToList();

:x: Don'ts

Do not use multiple ToList()


var blogs = context.Posts
    .Where(p => p.Title.StartsWith("A")).ToList();

blogs = blogs.Skip(10).Take(20).ToList();

This query retrieves all the posts that have title starting with the letter "A" from the database and as we are using ToList(), it will buffer all the data to memory and then we are applying pagination on the result set. This means that all posts that do not match the search criteria are received even those not required for pagination

:white_check_mark: Do's

Pagination is fetching search results in multiple pages instead of retrieving them all at once, which is commonly implemented for extensive result sets.


var blogs = context.Posts
    .Where(p => p.Title.StartsWith("A"))
    .Skip(10)
    .Take(20)
    .ToList();

There are additional factors that need to be taken into account in order to enhance or optimize performance.

  • Amount of data transfer to Network can be significant when working with databases, network latency can be a major bottleneck, particularly when dealing with large data sets or complex queries
  • Network round trips where each time a request is made to the database server, a network round trip occurs. Network round trips can be a major bottleneck, as they can introduce significant latency, particularly when working with distributed systems or remote databases.
  • Cache outside the database, such as Redis.
  • Load balancing can help distribute the workload across multiple servers,
  • Database design Proper database design can also play a significant role in performance

Tools to diagnose performance issue

  • SQL Profiler
  • Query Execution Plan
  • Enabling Logging - LogTo()
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True")
        .LogTo(Console.WriteLine, LogLevel.Information);
}

Thanks for reading!


Posted on April 2, 2023
Profile Picture

Arun Yadav

Software Architect | Full Stack Web Developer | Cloud/Containers

Subscribe
to our Newsletter

Signup for our weekly newsletter to get the latest news, articles and update in your inbox.

More Related Articles