实体框架性能问题

发布于 2024-12-04 08:44:46 字数 1124 浏览 0 评论 0原文

我在实体框架方面遇到了一个有趣的性能问题。我正在使用代码优先。

这是我的实体的结构:

一本书可以有很多评论。 评论与一本书相关联。 一篇评论可以有一个或多个评论。 一条评论与一条评论相关联。

public class Book
{
    public int BookId { get; set; }
    // ...
    public ICollection<Review> Reviews { get; set; }
}

public class Review 
{
    public int ReviewId { get; set; }
    public int BookId { get; set; }
    public Book Book { get; set; }
    public ICollection<Comment> Comments { get; set; }
}

public class Comment
{
     public int CommentId { get; set; }
     public int ReviewId { get; set; }
     public Review Review { get; set; }
}

我用大量数据填充了数据库并添加了适当的索引。我正在尝试使用以下查询检索一本有 10,000 条评论的书:

var bookAndReviews = db.Books.Where(b => b.BookId == id)
                       .Include(b => b.Reviews)
                       .FirstOrDefault();

这本书有 10,000 条评论。该查询的性能约为 4 秒。运行完全相同的查询(通过 SQL Profiler)实际上会立即返回。我使用相同的查询、SqlDataAdapter 和自定义对象来检索数据,整个过程在 500 毫秒内完成。

使用 ANTS Performance Profiler,看起来大部分时间都花在做一些不同的事情上:

Equals 方法被调用 5000 万次。

有谁知道为什么需要调用这 5000 万次次以及如何提高性能?

I am running into an interesting performance issue with Entity Framework. I am using Code First.

Here is the structure of my entities:

A Book can have many Reviews.
A Review is associated with a single Book.
A Review can have one or many Comments.
A Comment is associated with one Review.

public class Book
{
    public int BookId { get; set; }
    // ...
    public ICollection<Review> Reviews { get; set; }
}

public class Review 
{
    public int ReviewId { get; set; }
    public int BookId { get; set; }
    public Book Book { get; set; }
    public ICollection<Comment> Comments { get; set; }
}

public class Comment
{
     public int CommentId { get; set; }
     public int ReviewId { get; set; }
     public Review Review { get; set; }
}

I populated my database with a lot of data and added the proper indexes. I am trying to retrieve a single book that has 10,000 reviews on it using this query:

var bookAndReviews = db.Books.Where(b => b.BookId == id)
                       .Include(b => b.Reviews)
                       .FirstOrDefault();

This particular book has 10,000 reviews. The performance of this query is around 4 seconds. Running the exact same query (via SQL Profiler) actually returns in no time at all. I used the same query and a SqlDataAdapter and custom objects to retrieve the data and it happens in under 500 milliseconds.

Using ANTS Performance Profiler it looks like a bulk of the time is being spent doing a few different things:

The Equals method is being called 50 million times.

Does anyone know why it would need to call this 50 million times and how I could increase the performance for this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

请恋爱 2024-12-11 08:44:46

为什么 Equals 被调用了 50M 次?

听起来很可疑。您有 10,000 条评论和 50,000,000 次 Equals 调用。假设这是由 EF 内部实现的恒等映射引起的。身份映射确保具有唯一键的每个实体仅由上下文跟踪一次,因此如果上下文已经具有与从数据库加载的记录具有相同键的实例,则它将不会具体化新实例,而是使用现有实例。现在这怎么能与这些数字相符呢?我可怕的猜测:

=============================================
1st      record read   |  0     comparisons
2nd      record read   |  1     comparison
3rd      record read   |  2     comparisons
...
10.000th record read   |  9.999 comparisons

这意味着每个新记录都会与身份映射中的每个现有记录进行比较。通过应用数学来计算所有比较的总和,我们可以使用称为“算术序列”的东西:

a(n) = a(n-1) + 1
Sum(n) = (n / 2) * (a(1) + a(n))
Sum(10.000) = 5.000 * (0 + 9.999) => 5.000 * 10.000 = 50.000.000

我希望我的假设或计算没有犯错误。等待!我希望我做错了,因为这看起来不太好。

尝试关闭更改跟踪 = 希望关闭身份映射检查。

这可能很棘手。开始于:

var bookAndReviews = db.Books.Where(b => b.BookId == id)
                             .Include(b => b.Reviews)
                             .AsNoTracking()
                             .FirstOrDefault();

但是您的导航属性很可能不会被填充(因为它是通过更改跟踪处理的)。在这种情况下,请使用这种方法:

var book = db.Books.Where(b => b.BookId == id).AsNoTracking().FirstOrDefault();
book.Reviews = db.Reviews.Where(r => r.BookId == id).AsNoTracking().ToList();

无论如何,您能看到传递给 Equals 的对象类型是什么吗?我认为它应该只比较主键,甚至50M整数比较也不应该是这样的问题。

顺便说一句,EF 很慢 - 这是众所周知的事实。在具体化实体时,它还在内部使用反射,因此仅仅 10,000 条记录就可能需要“一些时间”。除非您已经这样做了,否则您还可以关闭动态代理创建 (db.Configuration.ProxyCreationEnabled)。

Why is Equals called 50M times?

It sounds quite suspicious. You have 10.000 reviews and 50.000.000 calls to Equals. Suppose that this is caused by identity map internally implemented by EF. Identity map ensures that each entity with unique key is tracked by the context only once so if context already has instance with the same key as loaded record from the database it will not materialize new instance and instead uses the existing one. Now how this can coincide with those numbers? My terrifying guess:

=============================================
1st      record read   |  0     comparisons
2nd      record read   |  1     comparison
3rd      record read   |  2     comparisons
...
10.000th record read   |  9.999 comparisons

That means that each new record is compared with every existing record in identity map. By applying math to compute sum of all comparison we can use something called "Arithmetic sequence":

a(n) = a(n-1) + 1
Sum(n) = (n / 2) * (a(1) + a(n))
Sum(10.000) = 5.000 * (0 + 9.999) => 5.000 * 10.000 = 50.000.000

I hope I didn't make mistake in my assumptions or calculation. Wait! I hope I did mistake because this doesn't seem good.

Try turning off change tracking = hopefully turning off identity map checking.

It can be tricky. Start with:

var bookAndReviews = db.Books.Where(b => b.BookId == id)
                             .Include(b => b.Reviews)
                             .AsNoTracking()
                             .FirstOrDefault();

But there is a big chance that your navigation property will not be populated (because it is handled by change tracking). In such case use this approach:

var book = db.Books.Where(b => b.BookId == id).AsNoTracking().FirstOrDefault();
book.Reviews = db.Reviews.Where(r => r.BookId == id).AsNoTracking().ToList();

Anyway can you see what object type is passed to Equals? I think it should compare only primary keys and even 50M integer comparisons should not be such a problem.

As a side note EF is slow - it is well known fact. It also uses reflection internally when materializing entities so simply 10.000 records can take "some time". Unless you already did that you can also turn off dynamic proxy creation (db.Configuration.ProxyCreationEnabled).

拧巴小姐 2024-12-11 08:44:46

我知道这听起来很蹩脚,但是您是否尝试过其他方式,例如:

var reviewsAndBooks = db.Reviews.Where(r => r.Book.BookId == id)
                       .Include(r => r.Book);

我注意到,当您以这种方式处理查询时,EF 有时会有更好的性能(但我没有时间弄清楚为什么)。

I know this sounds lame, but have you tried the other way around, e.g.:

var reviewsAndBooks = db.Reviews.Where(r => r.Book.BookId == id)
                       .Include(r => r.Book);

I have noticed sometimes better performance from EF when you approach your queries this way (but I haven't had the time to figure out why).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文