EF 4.1：为什么将常量转换为变量会导致额外的子查询？

发布于 2024-11-05 12:04:21 字数 2906 浏览 0 评论 0原文

今天我发现实体框架向它生成的 SQL 添加了不必要的子查询。我开始挖掘我的代码，试图缩小它可能来自的范围。过了（很长）一段时间后，我查明了造成这种情况的原因。但现在我比开始时更困惑，因为我不知道为什么会导致这种情况。

基本上我发现，在某些情况下，只需将常量转换为变量就可以改变实体框架生成的 SQL。我已将所有内容缩小到最低限度，并将其打包到一个小控制台应用程序中：

using System;
using System.Data.Entity;
using System.Linq;

class Program
{
    private static readonly BlogContext _db = new BlogContext();

    static void Main(string[] args)
    {
        const string email = "[email protected]";

        var comments = from c in _db.Comments
                       where c.Email == email
                       select c;

        var result = (from p in _db.Posts
                      join c in comments on p.PostId equals c.PostId
                      orderby p.Title
                      select new { p.Title, c.Content });

        Console.WriteLine(result);
    }
}

public class BlogContext : DbContext
{
    public DbSet<Post> Posts { get; set; }
    public DbSet<Comment> Comments { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
}

public class Comment
{
    public int CommentId { get; set; }
    public int PostId { get; set; }
    public string Email { get; set; }
    public string Content { get; set; }
}

这显示了以下输出，这是完美的：

SELECT
[Extent1].[PostId] AS [PostId],
[Extent1].[Title] AS [Title],
[Extent2].[Content] AS [Content]
FROM  [dbo].[Posts] AS [Extent1]
INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
WHERE N'[email protected]' = [Extent2].[Email]
ORDER BY [Extent1].[Title] ASC

现在，如果我将 email 设为变量：

/*const*/ string email = "[email protected]";

输出将发生根本性变化

SELECT
[Project1].[PostId] AS [PostId],
[Project1].[Title] AS [Title],
[Project1].[Content] AS [Content]
FROM ( SELECT
        [Extent1].[PostId] AS [PostId],
        [Extent1].[Title] AS [Title],
        [Extent2].[Content] AS [Content]
        FROM  [dbo].[Posts] AS [Extent1]
        INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
        WHERE [Extent2].[Email] = @p__linq__0
)  AS [Project1]
ORDER BY [Project1].[Title] ASC

：附注，LINQ to SQL 似乎没有这样做。我知道忽略这一点可能是可以的，因为这两个命令返回相同的数据。但我非常好奇为什么会发生这种情况。直到今天，我一直有这样的印象（也许是错误的？）：只要值保持不变（在本例中确实如此），将常量转换为变量总是安全的。所以我必须问...

为什么看似微不足道的更改会导致生成的 SQL 出现如此大的差异？

更新：

需要明确的是，我的问题不是关于email 的值在第一个查询中是一个硬编码值，在第二个查询中是一个变量（这在世界上是有意义的）。我的问题是为什么变量版本会导致额外的子查询。

谢谢！

原文

Today I discovered that Entity Framework was adding an unnecessary sub query to the SQL it generates. I started digging my code trying to narrow down where it might come from. A (long) while later I pin-pointed what's causing it. But now I'm more confused than when I started, as I have no clue why it causes it.

Basically what I discovered is that on certain scenarios, simply converting a constant into a variable can alter the SQL that Entity Framework generates. I've shrunk everything to the bare minimum and packed it in a little console app:

using System;
using System.Data.Entity;
using System.Linq;

class Program
{
    private static readonly BlogContext _db = new BlogContext();

    static void Main(string[] args)
    {
        const string email = "[email protected]";

        var comments = from c in _db.Comments
                       where c.Email == email
                       select c;

        var result = (from p in _db.Posts
                      join c in comments on p.PostId equals c.PostId
                      orderby p.Title
                      select new { p.Title, c.Content });

        Console.WriteLine(result);
    }
}

public class BlogContext : DbContext
{
    public DbSet<Post> Posts { get; set; }
    public DbSet<Comment> Comments { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
}

public class Comment
{
    public int CommentId { get; set; }
    public int PostId { get; set; }
    public string Email { get; set; }
    public string Content { get; set; }
}

This shows the following output, which is perfect:

SELECT
[Extent1].[PostId] AS [PostId],
[Extent1].[Title] AS [Title],
[Extent2].[Content] AS [Content]
FROM  [dbo].[Posts] AS [Extent1]
INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
WHERE N'[email protected]' = [Extent2].[Email]
ORDER BY [Extent1].[Title] ASC

Now if I make email a variable:

/*const*/ string email = "[email protected]";

The output changes radically:

SELECT
[Project1].[PostId] AS [PostId],
[Project1].[Title] AS [Title],
[Project1].[Content] AS [Content]
FROM ( SELECT
        [Extent1].[PostId] AS [PostId],
        [Extent1].[Title] AS [Title],
        [Extent2].[Content] AS [Content]
        FROM  [dbo].[Posts] AS [Extent1]
        INNER JOIN [dbo].[Comments] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[PostId]
        WHERE [Extent2].[Email] = @p__linq__0
)  AS [Project1]
ORDER BY [Project1].[Title] ASC

As a side note, LINQ to SQL does not seem to do this. I know it's probably okay to ignore this, as both commands return the same data. But I'm extremely curious as to why this happens. Up until today I always had the (perhaps false?) impression that it is always safe to turn a constant into a variable, providing that the value remains the same (which in this case does). So I have to ask...

Why a seemingly insignificant change causes such a big difference in the generated SQL?

Update:

Just to be clear, my question isn't about the value of email being a hard-coded value in the first query and a variable in the second (which makes all the sense in the world). My question is about why the variable version results in the extra sub query.

Thanks!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

疯狂的代价 2024-11-12 12:04:21

答案相当简单。您的 LINQ 查询是用表达式树表达的。 const 变量与非 const 变量的区别在于常量表达式和
参数表达式。

当您使用 const 时，您的 LINQ 查询将使用 ConstExpression 作为此变量，而当您使用非 const 时，它会使用 ParameterExpression，EF 运行时对它们的解释会有所不同。

常量实际上意味着该值永远不会改变并且该值可以内联到查询中。

回复收藏 0 原文

所有深爱都是秘密 2024-11-12 12:04:21

不是问题的答案 - 只是使用参数的上下文。

这与创建查询有关，以便它将重新使用现有的查询计划。

如果将变量（而不是对参数的引用）注入到生成 SQL 中，则当变量更改时，SQL Server（可能还有其他数据库引擎）将无法重新使用相同的计划。

对于常量，这不是问题，因为您知道值始终相同，但对于每次执行查询时的变量，SQL 和查询计划都会略有不同。

这听起来可能不多，但 SQL 只为查询计划分配了一定量的空间，因此在缓存中拥有数百/数千个微小的变化可以说是真正的“空间浪费”！

回复收藏 0 原文

终难愈 2024-11-12 12:04:21

这实际上是 SQL 中的一个很大的区别吗？内部查询与原始查询相同，外部查询只是内部查询的包装，不会更改结果集。

除非这会引起问题，否则我个人不会担心。两种查询风格的查询计划是否不同？我的猜测是它们是相同的。

回复收藏 0 原文

十六岁半 2024-11-12 12:04:21

正如人们所说。两个查询之间的差异很小。

原因是，当您使用变量和常量时，创建 LINQ 时创建的表达式是不同的。 EF 将捕捉到这一点并相应地生成 SQL。它知道它永远不会改变，因此可以将其硬编码到查询中以获得（可能的）性能增益。

编辑：
我认为这个问题没有答案，除了“EF 就是这么做的”。但众所周知，EF 喜欢创建许多子选择。对于更复杂的查询，它可能会产生许多子选择。有些人甚至因为这个事实而拒绝使用 EF。但这只是使用 EF 这样的工具的代价。您失去了对某些事物的细粒度控制，这可以带来巨大的性能提升。当您可以使用 C 并获得更高性能时，为什么还要使用 .NET？当可以使用汇编来获得更多性能提升时，为什么还要使用 C？

确保安全并仍然能够使用高抽象层 EF 的唯一方法是经常使用 SQL 分析器并检查是否存在对真实数据花费太长时间的查询。如果您找到一些，则将它们转换为直接 SQL 或存储过程。

回复收藏 0 原文

~没有更多了~