NHibernate:为什么 Linq First() 使用 FetchMany() 强制所有子集合和孙集合中的只有一项
域模型
我有一个 Customer
的规范域,其中包含许多 Orders
,每个 Order
都有许多 OrderItems
:
客户
public class Customer
{
public Customer()
{
Orders = new HashSet<Order>();
}
public virtual int Id {get;set;}
public virtual ICollection<Order> Orders {get;set;}
}
订单
public class Order
{
public Order()
{
Items = new HashSet<OrderItem>();
}
public virtual int Id {get;set;}
public virtual Customer Customer {get;set;}
}
OrderItems
public class OrderItem
{
public virtual int Id {get;set;}
public virtual Order Order {get;set;}
}
问题
无论是使用 FluentNHibernate 还是 hbm 文件进行映射,我都会运行两个单独的查询,它们的 Fetch() 语法相同,但其中一个查询包含 .First() 扩展方法。
返回预期结果:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).ToList()[0];
仅返回每个集合中的单个项目:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).First();
我想我明白这里发生了什么,即 .First() 方法应用于每个集合前面的语句,而不仅仅是最初的 .Where() 子句。考虑到 First() 返回的是 Customer,这对我来说似乎是不正确的行为。
编辑 2011-06-17
经过进一步的研究和思考,我相信根据我的映射,此方法链有两种结果:
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items);
注意:我认为我无法获得子选择行为,因为我没有使用 HQL。
- 当映射为
fetch="join"
时,我应该在 Customer、Order 和 OrderItem 表之间获得笛卡尔积。 - 当映射为
fetch="select"
时,我应该获取对 Customer 的查询,然后对 Orders 和 OrderItems 分别进行多个查询。
通过将 First() 方法添加到链中,我不知道应该发生什么。
get 发出的 SQL 查询是传统的左外连接查询,前面有 select top (@p0)
。
Domain Model
I've got a canonical Domain of a Customer
with many Orders
, with each Order
having many OrderItems
:
Customer
public class Customer
{
public Customer()
{
Orders = new HashSet<Order>();
}
public virtual int Id {get;set;}
public virtual ICollection<Order> Orders {get;set;}
}
Order
public class Order
{
public Order()
{
Items = new HashSet<OrderItem>();
}
public virtual int Id {get;set;}
public virtual Customer Customer {get;set;}
}
OrderItems
public class OrderItem
{
public virtual int Id {get;set;}
public virtual Order Order {get;set;}
}
Problem
Whether mapped with FluentNHibernate or hbm files, I run two separate queries, that are identical in their Fetch() syntax, with the exception of one including the .First() extension method.
Returns expected results:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).ToList()[0];
Returns only a single item in each collection:
var customer = this.generator.Session.Query<Customer>()
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items).First();
I think I understand what's going on here, which is that the .First() method is being applied to each of the preceding statements, rather than just to the initial .Where() clause. This seems incorrect behavior to me, given the fact that First() is returning a Customer.
Edit 2011-06-17
After further research and thinking, I believe that depending on my mapping, there are two outcomes to this Method Chain:
.Where(c => c.CustomerID == id)
.FetchMany(c => c.Orders)
.ThenFetchMany(o => o.Items);
NOTE: I don't think I can get subselect behavior, since I'm not using HQL.
- When the mapping is
fetch="join"
I should get a cartesian product between the Customer, Order and OrderItem tables. - When the mapping is
fetch="select"
I should get a query for Customer, and then multiple queries each for Orders and OrderItems.
How this plays out with adding the First() method to the chain is where I lose track of what should be happening.
The SQL Query that get's issued is the traditional left-outer-join query, with select top (@p0)
in front.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
First()
方法被转换为 SQL(至少是 T-SQL),如SELECT TOP 1 ...
。与您的联接提取相结合,这将返回一行,其中包含一名客户、该客户的一份订单以及该订单的一项商品。您可能会认为这是 Linq2NHibernate 中的一个错误,但由于连接获取很少见(而且我认为您实际上会损害您的性能,因为在网络中拉动相同的客户和订单字段值作为每个项目行的一部分)我怀疑团队会修复它。您想要的是一个客户,然后是该客户的所有订单以及所有这些订单的所有项目。这是通过让 NHibernate 运行 SQL 来实现的,该 SQL 将提取一条完整的客户记录(每个订单行对应一行)并构建客户对象图。将 Enumerable 转换为 List,然后获取第一个元素是可行的,但以下方法会稍微快一些:
AsEnumerable() 函数强制评估由 Query 创建并使用其他方法修改的 IQueryable,吐出内存中的 Enumerable,无需将其放入具体的列表中(如果愿意,NHibernate 可以简单地从 DataReader 中提取足够的信息来创建一个完整的顶级实例)。现在,First() 方法不再应用于要转换为 SQL 的 IQueryable,而是应用于对象图的内存中 Enumerable,在 NHibernate 完成其工作并给出您的Where子句之后,应该是零个或一个带有水合订单集合的客户记录。
就像我说的,我认为你使用连接获取会伤害自己。每行包含客户数据和订单数据,连接到每个不同的行。这是大量的冗余数据,我认为这会比 N+1 查询策略花费更多。
我能想到的处理此问题的最佳方法是每个对象执行一个查询来检索该对象的子对象。它看起来像这样:
这需要对每个订单进行查询,再加上客户级别的两个查询,并且不会返回重复数据。这比返回包含客户和订单的每个字段以及每个项目的行的单个查询要好得多,也比发送每个项目的查询加上每个订单的查询加上客户的查询要好。
The
First()
method is translated into SQL (T-SQL at least) asSELECT TOP 1 ...
. Combined with your join fetching, this will return a single row, containing one customer, one order for that customer and one item for the order. You might consider this a bug in Linq2NHibernate, but as join fetching is rare (and I think you're actually hurting your performance pulling the same Customer and Order field values across the network as part of the row for each Item) I doubt the team will fix it.What you want is a single Customer, then all Orders for that customer and all Items for all those Orders. That happens by letting NHibernate run SQL that will pull one full Customer record (which will be a row for each Order Line) and construct the Customer object graph. Turning the Enumerable into a List and then getting the first element works, but the following will be slightly faster:
the AsEnumerable() function forces evaluation of the IQueryable created by Query and modified with the other methods, spitting out an in-memory Enumerable, without slurping it into a concrete List (NHibernate can, if it wishes, simply pull enough info out of the DataReader to create one full top-level instance). Now, the First() method is no longer applied to the IQueryable to be translated to SQL, but it is instead applied to an in-memory Enumerable of the object graphs, which after NHibernate has done its thing, and given your Where clause, should be zero or one Customer record with a hydrated Orders collection.
Like I said, I think you're hurting yourself using join fetching. Each row contains the data for the Customer and the data for the Order, joined to each distinct Line. That is a LOT of redundant data, which I think will cost you more than even an N+1 query strategy.
The best way I can think of to handle this is one query per object to retrieve that object's children. It would look like this:
This requires a query for each Order, plus two at the Customer level, and will return no duplicate data. This will perform far better than a single query returning a row containing every field of the Customer and Order along with each Item, and also better than sending a query per Item plus a query per Order plus a query for the Customer.
我想用我的发现更新答案,以便可以帮助其他遇到同样问题的人。
由于您是根据 ID 查询实体,因此可以使用 .Single 代替 .First 或 .AsEnumerable().First():
这将生成带有 where 子句且没有 TOP 1 的普通 SQL 查询。
在其他情况下,如果结果有多个客户,则会抛出异常,因此如果您确实需要基于条件的系列中的第一项,它将无济于事。您必须使用 2 个查询,一个针对第一个客户,让延迟加载执行第二个查询。
I'd like to update the answer with my found so that could help anybody else with the same problem.
Since you are querying the entity base on their ID, you can use .Single instead of .First or .AsEnumerable().First():
This will generate a normal SQL query with where clause and without the TOP 1.
In other situation, if the result has more than one Customer, exception will be thrown so it won't help if you really need the first item of a series based on condition. You have to use 2 queries, one for the first Customer and let the lazy load do the second one.