向我推荐一个合适的 LINQ 提供程序(SQL 服务器、复杂查询)
我一直在使用 LINQ to SQL &对实体已经有一段时间了,总的来说我对他们非常满意。 然而,我知道它们的局限性,尤其是其中一个对我来说正在成为一个大问题。 当您以“我测试过的提供程序是 LINQ TO SQL / LINQ TO 实体”的形式执行复杂的嵌套查询时
MyContext.SomeTable
.Select(item=>new{
item.SomeProperty1,
item.SomeProperty2,
item.NavigationProperty1
.Select(nav1=> new {// retrieve some properties}), // This triggers a single query as long as don't have more than one subquery
item.NavigationProperty2
.Select(nav2=> new {// retrieve some properties}) // This triggers one query PER ROW in the original query
});
(更糟糕的是,devart LINQConnect 的情况更糟,并且在第一个导航属性上每行生成 1 个),
我现在得到的是生成(伪代码):
select t1.a,t1.b,t2.c,t2.d from mytable as t1
join navproperty1table as t2
以及 100 万个(如果第一组中有 100 万个结果)查询,如下所示: select t3.e,t3.f from navproperty2table as t3 where id = X
(X 在 X 查询上更改为第一个查询返回的下一个元素)
我想要什么:
select t1.a,t1.b,t2.c,t2.d,t3.e,t3.f from mytable as t1
join navproperty1table as t2
join navproperty2table as t3
现在当然如果有 3 行原始表这不会是问题,但我的表中有数十万到数百万行“并且”我需要在单个选择中进行更复杂的查询(我想立即获得一个复杂的图表)。想象 20 个以上的表,具有 3-6 级嵌套,每个表访问额外的 2-5 个表。
我的 SQL 服务器可以完美地应对它,我也不关心带宽,它位于通过千兆位连接链接的实例上,我无法以延迟方式获取该数据,我实际上立即“使用”所有数据,所以这不仅仅是懒惰。现在,出于性能原因,我必须将查询拆分为许多小查询,并在 LINQ 上手动将它们连接到对象大小,这为维护它的人提供了一些非常令人讨厌的代码,但这是我拥有的唯一实际解决方案,因此总的来说,包括所有小查询和最终连接,我在一个方法中使用了 600 多行不可分割的代码,这是完全无法维护的。
在我去评估所有以这种心态工作的 LINQ 提供程序之前,今天实际上是否有“任何”LINQ 提供程序已经准备好生产,或者我最好自己编写代码并将其商业化? (令我感到非常惊讶的是,实际上它们并不都是这样工作的,我看不到任何一个实例,您可以更好地使用 foreach 案例以及我尝试过的声称摆脱 n +1 使用 loadwith,不要摆脱它,因为它们仍然执行 n+1 查询,但只是在一次调用中对其进行批处理,当 1 是 10 000 然后 10 000 时,1 次往返和 n+1 查询并不令人满意。 000 然后 10 000 000 000)
- (请注意,我正在猜测到底是什么触发了这个,但这不是问题,无论什么“完全”触发了这个,我肯定会在当前的上下文中触发它)
PS :请注意,我在 Windows Server 2008 或更高版本以及 SQL Server 2008 或更高版本上运行 .NET 4.0 完整配置文件,不支持其他任何内容的提供程序将没关系,我对迁移、可移植性、较低的 .net 版本、较低的 sql 服务器支持等的要求为零。如果需要,可以选择迁移到更新的版本。我也没有任何建模或高级功能的先决条件,数据库已经存在,我只想查询表,所以没有建模/视图/DML/存储过程/函数支持的东西就可以了,这是我唯一的要求在复杂查询和对象图上生成明智的 SQL
编辑:为了澄清,这里是每个人都可以获得的数据库上的问题的实际示例,adventureworks
查询每个联系人的员工
Contacts
.Select(cont=>new
{
cont.EmailAddress,
cont.EmailPromotion,
Employees = cont.Employees
.Select(emp=>new
{
emp.Gender,
emp.HireDate
}).ToList()
}).ToList()
生成
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]
从 [Person].[Contact] AS [t0] 左外连接 [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID] ORDER BY [t0].[ContactID], [t1].[EmployeeID]
现在仅查询每个联系人的供应商 联系方式 .Select(cont=>new { 续电子邮件地址, 续电子邮件促销, 供应商 = cont.VendorContacts.Select(vend=>new { vend.ContactTypeID, vend.修改日期 }).ToList() }).ToList()
仍然可以:
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[ContactTypeID], [t1].[ModifiedDate], (
SELECT COUNT(*)
FROM [Purchasing].[VendorContact] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]
FROM [Person].[Contact] AS [t0] 左外连接 [采购].[供应商联系人] AS [t1] ON [t1].[联系人 ID] = [t0].[联系人 ID] ORDER BY [t0].[ContactID], [t1].[VendorID]
现在同时查询两者(触发 X 行查询)
Contacts
.Select(cont=>new
{
cont.EmailAddress,
cont.EmailPromotion,
Employees = cont.Employees
.Select(emp=>new
{
emp.Gender,
emp.HireDate
}).ToList(),
Vendors = cont.VendorContacts.Select(vend=>new
{
vend.ContactTypeID,
vend.ModifiedDate
}).ToList()
}).ToList()
生成丑陋且缓慢的结果(由于明显的原因没有全部粘贴,但你明白了):
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value], [t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[EmployeeID]
GO
-- Region Parameters
DECLARE @x1 Int = 1
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 2
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 3
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 4
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 5
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 6
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 7
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 8
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 9
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 10
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
我是什么期望/希望看到生成:
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], [t2].[ContactTypeID], [t2].[ModifiedDate] ,[t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t2] ON [t2].[ContactID] = [t0].[ContactID]
GO
I've been using LINQ to SQL & to entities for a while and am overall very happy with them.
However i know of their limitations and one in particular is becoming a large issue for me.
When you do a complex nested query in the form of
MyContext.SomeTable
.Select(item=>new{
item.SomeProperty1,
item.SomeProperty2,
item.NavigationProperty1
.Select(nav1=> new {// retrieve some properties}), // This triggers a single query as long as don't have more than one subquery
item.NavigationProperty2
.Select(nav2=> new {// retrieve some properties}) // This triggers one query PER ROW in the original query
});
The providers i have tested are LINQ TO SQL / LINQ TO entities (and even worse, devart LINQConnect that fares worse and generates 1 per row on the first navigation property)
What i get now that is generated(pseudocode):
select t1.a,t1.b,t2.c,t2.d from mytable as t1
join navproperty1table as t2
and 1 millions (if there is 1 million results in the first set) of queries like this:select t3.e,t3.f from navproperty2table as t3 where id = X
(X changing on X query to next element returned by first query)
What i want:
select t1.a,t1.b,t2.c,t2.d,t3.e,t3.f from mytable as t1
join navproperty1table as t2
join navproperty2table as t3
Now of course if there were 3 rows in the original table it wouldn't be an issue, but i have 10s of thousands to millions of rows in my tables "and" i need a much much much more complex query in a single select (i want to get a complex graph at once). Think 20 + tables with 3-6 levels of nesting accessing an additional 2-5 tables each.
My SQL server can perfectly cope with it, i don't care for the bandwidth either, it's on an instance linked by a gigabit connection, i can't get that data in deferred manner, i actually "use" all of it immediately so it's not just laziness. Right now for performance reasons i had to split the query in many small queries and join them manually on the LINQ to object size, which gives some really nasty code for whoever maintains it but was the only actual solution i had, so overall including all the small queries and final joining, I'm at over 600 lines of unsplitable code in a single method that is totally unmaintainable.
Are there actually "any" LINQ providers production ready today before i go and evaluated them all that work in such a mindset or am i better off coding and commercializing my own? (I'm very surprised that they don't all work that way actually, i can't see a single instance where you'd be better off with the foreach case and the ones i've tried that claim to get rid of n+1 with loadwith, don't get rid of it as they still do n+1 queries but just batch it in a single call, 1 round trip & n+1 queries isn't satisfying when 1 is 10 000 then 10 000 000 and then 10 000 000 000)
- (note that I'm speculating on what exactly triggers this, but it isn't the question, no matter what triggers this "exactly" I'm sure to hit it in my current context)
PS: Note that I'm running .NET 4.0 full profile on a windows server 2008 or higher and on SQL server 2008 or higher, a provider that doesn't support anything else would be fine, i have zero requirements for migration, portability, lower .net versions, lower sql server support etc. Migrating to even more recent versions is an option if required. I also don't have any prerequisites for modeling or advanced features, the DB is already there, i only want to query tables, so something with no modeling / views / DML / stored procedure / functions support is fine, my one and only requirement is sensible SQL generation on complex queries and object graphs
EDIT: for clarification here is an actual example of the issue on a DB everyone can get, adventureworks
Querying employees for each contact
Contacts
.Select(cont=>new
{
cont.EmailAddress,
cont.EmailPromotion,
Employees = cont.Employees
.Select(emp=>new
{
emp.Gender,
emp.HireDate
}).ToList()
}).ToList()
Generates
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[EmployeeID]
Now querying just vendors for each Contact
Contacts
.Select(cont=>new
{
cont.EmailAddress,
cont.EmailPromotion,
Vendors = cont.VendorContacts.Select(vend=>new
{
vend.ContactTypeID,
vend.ModifiedDate
}).ToList()
}).ToList()
still ok:
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[ContactTypeID], [t1].[ModifiedDate], (
SELECT COUNT(*)
FROM [Purchasing].[VendorContact] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[VendorID]
Now querying both at once (triggers X row query)
Contacts
.Select(cont=>new
{
cont.EmailAddress,
cont.EmailPromotion,
Employees = cont.Employees
.Select(emp=>new
{
emp.Gender,
emp.HireDate
}).ToList(),
Vendors = cont.VendorContacts.Select(vend=>new
{
vend.ContactTypeID,
vend.ModifiedDate
}).ToList()
}).ToList()
Generates the ugly and slow (not pasting it all for obvious reasons but you get the point):
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value], [t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[EmployeeID]
GO
-- Region Parameters
DECLARE @x1 Int = 1
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 2
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 3
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 4
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 5
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 6
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 7
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 8
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 9
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
-- Region Parameters
DECLARE @x1 Int = 10
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO
What i expect / would like to see generated:
SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], [t2].[ContactTypeID], [t2].[ModifiedDate] ,[t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t2] ON [t2].[ContactID] = [t0].[ContactID]
GO
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
解决方法是
根据您的定义
创建一个
视图
,并使用 linq-2-sql 查询该视图。不确定我是否完全理解你的查询,但你可能会这样做
等等......我现在无法测试它,但我很确定 yhis 会创建你想要的选择(并且只有一个)。
A workaround would be to create a
view
from your definition
and use linq-2-sql to query that view.
Not sure if I understand your query completely, but uou might just do
and so on.. I cannot test it right now but I am pretty sure yhis will create the select (and only one) you want.
我认为你能得到的最接近的是 NHibernate 的 Fetch (没有 linq)。
对于深度嵌套的数据(例如,ThenFetchMany),如果您也很快达到 NHibernate 的极限,我不会感到惊讶。复杂的查询对于 O/RM 工具来说总是非常困难。 NHibernate 在生成 SQL 方面总是给我留下深刻的印象(linq-to-nhibernate 还没有!)。但即使是不太复杂的场景,有时要防止 1 + N 问题的发生也是一项艰巨的任务。
也许使用 NHibernate 的 HQL 你可以实现你想要的。
使用 linq,我认为您能做的最好的事情就是以尽可能少的查询次数检索所需的对象图。
I think the closest you can get then is NHibernate's Fetch (no linq).
With deeply nested data (e.g. ThenFetchMany), I wouldn't be surprised if you'd quickly hit the limits of NHibernate too. Complex querying is always very hard on O/RM tools. NHibernate has always impressed me when it comes to generated SQL (linq-to-nhibernate not yet!). But even with less complex scenarios it sometimes was a hell of a job to prevent the 1 + N problem.
Maybe with NHibernate's HQL you can achieve what you want.
With linq, I think the best you can do is retrieve the required object graph in the least possible number of queries.
你也可以这样做:
You could also do something like this:
我找到了一个似乎可以处理我的核心问题(合理生成 SQL 与为子查询生成数百万条语句)的提供商,但不确定它是否合适,因为这取决于他们的答案。
http://www.llblgen.com/tinyforum/ Messages.aspx?ThreadID=20658&StartAtMessage=0
我应该知道的任何其他提供程序 的?如果我直到现在才错过这个,可能还有其他的,我很乐意对它们进行比较。我现在所拥有的是
完全无法满足我对每行没有 1 个查询问题的要求:
- linq 到 SQL
- 链接到实体
- devart linqconnect
似乎可以工作
- llblgen
未经测试/需要反馈
- Telerik OpenAccess
-NHibernate
- Mindscape lightspeed
还有其他我应该知道的吗?
I've found a provider that seems to handle my core problem (sane generation of SQL vs generating millions of statements for subqueries), not sure if it is a good fit yet as this depends on their answers.
http://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=20658&StartAtMessage=0
Any other providers i should know of? If i managed to miss this one untill now there may be others and i'd be most happy to compare them. What i have now is
Totally fails my requirement for no 1 query per row issue:
- linq to SQL
- linq to Entities
- devart linqconnect
Seems to work
- llblgen
Untested / need feedback
- Telerik OpenAccess
- NHibernate
- Mindscape lightspeed
any others i should know off?