SQL Server 查询的最大大小? IN 子句?有更好的方法吗
可能的重复:
T-SQL WHERE col IN (…)
SQL Server 的最大大小是多少询问? (字符数)
IN 子句的最大大小?我想我看到了有关 Oracle 有 1000 个项目限制的信息,但您可以通过将 2 个 IN 进行 AND 运算来解决这个问题。 SQL Server 中存在类似问题吗?
更新 那么,如果我需要从另一个系统(非关系数据库)获取 1000 个 GUID 并对 SQL Server 执行“JOIN in code”,最好的方法是什么?是将 1000 个 GUID 列表提交到 IN 子句吗? 或者还有其他更有效的技术吗?
我还没有对此进行测试,但我想知道是否可以将 GUID 作为 XML 文档提交。例如
<guids>
<guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
<guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>
,然后对文档和表进行某种 XQuery JOIN。效率低于 1000 项 IN 子句?
Possible Duplicate:
T-SQL WHERE col IN (…)
What is the maximum size for a SQL Server query? (# of characters)
Max size for an IN clause? I think I saw something about Oracle having a 1000 item limit but you could get around this with ANDing 2 INs together. Similar issue in SQL Server?
UPDATE
So what would be the best approach if I need to take say 1000 GUIDs from another system (Non Relational Database) and do a "JOIN in code' against the SQL Server? Is it to submit the list of 1000 GUIDs to an IN clause?
Or is there another technique that works more efficiently?
I haven't tested this but I wonder if I could submit the GUIDs as an XML doc. For example
<guids>
<guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
<guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>
and then do some kind of XQuery JOIN against the Doc and the Table. Less efficient than 1000 item IN clause?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
每个 SQL 批处理都必须符合批处理大小限制:65,536 * 网络数据包大小。
除此之外,您的查询受到运行时条件的限制。它通常会耗尽堆栈大小,因为 x IN (a,b,c) 只是 x=a OR x=b OR x=c ,它创建了一个类似于 x=a OR (x=b OR (x =c)),因此大量 OR 会使它变得非常深。 SQL 7 将在 IN 中的大约 10k 值处达到 SO SO,但现在的堆栈更深了(因为 x64),所以它可以非常深入。
更新
您已经找到了 Erland 的关于将列表/数组传递到 SQL Server 的文章。使用 SQL 2008,您还可以使用表值参数,它允许您传递整个 DataTable 作为单个表类型参数并连接它。
XML 和 XPath 是另一个可行的解决方案:
Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.
Other than that, your query is limited by runtime conditions. It will usually run out of stack size because x IN (a,b,c) is nothing but x=a OR x=b OR x=c which creates an expression tree similar to x=a OR (x=b OR (x=c)), so it gets very deep with a large number of OR. SQL 7 would hit a SO at about 10k values in the IN, but nowdays stacks are much deeper (because of x64), so it can go pretty deep.
Update
You already found Erland's article on the topic of passing lists/arrays to SQL Server. With SQL 2008 you also have Table Valued Parameters which allow you to pass an entire DataTable as a single table type parameter and join on it.
XML and XPath is another viable solution:
SQL Server 最高限额已公开 http://msdn.microsoft.com/en-us /library/ms143432.aspx(这是 2008 版本)
SQL 查询可以是 varchar(max),但显示为限制为 65,536 * 网络数据包大小,但即便如此,最有可能让您出错的是什么是每个查询的 2100 个参数。如果 SQL 选择参数化 in 子句中的文字值,我认为您会首先达到该限制,但我还没有测试过它。
编辑:测试它,即使在强制参数化的情况下它仍然存在 - 我进行了一个快速测试并让它在 In 子句中使用 30k 项执行。 (SQL Server 2005)
在 100k 项时,花了一些时间然后下降:
所以 30k 是可能的,但仅仅因为你可以做到 - 并不意味着你应该:)
编辑:由于其他问题而继续。
50k 有效,但 60k 退出,所以顺便说一句,在我的测试设备上的某个地方。
至于如何在不使用大 in 子句的情况下进行值的联接,我个人会创建一个临时表,将值插入到该临时表中,为其建立索引,然后在联接中使用它,从而为其提供最佳机会优化连接。 (在临时表上生成索引将为它创建统计信息,这作为一般规则将有助于优化器,尽管 1000 个 GUID 并不一定会发现统计信息太有用。)
The SQL Server Maximums are disclosed http://msdn.microsoft.com/en-us/library/ms143432.aspx (this is the 2008 version)
A SQL Query can be a varchar(max) but is shown as limited to 65,536 * Network Packet size, but even then what is most likely to trip you up is the 2100 parameters per query. If SQL chooses to parameterize the literal values in the in clause, I would think you would hit that limit first, but I havn't tested it.
Edit : Test it, even under forced parameteriztion it survived - I knocked up a quick test and had it executing with 30k items within the In clause. (SQL Server 2005)
At 100k items, it took some time then dropped with:
So 30k is possible, but just because you can do it - does not mean you should :)
Edit : Continued due to additional question.
50k worked, but 60k dropped out, so somewhere in there on my test rig btw.
In terms of how to do that join of the values without using a large in clause, personally I would create a temp table, insert the values into that temp table, index it and then use it in a join, giving it the best opportunities to optimse the joins. (Generating the index on the temp table will create stats for it, which will help the optimiser as a general rule, although 1000 GUIDs will not exactly find stats too useful.)
每批 65536 * 网络数据包大小 为 4k,即 256 MB
但是, IN 会在此之前停止,但并不精确。
你最终会出现记忆错误,但我记不起确切的错误。
无论如何,巨大的 IN 都是低效的。
编辑:雷姆斯提醒我:错误与“堆栈大小”有关
Per batch, 65536 * Network Packet Size which is 4k so 256 MB
However, IN will stop way before that but it's not precise.
You end up with memory errors but I can't recall the exact error.
A huge IN will be inefficient anyway.
Edit: Remus reminded me: the error is about "stack size"
您可以将 GUID 加载到临时表中然后执行
Can you load the GUIDs into a scratch table then do a