SQL 连接填充 tempDB
我们正在尝试使用单个查询连接 SQL 中可能数千行的文本。我们当前的查询如下所示:
DECLARE @concatText NVARCHAR(MAX)
SET @concatText = ''
UPDATE TOP (SELECT MAX(PageNumber) + 1 FROM #OrderedPages) [#OrderedPages]
SET @concatText = @concatText + [ColumnText] + '
'
WHERE (RTRIM(LTRIM([ColumnText])) != '')
从功能的角度来看,这工作得非常好。我们遇到的唯一问题是有时 ColumnText 的长度可能有几千字节。因此,当我们有数千个这样的行时,我们就会填满 tempDB。
我们提出的最好理由是,当我们对 @concatText 进行这些更新时,SQL 使用隐式事务,因此字符串实际上是不可变的。
我们正在试图找出解决这个问题的好方法,到目前为止我们有两种可能的解决方案: 1) 在.NET中进行串联。这是一个不错的选择,但是可能会通过网络返回大量数据。
2) 使用.WRITE,其操作方式与.NET 的String.Join 方法类似。我无法弄清楚它的语法,因为 BoL 不涵盖这种级别的 SQL 恶作剧。
这让我想到一个问题:.WRITE 会起作用吗?如果是这样,语法是什么?如果没有,是否有其他方法可以在不将数据发送到 .NET 的情况下执行此操作?我们不能使用 FOR XML
因为我们的文本可能包含非法 XML 字符。
提前致谢。
We are attempting to concatenate possibly thousands of rows of text in SQL with a single query. The query that we currently have looks like this:
DECLARE @concatText NVARCHAR(MAX)
SET @concatText = ''
UPDATE TOP (SELECT MAX(PageNumber) + 1 FROM #OrderedPages) [#OrderedPages]
SET @concatText = @concatText + [ColumnText] + '
'
WHERE (RTRIM(LTRIM([ColumnText])) != '')
This is working perfectly fine from a functional standpoint. The only issue we're having is that sometimes the ColumnText can be a few kilobytes in length. As a result, we're filling up tempDB when we have thousands of these rows.
The best reason that we have come up with is that as we're doing these updates to @concatText, SQL is using implicit transactions so the strings are effectively immutable.
We are trying to figure out a good way of solving this problem and so far we have two possible solutions:
1) Do the concatenation in .NET. This is an OK option, but that's a lot of data that may go back across the wire.
2) Use .WRITE which operates in a similar fashion to .NET's String.Join method. I can't figure out the syntax for this as BoL doesn't cover this level of SQL shenanigans.
This leads me to the question: Will .WRITE work? If so, what's the syntax? If not, are there any other ways to do this without sending data to .NET? We can't use FOR XML
because our text may contain illegal XML characters.
Thanks in advance.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我会考虑使用 CLR 集成,正如@Martin 评论中所建议的那样。 CLR 聚合函数可能只是一个解决方案。
I'd look at using CLR integration, as suggested in @Martin's comment. A CLR aggregate function might be just the ticket.
到底什么是填充tempdb?它不能是
@concatText = @concatText + [ColumnText]
,不涉及不变性,@concatText 变量在最坏的情况下将是 2GB 大小(我希望你的 tempdb 比这个大得多,如果不是的话)增加它)。看起来更像是您的查询计划创建了一个用于万圣节保护的线轴,而该线轴是罪魁祸首。作为一般答案,使用 UPDATE ... SET @var = @var + ... 进行串联已知存在正确性问题,并且不受支持。 在 Transact-SQL 中连接行值。
What exactly is filling up tempdb? It cannot be
@concatText = @concatText + [ColumnText]
, there is no immutability involved and the @concatText variable will be at worst case 2GB size (I expect your tempdb is much larger than that, if not increase it). It seems more like your query plan creates a spool for haloween protection and that spool is the culprit.As a generic answer, using the
UPDATE ... SET @var = @var + ...
for concatenation is known to have correctness issues and is not supported. Alternative approaches that work more reliably are discussed in Concatenating Row Values in Transact-SQL.首先,从您的帖子来看,尚不清楚您是否或为什么需要临时表。连接可以在查询中内联完成。如果您向我们展示有关填充 tempdb 的查询的更多信息,我们也许可以帮助您重写它。其次,没有提到的一个选项是完全在 T-SQL 之外进行字符串操作。即,在原始数据的中间层查询中,进行操作并将其推回数据库。最后,您可以使用Xml,以便结果正确处理转义和实体。同样,我们需要更多地了解您想要完成什么以及如何完成。
First, from your post, it isn't clear whether or why you need temp tables. Concatenation can be done inline in a query. If you show us more about the query that is filling up tempdb, we might be able to help you rewrite it. Second, an option that hasn't been mentioned is to do the string manipulation outside of T-SQL entirely. I.e., in your middle-tier query for the raw data, do the manipulation and push it back to the database. Lastly, you can use Xml such that the results handle escapes and entities properly. Again, we'd need to know more about what and how you are trying to accomplish.
同意..CLR 用户定义函数将是你们正在做的事情的最佳方法。实际上,您可以将文本值读取到对象中,然后将它们全部连接在一起(在 CLR 内),并让函数输出 NVARCHAR(MAX) 结果。如果您需要有关如何执行此操作的详细信息,请告诉我。
Agreed..A CLR User Defined Function would be the best approach for what you guys are doing. You could actually read the text values into an object and then join them all together (inside the CLR) and have the function spit out a NVARCHAR(MAX) result. If you need details on how to do this let me know.