SQL Server 2000:执行串联聚合子查询的想法

发布于 2024-08-24 10:21:54 字数 3467 浏览 6 评论 0原文

我有一个返回我想要的行的查询,例如

QuestionID  QuestionTitle  UpVotes  DownVotes  
==========  =============  =======  =========  
2142075     Win32: Cre...        0          0  
2232727     Win32: How...        2          0  
1870139     Wondows Ae...       12          0  

现在我想要返回一个,其中包含逗号分隔的“作者”列表(例如原始海报和编辑)。例如:

QuestionID  QuestionTitle  UpVotes  DownVotes  Authors
==========  =============  =======  =========  ==========
2142075     Win32: Cre...        0          0  Ian Boyd  
2232727     Win32: How...        2          0  Ian Boyd, roygbiv
1870139     Wondows Ae...       12          0  Ian Boyd, Aaron Klotz, Jason Diller, danbystrom


伪造

SQL Server 2000 没有 CONCAT(AuthorName, ', ') 聚合操作,我一直在伪造它 - 对 TOP 执行简单的子选择1 作者,以及作者计数。

QuestionID  QuestionTitle  UpVotes  DownVotes  FirstAuthor  AuthorCount  
==========  =============  =======  =========  ===========  =========== 
2142075     Win32: Cre...        0          0  Ian Boyd               1 
2232727     Win32: How...        2          0  Ian Boyd               2
1870139     Wondows Ae...       12          0  Ian Boyd               3

如果有多个作者,那么我会向用户显示一个省略号(“...”),以表明有多个作者。例如,用户会看到:

QuestionID  QuestionTitle  UpVotes  DownVotes  Authors
==========  =============  =======  =========  ==========
2142075     Win32: Cre...        0          0  Ian Boyd
2232727     Win32: How...        2          0  Ian Boyd, …
1870139     Wondows Ae...       12          0  Ian Boyd, …

这效果很好,因为通常问题不会被编辑 - 这意味着我完美支持 99% 的情况,而 1% 的情况只有一半 -assed 也是如此。


线程重新查询

作为一个更复杂且容易出现错误的解决方案,我正在考虑迭代显示的列表,并为列表中的每个“问题”启动一个线程池工作线程。 list,对数据库执行查询以获取作者列表,然后在内存中聚合该列表。这意味着该列表首先填充(本机)应用程序。然后我会发出几千个单独的查询。

但这将是可怕的、可怕的、可怕的、缓慢的。更不用说漏洞百出,因为这将是线程工作。


是啊是啊

亚当机械师说得很清楚

不要将行连接成分隔符 SQL Server 中的字符串。做客户端 侧面。

告诉我怎么做,我会做的。


/cry

谁能想到一个更好的解决方案,比我原来的“TOP 1 加省略号”解决方案一样快(比如说......在一个数量级内)?

例如,是否有一种方法可以返回结果集,其中到达行具有关联的结果集?因此,对于每个“主”行,我可以获得包含列表的“详细”结果集。


最佳答案代码

Cade的 链接到 Adam Machanic 的解决方案 我喜欢最好的。一个用户定义的函数,似乎通过魔法进行操作:

CREATE FUNCTION dbo.ConcatAuthors(@QuestionID int)
RETURNS VARCHAR(8000)
AS
BEGIN
    DECLARE @Output VARCHAR(8000)
    SET @Output = ''

    SELECT @Output =    CASE @Output 
                WHEN '' THEN AuthorName 
                ELSE @Output + ', ' + AuthorName 
                END
    FROM  (
        SELECT QuestionID, AuthorName, QuestionDate AS AuthorDate FROM Questions
        UNION
        SELECT QuestionID, EditorName, EditDate FROM QuestionEdits
    ) dt
    WHERE dt.QuestionID = @QuestionID
    ORDER BY AuthorDate

    RETURN @Output
END

使用 T-SQL 用法:

SELECT QuestionID, QuestionTitle, UpVotes, DownVotes, dbo.ConcatAuthors(AuthorID)
FROM Questions

i have a query that returns rows that i want, e.g.

QuestionID  QuestionTitle  UpVotes  DownVotes  
==========  =============  =======  =========  
2142075     Win32: Cre...        0          0  
2232727     Win32: How...        2          0  
1870139     Wondows Ae...       12          0  

Now i want to have a column returned, that contains a comma separated list of "Authors" (e.g. original poster and editors). e.g.:

QuestionID  QuestionTitle  UpVotes  DownVotes  Authors
==========  =============  =======  =========  ==========
2142075     Win32: Cre...        0          0  Ian Boyd  
2232727     Win32: How...        2          0  Ian Boyd, roygbiv
1870139     Wondows Ae...       12          0  Ian Boyd, Aaron Klotz, Jason Diller, danbystrom

Faking It

SQL Server 2000 does not have a CONCAT(AuthorName, ', ') aggregation operation, i've been faking it - performing simple sub-selects for the TOP 1 author, and the author count.

QuestionID  QuestionTitle  UpVotes  DownVotes  FirstAuthor  AuthorCount  
==========  =============  =======  =========  ===========  =========== 
2142075     Win32: Cre...        0          0  Ian Boyd               1 
2232727     Win32: How...        2          0  Ian Boyd               2
1870139     Wondows Ae...       12          0  Ian Boyd               3

If there is more than one author, then i show the user an ellipses ("…"), to indicate there is more than one. e.g. the user would see:

QuestionID  QuestionTitle  UpVotes  DownVotes  Authors
==========  =============  =======  =========  ==========
2142075     Win32: Cre...        0          0  Ian Boyd
2232727     Win32: How...        2          0  Ian Boyd, …
1870139     Wondows Ae...       12          0  Ian Boyd, …

And that works well enough, since normally a question isn't edited - which means i'm supporting the 99% case perfectly, and the 1% case only half-assed as well.

Threaded Re-query

As a more complicated, and bug-prone solution, i was thinking of iterating the displayed list, and spinning up a thread-pool worker thread for each "question" in the list, perform a query against the database to get the list of authors, then aggregating the list in memory. This would mean that the list fills first in the (native) application. Then i issue a few thousand individual queries afterwards.

But that would be horribly, horrendously, terribly, slow. Not to mention bug-riddled, since it will be thread work.

Yeah yeah yeah

Adam Mechanic says quite plainly:

Don't concatenate rows into delimited
strings in SQL Server. Do it client
side.

Tell me how, and i'll do it.

/cry

Can anyone think of a better solution, that is as fast (say...within an order of magnitude) than my original "TOP 1 plus ellipses" solution?

For example, is there a way to return a results set, where reach row has an associated results set? So for each "master" row, i could get at a "detail" results set that contains the list.

Code for best answer

Cade's link to Adam Machanic's solution i like the best. A user-defined function, that seems to operate via magic:

CREATE FUNCTION dbo.ConcatAuthors(@QuestionID int)
RETURNS VARCHAR(8000)
AS
BEGIN
    DECLARE @Output VARCHAR(8000)
    SET @Output = ''

    SELECT @Output =    CASE @Output 
                WHEN '' THEN AuthorName 
                ELSE @Output + ', ' + AuthorName 
                END
    FROM  (
        SELECT QuestionID, AuthorName, QuestionDate AS AuthorDate FROM Questions
        UNION
        SELECT QuestionID, EditorName, EditDate FROM QuestionEdits
    ) dt
    WHERE dt.QuestionID = @QuestionID
    ORDER BY AuthorDate

    RETURN @Output
END

With a T-SQL usage of:

SELECT QuestionID, QuestionTitle, UpVotes, DownVotes, dbo.ConcatAuthors(AuthorID)
FROM Questions

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

爱人如己 2024-08-31 10:21:55

看看这些文章:

http://dataeducation.com/ rowset-string-concatenation-which-method-is-best/

http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/ (请参阅响应中 Phil Factor 的交叉连接解决方​​案 - 该解决方案将在 SQL Server 2000 中工作)

显然,在 SQL Server 2005 中,FOR XML 技巧是最简单、最灵活且通常性能最高的。

至于为每行返回一个行集,如果由于某种原因您仍然想这样做,您可以在存储过程中执行此操作,但客户端将需要消耗第一个行集中的所有行,然后转到下一个行集并将其与第一个行集中的第一行关联起来,等等。您的 SP 需要在作为第一个行集返回的同一集上打开一个游标,并按顺序运行多个选择以生成所有子行集。这是我已经完成的一项技术,但仅限于实际需要所有数据的情况(例如,在完全填充的树视图中)。

不管人们怎么说,在客户端执行此操作通常会浪费大量带宽,因为返回所有行并在客户端执行循环和中断意味着在开始时会传输大量相同的列。每行只是为了获取行末尾的变化列。

无论您在何处执行此操作,都应该根据您的用例做出明智的决定

Have a look at these articles:

http://dataeducation.com/rowset-string-concatenation-which-method-is-best/

http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/ (See Phil Factor's cross join solution in the responses - which will work in SQL Server 2000)

Obviously in SQL Server 2005, the FOR XML trick is easiest, most flexible and generally most performant.

As far as returning a rowset for each row, if you still want to do that for some reason, you can do that in a stored procedure, but the client will need to consume all the rows in the first rowset and then go to the next rowset and associate it with the first row in the first rowset, etc. Your SP would need to open a cursor on the same set it returned as the first rowset and run multiple selects in sequence to generate all the child rowsets. It's a technique I've done, but only where ALL the data actually was needed (for instance, in a fully-populated tree view).

And regardless of what people say, doing it client-side is often a very big waste of bandwidth, because returning all the rows and doing the looping and breaking in the client side means that huge number of identical columns are being transferred at the start of each row just to get the changing column at the end of the row.

Wherever you do it, it should be an informed decision based on your use case.

永言不败 2024-08-31 10:21:55

我尝试了 3 种方法来解决此问题,即此处发布的一种方法:activex 脚本和 UDF 函数。

对我来说最有效的脚本(速度方面)是奇怪的 Axtive-X 脚本,它运行多个查询来获取要连接的附加数据。

UDF 平均需要 22 分钟来转换,子查询方法(在此处发布)大约需要 5 分钟,而 activeX 脚本需要 4 分 30 分钟,这让我很恼火,因为这是我希望放弃的脚本。我得看看是否可以在其他地方提高效率。

我认为额外的 30 秒被用于存储数据的 tempdb 使用,因为我的脚本需要 order by。

应该指出的是,我正在连接大量的文本数据。

I tried 3 approaches to this solution, the one posted here, activex scripting and UDF functions.

The most effective script (speed-wise) for me was bizzarely Axtive-X script running multiple queries to get the additioanl data to concat.

UDF took an average of 22 minutes to transform, the Subquery method (posted here) took around 5m and the activeX script took 4m30, much to my annoyance since this was the script I was hoping to ditch. I'll have to see if I can iron out a few more efficiencies elsewhere.

I think the extra 30s is used by the tempdb being used to store the data since my script requires an order by.

It should be noted that I am concatanating huge quantities of textual data.

难得心□动 2024-08-31 10:21:55

您还可以查看此脚本。这基本上就是 Cade Roux 在他的帖子中提到的交叉连接方法。

上面的方法看起来非常干净:您必须首先创建一个视图,然后根据视图中的值创建一个语句。您可以在代码中动态构建第二个 sql 语句,因此它应该可以直接使用。

You can also take a look to this script. It's basically the cross join approach that Cade Roux also mentioned in his post.

The above approach looks very clean: you have to do a view first and secondly create a statement based on the values in the view. The second sql statement you can build dynamically in your code, so it should be straight forward to use.

若水般的淡然安静女子 2024-08-31 10:21:55

我不确定这是否适用于 SQL Server 2000,但您可以尝试一下:

--combine parent and child, children are CSV onto parent row
CREATE TABLE #TableA (RowID int, Value1 varchar(5), Value2 varchar(5))
INSERT INTO #TableA VALUES (1,'aaaaa','A')
INSERT INTO #TableA VALUES (2,'bbbbb','B')
INSERT INTO #TableA VALUES (3,'ccccc','C')

CREATE TABLE #TableB (RowID int, TypeOf varchar(10))
INSERT INTO #TableB VALUES (1,'wood')
INSERT INTO #TableB VALUES (2,'wood')
INSERT INTO #TableB VALUES (2,'steel')
INSERT INTO #TableB VALUES (2,'rock')
INSERT INTO #TableB VALUES (3,'plastic')
INSERT INTO #TableB VALUES (3,'paper')


SELECT
    a.*,dt.CombinedValue
    FROM #TableA        a
        LEFT OUTER JOIN (SELECT
                             c1.RowID
                                 ,STUFF(
                                          (SELECT
                                               ', ' + TypeOf
                                               FROM (SELECT
                                                         a.RowID,a.Value1,a.Value2,b.TypeOf
                                                         FROM #TableA                 a
                                                             LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                                    ) c2
                                               WHERE c2.rowid=c1.rowid
                                               ORDER BY c1.RowID, TypeOf
                                               FOR XML PATH('') 
                                          )
                                          ,1,2, ''
                                       ) AS CombinedValue
                             FROM (SELECT
                                       a.RowID,a.Value1,a.Value2,b.TypeOf
                                       FROM #TableA                 a
                                           LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                  ) c1
                             GROUP BY RowID
                        ) dt ON a.RowID=dt.RowID

OUTPUT from SQL Server 2005:

RowID       Value1 Value2 CombinedValue
----------- ------ ------ ------------------
1           aaaaa  A      wood
2           bbbbb  B      rock, steel, wood
3           ccccc  C      paper, plastic

(3 row(s) affected)

EDIT query that Replaces FOR XML PATH with FOR XML RAW, so this should work on SQL服务器 2000

SELECT
    a.*,dt.CombinedValue
    FROM #TableA        a
        LEFT OUTER JOIN (SELECT
                             c1.RowID
                                 ,STUFF(REPLACE(REPLACE(
                                          (SELECT 
                                               ', ' + TypeOf as value
                                               FROM (SELECT
                                                         a.RowID,a.Value1,a.Value2,b.TypeOf
                                                         FROM #TableA                 a
                                                             LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                                    ) c2
                                               WHERE c2.rowid=c1.rowid
                                               ORDER BY c1.RowID, TypeOf
                                               FOR XML RAW
                                          )
                                         ,'<row value="',''),'"/>','')
                                   , 1, 2, '') AS CombinedValue
                             FROM (SELECT
                                       a.RowID,a.Value1,a.Value2,b.TypeOf
                                       FROM #TableA                 a
                                           LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                  ) c1
                             GROUP BY RowID
                        ) dt ON a.RowID=dt.RowID

输出,与原始查询相同

I'm not sure if this works in SQL Server 2000, but you can try it:

--combine parent and child, children are CSV onto parent row
CREATE TABLE #TableA (RowID int, Value1 varchar(5), Value2 varchar(5))
INSERT INTO #TableA VALUES (1,'aaaaa','A')
INSERT INTO #TableA VALUES (2,'bbbbb','B')
INSERT INTO #TableA VALUES (3,'ccccc','C')

CREATE TABLE #TableB (RowID int, TypeOf varchar(10))
INSERT INTO #TableB VALUES (1,'wood')
INSERT INTO #TableB VALUES (2,'wood')
INSERT INTO #TableB VALUES (2,'steel')
INSERT INTO #TableB VALUES (2,'rock')
INSERT INTO #TableB VALUES (3,'plastic')
INSERT INTO #TableB VALUES (3,'paper')


SELECT
    a.*,dt.CombinedValue
    FROM #TableA        a
        LEFT OUTER JOIN (SELECT
                             c1.RowID
                                 ,STUFF(
                                          (SELECT
                                               ', ' + TypeOf
                                               FROM (SELECT
                                                         a.RowID,a.Value1,a.Value2,b.TypeOf
                                                         FROM #TableA                 a
                                                             LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                                    ) c2
                                               WHERE c2.rowid=c1.rowid
                                               ORDER BY c1.RowID, TypeOf
                                               FOR XML PATH('') 
                                          )
                                          ,1,2, ''
                                       ) AS CombinedValue
                             FROM (SELECT
                                       a.RowID,a.Value1,a.Value2,b.TypeOf
                                       FROM #TableA                 a
                                           LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                  ) c1
                             GROUP BY RowID
                        ) dt ON a.RowID=dt.RowID

OUTPUT from SQL Server 2005:

RowID       Value1 Value2 CombinedValue
----------- ------ ------ ------------------
1           aaaaa  A      wood
2           bbbbb  B      rock, steel, wood
3           ccccc  C      paper, plastic

(3 row(s) affected)

EDIT query that replaces FOR XML PATH with FOR XML RAW, so this should work on SQL Server 2000

SELECT
    a.*,dt.CombinedValue
    FROM #TableA        a
        LEFT OUTER JOIN (SELECT
                             c1.RowID
                                 ,STUFF(REPLACE(REPLACE(
                                          (SELECT 
                                               ', ' + TypeOf as value
                                               FROM (SELECT
                                                         a.RowID,a.Value1,a.Value2,b.TypeOf
                                                         FROM #TableA                 a
                                                             LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                                    ) c2
                                               WHERE c2.rowid=c1.rowid
                                               ORDER BY c1.RowID, TypeOf
                                               FOR XML RAW
                                          )
                                         ,'<row value="',''),'"/>','')
                                   , 1, 2, '') AS CombinedValue
                             FROM (SELECT
                                       a.RowID,a.Value1,a.Value2,b.TypeOf
                                       FROM #TableA                 a
                                           LEFT OUTER JOIN #TableB  b ON a.RowID=b.RowID
                                  ) c1
                             GROUP BY RowID
                        ) dt ON a.RowID=dt.RowID

OUTPUT, same as original query

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文