选择子查询的 COUNT(*) 次而不运行两次
我有一个返回结果集的过程,该结果集受页码和其他一些内容的限制。 作为 OUTPUT 参数,我需要根据除页码之外的参数返回所选行的总数。 所以我有类似的东西:
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position
FROM Items
WHERE Row2 = @Row2)
SELECT Id, Row1, Row2
FROM SelectedItems
WHERE Position BETWEEN @From AND @To
然后我需要将 OUTPUT 参数设置为内部查询中的行数。 我可以复制查询并对其进行计数,但此查询可能会返回数千行(并且将来会更多),因此我正在寻找具有良好性能的方法。 我在考虑表变量,这是个好主意吗? 或者还有其他建议吗?
更具体地说,它是 Microsoft SQL Server 2008。
谢谢,Jan
I've got a procedure to return a result set which is limited by page number and some other stuff. As an OUTPUT parameter I need to return a total amount of selected rows according to the parameters except the page number. So I have something like that:
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position
FROM Items
WHERE Row2 = @Row2)
SELECT Id, Row1, Row2
FROM SelectedItems
WHERE Position BETWEEN @From AND @To
And then I need to set the OUTPUT parameter to the number of rows in the innerquery. I can just copy the query and count it, but this query could returns thousands of rows (and will be more in the future), so I am looking for method to do that with a good performance. I was thinking about table variables, is it a good idea? Or any other suggestions?
To be more specific, it's the Microsoft SQL Server 2008.
Thank you, Jan
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
您可以使用 COUNT(*) 将总行数作为主查询中的单独列进行计数。 像这样:
这将返回结果集中的计数,而不是输出参数中的计数,但这应该符合您的要求。 否则,与临时表结合使用:
您会发现仅对分页结果使用临时表不会使用太多内存(当然取决于您的页面大小),并且您只能将其保留很短的一段时间。 从临时表中选择完整结果集并选择 TotalRows 只会花费一点点时间。
这比运行完全独立的查询要快得多,在我的测试中(重复WITH)使执行时间加倍。
You can count the total rows as a separate column in your main query using COUNT(*). Like this:
This will return the count in your result set rather than in a output parameter, but that should fit your requirements. Otherwise, combine with a temp table:
You will find using a temp table for just your paged result will not use much memory (depending on your page size of course) and you're only keeping it live for a short period of time. Selecting the full result set from the temp table and selecting the TotalRows will only take a tiny bit longer.
This will be much faster than running a totally separate query, which in my test (repeating the WITH) doubled the execution time.
我认为你应该在单独的查询中执行此操作。 虽然这两个查询可能看起来几乎相同,但查询优化器处理它们的方式会有很大不同。
理论上,SQL Server 甚至可能无法遍历子查询中的所有行来对其进行计数。
I think you should do it in a separate query. While those two queries might look pretty much the same, but the way query optimizer deals with them would differ pretty significantly.
Theoretically, SQL Server might not even go through all the rows in the subquery to be able to count it.
我现在无法访问我的代码库,但我相信您可以使用 COUNT() OVER (或类似的命令)来返回总行数作为子查询的一部分。 然后您可以将其作为最终结果集的一部分返回。 它在每一行中都会重复,但在我看来,对于使用分页的应用程序来说,这只是一个较小的性能影响,而且最终结果应该有限。
几个小时后我将发布确切的代码。
编辑:这是我用来生成计数的行。 最后,我们的开发人员想要一个单独的方法来自行获取计数,因此现在我在同一存储过程中的两个位置维护搜索条件。
将其添加到您的 CTE,然后您可以选择 TotalCount,它将成为每行中的一列。
I don't have access to my code base right now, but I believe that you can use COUNT() OVER (or a similar command) to return the total number of rows as part of the subquery. You can then return that as part of the final result set. It gets duplicated in every row, but that's a minor performance hit in my opinion for an application that is using paging and should have limited final results anyway.
In a couple hours I'll post the exact code.
EDIT: Here's the line that I've used to generate the count. In the end our developers wanted a separate method to get the count by itself, so now I'm maintaining the search criteria in two places within the same stored procedure.
Add that to your CTE and then you can select the TotalCount and it will be a column in each of your rows.
您必须在不限制范围的情况下运行整个查询至少一次才能获得完整的行数。 由于无论如何您都将执行此操作,因此您应该选择 @@RowCount 以便输出找到的总行数,而不是在每行中使用冗余的 count(*) 列使数据读取器超载。
1. 第一次运行 NEW 查询时:
2. 仅读取前 X 行
上述查询避免了 SqlDataReader 中出现冗余 COUNT(*) 列,否则每次调用 SqlDataReader.Read() 时都会发送该列。 由于您是第一次运行查询...而不是选择范围,只需读取前 X 行。这正是您想要的...完整结果计数、前 X 条记录,以及结果集的高效流式传输,无需冗余计数列。
3. 对于后续运行 SAME 查询以获取结果的子集
在任何情况下,
@@rowcount
是访问第一次运行查询时的计数而不限制结果的最直接方法设置(无论如何你都会想要前 X 个结果),而不运行单独的 count() 查询,不使用临时表,也不包含冗余的 count() 列。You MUST run the entire query, without limiting the range, at least once in order to get the full row count. Since you're going to do this anyway, you should select @@RowCount in order to output the total rows found, rather than overloading your data reader with a redundant count(*) column in every row.
1. When running NEW query for first time:
2. Only READ the first X rows
The above query avoids flooding your SqlDataReader with a redundant COUNT(*) column, that would otherwise be sent for every call to SqlDataReader.Read(). Since you are running the query for the first time... instead of selecting a range, just READ only the first X rows. This gives you exactly what you want... the full result count, the first X records, and efficient streaming of the result set without the redundant count column.
3. For subsequent runs of the SAME query to get a subset of the results
In any case,
@@rowcount
is the most direct way to access the count on the first run of the query without limiting the result set (ur gonna want first X results anyway), without running a separate count() query, without using a temp table, and without including a redundant count() column.您不能将输出变量设置为@@RowCount 吗? 这将获取受最后执行的语句影响的行:
这应该为您提供所需的内容,并且不涉及再次运行查询。
Couldn't you just set the output variable to @@RowCount? This will get the rows affected by the last executed statement:
This should give you what you need, and doesn't involve running the query again.