Need a row count after a SELECT statement: what is the optimal SQL approach?

Posted on 2024-07-08 12:43:35

I'm trying to select a column from a single table (no joins) and I need the count of the number of rows, ideally before I begin retrieving the rows. I have come up with two approaches that provide the information I need.

Approach 1:

SELECT COUNT( my_table.my_col ) AS row_count
  FROM my_table
 WHERE my_table.foo = 'bar'

Then

SELECT my_table.my_col
  FROM my_table
 WHERE my_table.foo = 'bar'

Or Approach 2:

SELECT my_table.my_col, ( SELECT COUNT ( my_table.my_col )
                            FROM my_table
                           WHERE my_table.foo = 'bar' ) AS row_count
  FROM my_table
 WHERE my_table.foo = 'bar'

I am doing this because my SQL driver (SQL Native Client 9.0) does not allow me to use SQLRowCount on a SELECT statement but I need to know the number of rows in my result in order to allocate an array before assigning information to it. The use of a dynamically allocated container is, unfortunately, not an option in this area of my program.

I am concerned that the following scenario might occur:

  • SELECT for count occurs
  • Another instruction occurs, adding or removing a row
  • SELECT for data occurs and suddenly the array is the wrong size.
    - In the worst case, this will attempt to write data beyond the array's limits and crash my program.

Does Approach 2 prevent this issue?

Also, will one of the two approaches be faster? If so, which?

Finally, is there a better approach that I should consider (perhaps a way to instruct the driver to return the number of rows in a SELECT result using SQLRowCount?)

For those that asked, I am using Native C++ with the aforementioned SQL driver (provided by Microsoft.)

Comments (10)

做个ˇ局外人 2024-07-15 12:43:35

If you're using SQL Server, after your query you can select the @@ROWCOUNT function (or, if your result set might have more than 2 billion rows, use the ROWCOUNT_BIG() function). This returns the number of rows selected by the previous statement, or the number of rows affected by an INSERT/UPDATE/DELETE statement.

SELECT my_table.my_col
  FROM my_table
 WHERE my_table.foo = 'bar'

SELECT @@Rowcount

Or, if you want the row count included in the result set, similar to Approach #2, you can use the OVER clause.

SELECT my_table.my_col,
    count(*) OVER(PARTITION BY my_table.foo) AS 'Count'
  FROM my_table
 WHERE my_table.foo = 'bar'

Using the OVER clause will have much better performance than using a subquery to get the row count. Using @@ROWCOUNT will have the best performance because there won't be any query cost for the SELECT @@ROWCOUNT statement.

Update in response to comment: The example I gave returns the number of rows in the partition, defined in this case by "PARTITION BY my_table.foo". The value of the column in each row is the number of rows with the same value of my_table.foo. Since your example query has the clause "WHERE my_table.foo = 'bar'", all rows in the result set have the same value of my_table.foo, so the value in the column is the same for all rows and (in this case) equals the number of rows in the query.

Here is a better/simpler example of how to include a column in each row that holds the total number of rows in the result set. Simply remove the optional PARTITION BY clause.

SELECT my_table.my_col, count(*) OVER() AS 'Count'
  FROM my_table
 WHERE my_table.foo = 'bar'
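
On the client side, a result set in which every row carries the total (as count(*) OVER() produces) lets the array be sized from the first fetched row. Here is a minimal C++ sketch of that idea. The `Row` struct, the `Result` struct, and the in-memory `rows` vector are hypothetical stand-ins for whatever the driver's fetch loop yields; none of them come from SQL Native Client.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical shape of one fetched row: the selected column plus the
// per-row total produced by count(*) OVER().
struct Row {
    std::string my_col;
    std::size_t count;
};

// Fixed-size result buffer, allocated exactly once.
struct Result {
    std::unique_ptr<std::string[]> values;
    std::size_t size = 0;
};

// Sizes the array from the first row, then copies each row into it.
// `rows` stands in for the driver's fetch loop (an assumption of this sketch).
Result load(const std::vector<Row>& rows) {
    Result out;
    std::size_t i = 0;
    for (const Row& r : rows) {
        if (i == 0) {
            // The total is known as soon as the first row arrives.
            out.size = r.count;
            out.values = std::make_unique<std::string[]>(out.size);
        }
        if (i >= out.size) break;  // never write past the allocation
        out.values[i++] = r.my_col;
    }
    return out;
}
```

Because the count and the data come from the same statement, the array size and the number of rows copied agree without a second query; the bound check is only a safety net.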

乱世争霸 2024-07-15 12:43:35

There are only two ways to be 100% certain that the COUNT(*) and the actual query will give consistent results:

  • Combine the COUNT(*) with the query, as in your Approach 2. I recommend the form you show in your example, not the correlated subquery form shown in the comment from kogus.
  • Use two queries, as in your Approach 1, after starting a transaction in SNAPSHOT or SERIALIZABLE isolation level.

Using one of those isolation levels is important because any other isolation level allows new rows created by other clients to become visible in your current transaction. Read the MSDN documentation on SET TRANSACTION ISOLATION LEVEL for more details.

何以心动 2024-07-15 12:43:35

Approach 2 will always return a count that matches your result set.

I suggest you link the sub-query to your outer query though, to guarantee that the condition on your count matches the condition on the dataset.

SELECT
  mt.my_col,
  (SELECT COUNT(mt2.my_col) FROM my_table mt2 WHERE mt2.foo = mt.foo) AS cnt
FROM my_table mt
WHERE mt.foo = 'bar';

变身佩奇 2024-07-15 12:43:35

If you're concerned that the number of rows meeting the condition may change in the few milliseconds between executing the query and retrieving the results, you could/should execute the queries inside a transaction:

BEGIN TRAN bogus

SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'

SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus

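This returns consistent values as long as the transaction runs under SNAPSHOT or SERIALIZABLE isolation; under the default READ COMMITTED level, another session can still add or remove matching rows between the two statements.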

Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by the last statement and redirect the output of the real query to a temp table or table variable, so you can return everything together without needing a transaction:

DECLARE @dummy INT

SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'

SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table

深巷少女 2024-07-15 12:43:35

Here are some ideas:

  • Go with Approach #1 and resize the array to hold additional results, or use a type that automatically resizes as necessary (you don't mention what language you are using, so I can't be more specific).
  • If your database supports it, you could execute both statements in Approach #1 within a transaction to guarantee the counts are the same both times.
  • I'm not sure what you are doing with the data but if it is possible to process the results without storing all of them first this might be the best method.
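
The first idea can be made safe even with a fixed-size array: size it from the count, stop copying at that bound, and report a mismatch instead of writing past the end. Below is a rough C++ sketch of that guard. `fetch_next` is a hypothetical callback standing in for the real driver fetch call, and `FillStatus` and `fill_bounded` are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

enum class FillStatus { Ok, TooFewRows, TooManyRows };

// Copies rows into `dest` (capacity `expected`) without ever exceeding it.
// `fetch_next` is a hypothetical stand-in for the driver's fetch call: it
// writes the next value into its argument and returns false when no rows remain.
FillStatus fill_bounded(std::string* dest, std::size_t expected,
                        const std::function<bool(std::string&)>& fetch_next,
                        std::size_t& written) {
    written = 0;
    std::string value;
    while (fetch_next(value)) {
        if (written == expected) {
            return FillStatus::TooManyRows;  // a row was added after the count
        }
        dest[written++] = value;
    }
    // Fewer rows than counted means a row was removed after the count.
    return written == expected ? FillStatus::Ok : FillStatus::TooFewRows;
}
```

A caller that gets anything other than `FillStatus::Ok` can rerun the count and the query; the array itself is never overrun.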

无边思念无边月 2024-07-15 12:43:35

If you are really concerned that your row count will change between the select count and the select statement, why not select your rows into a temp table first? That way, you know you will be in sync.

jJeQQOZ5 2024-07-15 12:43:35

Why don't you put your results into a vector? That way you don't have to know the size beforehand.

你列表最软的妹 2024-07-15 12:43:35

You might want to think about a better pattern for dealing with data of this type.

No self-respecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a transaction, which creates problems of its own).

The number of rows won't change - google for ACID and SQL.

云裳 2024-07-15 12:43:35

IF (@@ROWCOUNT > 0)
BEGIN
SELECT my_table.my_col
  FROM my_table
 WHERE my_table.foo = 'bar'
END

£烟消云散 2024-07-15 12:43:35

Adding this because this page is the top Google result for this question.
In SQLite I used this to get the row count.

WITH temptable AS
  (SELECT one,two
   FROM
     (SELECT one, two
      FROM table3
      WHERE dimension=0
      UNION ALL SELECT one, two
      FROM table2
      WHERE dimension=0
      UNION ALL SELECT one, two
      FROM table1
      WHERE dimension=0)
   ORDER BY date DESC)
SELECT *
FROM temptable
LEFT JOIN
  (SELECT count(*)/7 AS cnt,
                        0 AS bonus
   FROM temptable) counter
WHERE 0 = counter.bonus