SELECT COUNT(*) 与使用显式游标获取两次

发布于 2024-07-08 17:02:36 字数 781 浏览 6 评论 0原文

我读过一本书,书名是“Oracle PL SQL 编程”(第二版),作者是 Steven Feuerstein & 。 比尔·普比尔. 在第 99 页上,有一点建议

不要从表中“SELECT COUNT(*)”,除非您确实需要知道“点击”的总数。 如果您只需要知道是否存在多个匹配项,只需使用显式游标获取两次即可。

有人可以通过提供示例向我详细解释这一点吗? 谢谢。

更新:

作为 Steven Feuerstein & Bill Pribyl 建议我们不要使用 SELECT COUNT() 来检查表中的记录是否存在,任何人都可以帮我编辑下面的代码,以避免使用显式游标来避免使用 SELECT COUNT(*) 吗? 这段代码是写在Oracle存储过程中的。

我有一个表 emp(emp_id, emp_name, ...),因此要检查提供的员工 ID 是否正确:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
    ...

    SELECT COUNT(*) INTO v_rows
    FROM emp
    WHERE emp_id = emp_id_in;

    IF v_rows > 0 THEN
        /* do sth */
    END;

    /* more statements */
    ...

END do_sth;

I have read a book whose title is "Oracle PL SQL Programming" (2nd ed.) by Steven Feuerstein & Bill Pribyl. On page 99, there is a point suggested that

Do not "SELECT COUNT(*)" from a table unless you really need to know the total number of "hits." If you only need to know whether there is more than one match, simply fetch twice with an explicit cursor.

Could you anyone explain this point more to me by providing example? Thank you.

Update:

As Steven Feuerstein & Bill Pribyl recommends us not to use SELECT COUNT() to check whether records in a table exist or not, could anyone help me edit the code below in order to avoid using SELECT COUNT(*) by using explicit cursor instead? This code is written in the Oracle stored procedure.

I have a table emp(emp_id, emp_name, ...), so to check the provided employee ID corret or not:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
    ...

    SELECT COUNT(*) INTO v_rows
    FROM emp
    WHERE emp_id = emp_id_in;

    IF v_rows > 0 THEN
        /* do sth */
    END;

    /* more statements */
    ...

END do_sth;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

伪心 2024-07-15 17:02:36

开发人员在 PL/SQL 程序中从表中执行 select COUNT(*) 的原因有很多:

1) 他们确实需要知道表中有多少行。

在这种情况下,别无选择:选择 COUNT(*) 并等待结果。 这在许多表上会相当快,但在大表上可能需要一段时间。

2)他们只需要知道一行是否存在。

这并不能保证计算表中的所有行。 可以采用多种技术:

a) 显式游标方法:

DECLARE
   CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
   v VARCHAR2(1);
BEGIN
   OPEN c;
   FETCH c INTO v;
   IF c%FOUND THEN
      -- A row exists
      ...
   ELSE
      -- No row exists
      ...
   END IF;
END;

b) SELECT INTO 方法

DECLARE
   v VARCHAR2(1);
BEGIN
   SELECT '1' INTO v FROM mytable 
   WHERE ... 
   AND ROWNUM=1; -- Stop fetching if 1 found
   -- At least one row exists
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      -- No row exists
END;

c) 使用 ROWNUM 方法的 SELECT COUNT(*)

DECLARE
   cnt INTEGER;
BEGIN
   SELECT COUNT(*) INTO cnt FROM mytable 
   WHERE ... 
   AND ROWNUM=1; -- Stop counting if 1 found
   IF cnt = 0 THEN
      -- No row found
   ELSE
      -- Row found
   END IF;
END;

3) 他们需要知道是否存在多于 1 行。

(2) 工作技术的变体:

a) 显式游标方法:

DECLARE
   CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
   v VARCHAR2(1);
BEGIN
   OPEN c;
   FETCH c INTO v;
   FETCH c INTO v;
   IF c%FOUND THEN
      -- 2 or more rows exists
      ...
   ELSE
      -- 1 or 0 rows exist
      ...
   END IF;
END;

b) SELECT INTO 方法

DECLARE
   v VARCHAR2(1);
BEGIN
   SELECT '1' INTO v FROM mytable 
   WHERE ... ;
   -- Exactly 1 row exists
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      -- No row exists
   WHEN TOO_MANY_ROWS THEN
      -- More than 1 row exists
END;

c) SELECT COUNT(*) 和 ROWNUM 方法

DECLARE
   cnt INTEGER;
BEGIN
   SELECT COUNT(*) INTO cnt FROM mytable 
   WHERE ... 
   AND ROWNUM <= 2; -- Stop counting if 2 found
   IF cnt = 0 THEN
      -- No row found
   IF cnt = 1 THEN
      -- 1 row found
   ELSE
      -- More than 1 row found
   END IF;
END;

您使用哪种方法很大程度上取决于偏好(以及一些宗教狂热!) Steven Feuerstein始终偏爱显式游标而不是隐式游标(SELECT INTO 和游标 FOR 循环); Tom Kyte 赞成隐式游标(我同意他的观点)。

重要的一点是,在不限制 ROWCOUNT 的情况下选择 COUNT(*) 的成本很高,因此只有在真正需要计数时才应该这样做。

至于你关于如何用显式游标重写它的补充问题:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
    ...

    SELECT COUNT(*) INTO v_rows
    FROM emp
    WHERE emp_id = emp_id_in;

    IF v_rows > 0 THEN
        /* do sth */
    END;

    /* more statements */
    ...

END do_sth;

那就是:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
    CURSOR c IS SELECT 1
                FROM emp
                WHERE emp_id = emp_id_in;
    v_dummy INTEGER;
BEGIN
    ...

    OPEN c;    
    FETCH c INTO v_dummy;
    IF c%FOUND > 0 THEN
        /* do sth */
    END;
    CLOSE c;

    /* more statements */
    ...

END do_sth;

但实际上,在你的例子中它没有更好或更坏,因为你正在选择主键并且Oracle足够聪明知道它只需要获取一次。

There are a number of reasons why developers might perform select COUNT(*) from a table in a PL/SQL program:

1) They genuinely need to know how many rows there are in the table.

In this case there is no choice: select COUNT(*) and wait for the result. This will be pretty fast on many tables, but could take a while on a big table.

2) They just need to know whether a row exists or not.

This doesn't warrant counting all the rows in the table. A number of techniques are possible:

a) Explicit cursor method:

DECLARE
   CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
   v VARCHAR2(1);
BEGIN
   OPEN c;
   FETCH c INTO v;
   IF c%FOUND THEN
      -- A row exists
      ...
   ELSE
      -- No row exists
      ...
   END IF;
END;

b) SELECT INTO method

DECLARE
   v VARCHAR2(1);
BEGIN
   SELECT '1' INTO v FROM mytable 
   WHERE ... 
   AND ROWNUM=1; -- Stop fetching if 1 found
   -- At least one row exists
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      -- No row exists
END;

c) SELECT COUNT(*) with ROWNUM method

DECLARE
   cnt INTEGER;
BEGIN
   SELECT COUNT(*) INTO cnt FROM mytable 
   WHERE ... 
   AND ROWNUM=1; -- Stop counting if 1 found
   IF cnt = 0 THEN
      -- No row found
   ELSE
      -- Row found
   END IF;
END;

3) They need to know whether more than 1 row exists.

Variations on the techniques for (2) work:

a) Explicit cursor method:

DECLARE
   CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
   v VARCHAR2(1);
BEGIN
   OPEN c;
   FETCH c INTO v;
   FETCH c INTO v;
   IF c%FOUND THEN
      -- 2 or more rows exists
      ...
   ELSE
      -- 1 or 0 rows exist
      ...
   END IF;
END;

b) SELECT INTO method

DECLARE
   v VARCHAR2(1);
BEGIN
   SELECT '1' INTO v FROM mytable 
   WHERE ... ;
   -- Exactly 1 row exists
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      -- No row exists
   WHEN TOO_MANY_ROWS THEN
      -- More than 1 row exists
END;

c) SELECT COUNT(*) with ROWNUM method

DECLARE
   cnt INTEGER;
BEGIN
   SELECT COUNT(*) INTO cnt FROM mytable 
   WHERE ... 
   AND ROWNUM <= 2; -- Stop counting if 2 found
   IF cnt = 0 THEN
      -- No row found
   IF cnt = 1 THEN
      -- 1 row found
   ELSE
      -- More than 1 row found
   END IF;
END;

Which method you use is largely a matter of preference (and some religious zealotry!) Steven Feuerstein has always favoured explicit cursors over implicit (SELECT INTO and cursor FOR loops); Tom Kyte favours implicit cursors (and I agree with him).

The important point is that to select COUNT(*) without restricting the ROWCOUNT is expensive and should therefore only be done when a count is trully needed.

As for your supplementary question about how to re-write this with an explicit cursor:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
    ...

    SELECT COUNT(*) INTO v_rows
    FROM emp
    WHERE emp_id = emp_id_in;

    IF v_rows > 0 THEN
        /* do sth */
    END;

    /* more statements */
    ...

END do_sth;

That would be:

CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
    CURSOR c IS SELECT 1
                FROM emp
                WHERE emp_id = emp_id_in;
    v_dummy INTEGER;
BEGIN
    ...

    OPEN c;    
    FETCH c INTO v_dummy;
    IF c%FOUND > 0 THEN
        /* do sth */
    END;
    CLOSE c;

    /* more statements */
    ...

END do_sth;

But really, in your example it is no better or worse, since you are selecting the primary key and Oracle is clever enough to know that it only needs to fetch once.

负佳期 2024-07-15 17:02:36

如果您只对两个感兴趣,请尝试

SELECT 'THERE ARE AT LEAST TWO ROWS IN THE TABLE'
FROM DUAL
WHERE 2 =
(
    SELECT COUNT(*)
    FROM TABLE
    WHERE ROWNUM < 3
)

它比执行手动光标方法需要更少的代码,
而且可能会更快。

rownum 技巧意味着一旦有两行就停止获取行。

如果您不对 count(*) 设置某种限制,则可能需要很长时间才能完成,具体取决于您拥有的行数。 在这种情况下,使用游标循环手动从表中读取 2 行会更快。

If two is all you are interested in, try

SELECT 'THERE ARE AT LEAST TWO ROWS IN THE TABLE'
FROM DUAL
WHERE 2 =
(
    SELECT COUNT(*)
    FROM TABLE
    WHERE ROWNUM < 3
)

It will take less code than doing the manual cursor method,
and it is likely to be faster.

The rownum trick means to stop fetching rows once it has two of them.

If you don't put some sort of limit on the count(*), it could take a long while to finish, depending on the number of rows you have. In that case, using a cursor loop, to read 2 rows from the table manually, would be faster.

我很坚强 2024-07-15 17:02:36

这来自于编写类似于以下代码的程序员(这是伪代码!)。

您想要检查客户是否有多个订单:

if ((select count(*) from orders where customerid = :customerid) > 1)
{
    ....
}

这是一种非常低效的做事方式。 正如 Mark Brady 所说,如果你想知道一个罐子里是否有硬币,你会数一下里面的所有硬币吗?罐子,或者只是确保有 1 个(或在您的示例中为 2 个)?

这可以更好地写为:

if ((select 1 from (select 1 from orders where customerid = :customerid) where rownum = 2) == 1)
{
    ....
}

这可以防止“计算所有硬币”的困境,因为 Oracle 将获取 2 行,然后完成。 前面的示例将导致 Oracle 扫描(索引或表)所有行,然后完成。

This comes from programmers writing code similar to the following (this is psuedo code!).

You want to check to see if the customer has more than one order:

if ((select count(*) from orders where customerid = :customerid) > 1)
{
    ....
}

That is a terribly inefficient way to do things. As Mark Brady would say, if you want to know if a jar contains pennies, would you count all the pennies in the jar, or just make sure there is 1 (or 2 in your example)?

This could be better written as:

if ((select 1 from (select 1 from orders where customerid = :customerid) where rownum = 2) == 1)
{
    ....
}

This prevents the "counting all of the coins" dilemma since Oracle will fetch 2 rows, then finish. The previous example would cause oracle to scan (an index or table) for ALL rows, then finish.

撩心不撩汉 2024-07-15 17:02:36

他的意思是打开一个游标,不仅获取第一条记录,还获取第二条记录,然后你就会知道有多个记录。

因为我似乎从来不需要知道 SELECT COUNT(*)>= 2,所以我不知道为什么这在任何 SQL 变体中都是一个有用的习惯用法。 要么没有记录,要么至少有一个,当然,但不是两个或更多。 不管怎样,总有EXISTS

事实上,Oracle 的优化器似乎相当糟糕...... - 我会质疑该技术的相关性。

针对 TheSoftwareJedi 的评论:

WITH CustomersWith2OrMoreOrders AS (
    SELECT CustomerID
    FROM Orders
    GROUP BY CustomerID
    HAVING COUNT(*) >= 2
)
SELECT Customer.*
FROM Customer
INNER JOIN CustomersWith2OrMoreOrders
    ON Customer.CustomerID = CustomersWith2OrMoreOrders.CustomerID

适当的索引,即使在 SQL Server 中进行像这样的全宇宙查询,我也从未遇到过性能问题。 然而,我经常在这里和其他网站上遇到有关 Oracle 优化器问题的评论。

我自己使用 Oracle 的经验没有很好

OP 的评论似乎是说优化器不能很好地处理表中的完整 COUNT(*) 。 即:(

IF EXISTS (SELECT COUNT(*) FROM table_name HAVING COUNT(*) >= 2)
BEGIN
END

当主键存在时,可以简化为简单的索引扫描 - 在极端优化的情况下,可以简单地查询 sysindexes.rowcnt 中的索引元数据 - 查找条目数 - 全部无需游标)是通常避免的,有利于:

DECLARE CURSOR c IS SELECT something FROM table_name;
BEGIN
    OPEN c
    FETCH c INTO etc. x 2 and count rows and handle exceptions
END;

IF rc >= 2 THEN BEGIN
END

对我来说,这会导致代码可读性较差,可移植性较差,并且可维护性较差。

He means open a cursor and fetch not only the first record but the second, and then you will know there is more than one.

Since I never seem to need to know that SELECT COUNT(*) is >= 2, I have no idea why this is a useful idiom in any SQL variant. Either no records or at least one, sure, but not two or more. And anyway, there's always EXISTS.

That, and the fact that Oracle's optimizer seems to be pretty poor... - I would question the relevance of the technique.

To address TheSoftwareJedi's comments:

WITH CustomersWith2OrMoreOrders AS (
    SELECT CustomerID
    FROM Orders
    GROUP BY CustomerID
    HAVING COUNT(*) >= 2
)
SELECT Customer.*
FROM Customer
INNER JOIN CustomersWith2OrMoreOrders
    ON Customer.CustomerID = CustomersWith2OrMoreOrders.CustomerID

Appropriately indexed, I've never had performance problems even with whole universe queries like this in SQL Server. However, I have consistently run into comments about Oracle optimizer problems here and on other sites.

My own experience with Oracle has not been good.

The comment from the OP appears to be saying that full COUNT(*) from tables are not well handled by the optimizer. i.e.:

IF EXISTS (SELECT COUNT(*) FROM table_name HAVING COUNT(*) >= 2)
BEGIN
END

(which, when a primary key exists, can be reduced to a simple index scan - in a case of extreme optimization, one can simply query the index metadata in sysindexes.rowcnt - to find the number of entries - all without a cursor) is to be generally avoided in favor of:

DECLARE CURSOR c IS SELECT something FROM table_name;
BEGIN
    OPEN c
    FETCH c INTO etc. x 2 and count rows and handle exceptions
END;

IF rc >= 2 THEN BEGIN
END

That, to me would result in less readable, less portable, and less maintainable code.

百善笑为先 2024-07-15 17:02:36

在你太认真地对待 Steven Feuerstein 的建议之前,先做一些基准测试。 在您的情况下, count(*) 是否明显比显式光标慢? 不? 然后更好地使用允许简单、可读代码的构造。 在大多数情况下,这将是“select count(*) into v_cnt ... if v_cnt>0 then ...”

PL/SQL 允许非常可读的程序。 不要仅仅为了纳米优化而浪费时间。

Before you take Steven Feuerstein's suggestions too serious, just do a little benchmark. Is count(*) noticeably slower than the explicit cursor in your case? No? Then better use the construct that allows for simple, readable code. Which, in most cases, would be "select count(*) into v_cnt ... if v_cnt>0 then ..."

PL/SQL allows for very readable programs. Don't waste that just to nano-optimize.

|煩躁 2024-07-15 17:02:36

根据数据库的不同,可能有一个 sys 表存储近似计数并可以在恒定时间内查询。 如果您想知道表是否有 20 行、20,000 行或 20,000,000 行,此功能很有用。

Depending on the DB, there may be a sys table which stores an approximate count and can be queried in constant time. Useful if you want to know whether the table has 20 rows or 20,000 or 20,000,000.

偏闹i 2024-07-15 17:02:36

SQL Server:

if 2 = (
    select count(*) from (
        select top 2 * from (
            select T = 1 union
            select T = 2 union
            select T = 3 ) t) t)
    print 'At least two'

另外,永远不要使用游标。 如果你认为你真的需要它们,那就用铲子敲打自己,直到你改变主意。 让远古的遗迹继续成为远古的遗迹。

SQL Server:

if 2 = (
    select count(*) from (
        select top 2 * from (
            select T = 1 union
            select T = 2 union
            select T = 3 ) t) t)
    print 'At least two'

Also, don't ever use cursors. If you think you really really need them, beat yourself with a shovel until you change your mind. Let relics from an ancient past remain relics from an ancient past.

眼藏柔 2024-07-15 17:02:36

如果你想获取表中的行数,请不要使用count(*),我建议使用count(0),0是主键列的列索引。

If you want to get number of rows in a table, please don't used count(*), I would suggest count(0) that 0 is the column index of your primary key column.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文