如何加快 Oracle 中的 row_number 速度?

发布于 2024-07-18 04:44:14 字数 434 浏览 4 评论 0原文

我有一个看起来像这样的 SQL 查询:

SELECT * FROM(
    SELECT
        ...,
        row_number() OVER(ORDER BY ID) rn
    FROM
        ...
) WHERE rn between :start and :end

本质上,ORDER BY 部分减慢了速度。 如果我删除它,EXPLAIN 成本会下降一个数量级(超过 1000 倍)。 我已经尝试过:

SELECT 
    ...
FROM
    ...
WHERE
    rownum between :start and :end

但这并没有给出正确的结果。 有什么简单的方法可以加快速度吗? 或者我需要花更多时间使用 EXPLAIN 工具吗?

I have a SQL query that looks something like this:

SELECT * FROM(
    SELECT
        ...,
        row_number() OVER(ORDER BY ID) rn
    FROM
        ...
) WHERE rn between :start and :end

Essentially, it's the ORDER BY part that's slowing things down. If I were to remove it, the EXPLAIN cost goes down by an order of magnitude (over 1000x). I've tried this:

SELECT 
    ...
FROM
    ...
WHERE
    rownum between :start and :end

But this doesn't give correct results. Is there any easy way to speed this up? Or will I have to spend some more time with the EXPLAIN tool?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

油焖大侠 2024-07-25 04:44:14

ROW_NUMBEROracle 中效率相当低。

有关性能详细信息,请参阅我的博客中的文章:

对于您的特定查询,我建议您将其替换为 ROWNUM 并确保使用索引:

SELECT  *
FROM    (
        SELECT  /*+ INDEX_ASC(t index_on_column) NOPARALLEL_INDEX(t index_on_column) */
                t.*, ROWNUM AS rn
        FROM    table t
        ORDER BY
                column
        )
WHERE rn >= :start
      AND rownum <= :end - :start + 1

此查询将使用 COUNT STOPKEY

另外,请确保您的 不可为空,或者添加WHERE 列IS NOT NULL 条件。

否则索引无法用于检索所有值。

请注意,如果没有子查询,则无法使用 ROWNUM BETWEEN :start and :end

ROWNUM 总是最后分配并最后检查,这样 ROWNUM 总是按顺序排列,没有间隙。

如果使用ROWNUM BETWEEN 10 and 20,则满足所有其他条件的第一行将成为返回的候选行,临时分配ROWNUM = 1并无法通过<的测试代码>ROWNUM 10 和 20 之间。

然后下一行将是候选行,分配有 ROWNUM = 1 并失败,等等,所以最后根本不会返回任何行。

这应该通过将 ROWNUM 放入子查询中来解决。

ROW_NUMBER is quite inefficient in Oracle.

See the article in my blog for performance details:

For your specific query, I'd recommend you to replace it with ROWNUM and make sure that the index is used:

SELECT  *
FROM    (
        SELECT  /*+ INDEX_ASC(t index_on_column) NOPARALLEL_INDEX(t index_on_column) */
                t.*, ROWNUM AS rn
        FROM    table t
        ORDER BY
                column
        )
WHERE rn >= :start
      AND rownum <= :end - :start + 1

This query will use COUNT STOPKEY

Also either make sure you column is not nullable, or add WHERE column IS NOT NULL condition.

Otherwise the index cannot be used to retrieve all values.

Note that you cannot use ROWNUM BETWEEN :start and :end without a subquery.

ROWNUM is always assigned last and checked last, that's way ROWNUM's always come in order without gaps.

If you use ROWNUM BETWEEN 10 and 20, the first row that satisifies all other conditions will become a candidate for returning, temporarily assigned with ROWNUM = 1 and fail the test of ROWNUM BETWEEN 10 AND 20.

Then the next row will be a candidate, assigned with ROWNUM = 1 and fail, etc., so, finally, no rows will be returned at all.

This should be worked around by putting ROWNUM's into the subquery.

吻泪 2024-07-25 04:44:14

对我来说看起来像是一个分页查询。

来自这篇 ASKTOM 文章(页面下方约 90%):

您需要对这些分页查询进行唯一的排序,以便每次都将 ROW_NUMBER 确定性地分配给行。

此外,您的查询也不尽相同,所以我不确定是什么将一种方法与另一种方法的成本进行比较的好处是。

Looks like a pagination query to me.

From this ASKTOM article (about 90% down the page):

You need to order by something unique for these pagination queries, so that ROW_NUMBER is assigned deterministically to the rows each and every time.

Also your queries are no where near the same so I'm not sure what the benefit of comparing the costs of one to the other is.

┾廆蒐ゝ 2024-07-25 04:44:14

您的 ORDER BY 列是否已编入索引? 如果没有,那是一个很好的起点。

Is your ORDER BY column indexed? If not that's a good place to start.

亣腦蒛氧 2024-07-25 04:44:14

部分问题在于“开始”到“结束”的跨度有多大以及它们“居住”的地方。
假设表中有 100 万行,并且您想要行 567,890 到 567,900,那么您将不得不接受这样一个事实:它将需要遍历整个表,并按 id 对几乎所有内容进行排序,并找出哪些行属于该范围。

简而言之,这是一项大量的工作,这就是优化器给它很高的成本的原因。

索引也没有多大帮助。 索引会给出顺序,但最多只能为您提供一个起点,然后您继续阅读,直到找到第 567,900 个条目。

如果您一次向最终用户展示 10 个项目,那么实际上可能值得从数据库中抓取前 100 个项目,然后让应用程序将这 100 个项目分解为 10 个块。

Part of the problem is how big is the 'start' to 'end' span and where they 'live'.
Say you have a million rows in the table, and you want rows 567,890 to 567,900 then you are going to have to live with the fact that it is going to need to go through the entire table, sort pretty much all of that by id, and work out what rows fall into that range.

In short, that's a lot of work, which is why the optimizer gives it a high cost.

It is also not something an index can help with much. An index would give the order, but at best, that gives you somewhere to start and then you keep reading on until you get to the 567,900th entry.

If you are showing your end user 10 items at a time, it may be worth actually grabbing the top 100 from the DB, then having the app break that 100 into ten chunks.

-柠檬树下少年和吉他 2024-07-25 04:44:14

花更多时间使用“解释计划”工具。 如果您看到 TABLE SCAN,则需要更改查询。

你的询问对我来说毫无意义。 通过 ROWID 进行查询似乎是自找麻烦。 该查询中没有关系信息。 这是您遇到问题的真实查询还是您为说明问题而编造的示例?

Spend more time with the EXPLAIN PLAN tool. If you see a TABLE SCAN you need to change your query.

Your query makes little sense to me. Querying over a ROWID seems like asking for trouble. There's no relational info in that query. Is it the real query that you're having trouble with or an example that you made up to illustrate your problem?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文