PostgreSQL UNION ALL: preserving the order of SELECT clauses

Posted on 2025-01-24 10:32:58

I have a complex SQL query against PostgreSQL that consists of multiple levels of nested UNION ALL queries, something like this:

(
  (
     (<QUERY 1-1-1> UNION ALL <QUERY 1-1-2>) UNION ALL
     (<QUERY 1-1-3> UNION ALL <QUERY 1-1-4>) UNION ALL
     ...
  ) UNION ALL
  (
     (<QUERY 1-2-1> UNION ALL <QUERY 1-2-2>) UNION ALL
     (<QUERY 1-2-3> UNION ALL <QUERY 1-2-4>) UNION ALL
     ...
  ) UNION ALL
  ...
) UNION ALL
(
  (
     (<QUERY 2-1-1> UNION ALL <QUERY 2-1-2>) UNION ALL
     (<QUERY 2-1-3> UNION ALL <QUERY 2-1-4>) UNION ALL
     ...
  ) UNION ALL
  (
     (<QUERY 2-2-1> UNION ALL <QUERY 2-2-2>) UNION ALL
     (<QUERY 2-2-3> UNION ALL <QUERY 2-2-4>) UNION ALL
     ...
  ) UNION ALL
  ...
) UNION ALL
(
  ...
)

Each <QUERY i-th> is a relatively lightweight query that produces about 100K-1M rows and can be sorted in memory without significant performance impact.

The resulting query consists of tens of thousands of multi-level nested UNION ALL queries in a strict, conventional order, like a depth-first tree traversal, so the result is a dataset of several billion rows.

So the question is: since SQL does not guarantee the row order of a UNION ALL, the outer query should contain an ORDER BY clause, but the server hardware cannot sort billions of rows in the required time.

However, the order of the united queries is strictly determined and should be <QUERY 1-1-1>, <QUERY 1-1-2>, and so on, following the hierarchy. Sorting in the outer query is therefore redundant, since the dataset is already ordered by the structure of the SQL query.
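The intended order can be made explicit with a small sketch. It assumes, purely for illustration, that the nested UNION ALL structure is available as nested Python lists whose leaves are the SQL text of each query; the name leaves_in_order and the sample tree are hypothetical and not part of the question.

```python
from typing import Iterator, Union

# A node is either a leaf (the SQL text of one <QUERY ...>) or a list of child nodes.
Node = Union[str, list]

def leaves_in_order(node: Node) -> Iterator[str]:
    """Yield leaf queries in depth-first, left-to-right order."""
    if isinstance(node, str):
        yield node
    else:
        for child in node:
            yield from leaves_in_order(child)

# Hypothetical stand-in for the nested UNION ALL structure.
tree = [
    [["Q 1-1-1", "Q 1-1-2"], ["Q 1-1-3", "Q 1-1-4"]],
    [["Q 1-2-1", "Q 1-2-2"]],
]
ordered = list(leaves_in_order(tree))
```

The flattened list is the order in which the rows of the leaf queries are expected to appear in the final result.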

I need to force Postgres to preserve the order of the nested UNION ALL statements. How can this be done? Any plugins, extensions, and even dirty hacks are welcome.

Please avoid answers and comments that call this an XY problem; the question is posed as-is, for research purposes. The structure of the database and the dataset cannot be changed under the conditions of the question. Thanks.

Comments (2)

迷鸟归林 2025-01-31 10:32:58

Try this: load the queries' results into a temporary table.
Here it is step by step:

  • Create a temporary table, e.g. the_temp_table, with the record type of <QUERY 1-1-1>
create temporary table the_temp_table as <QUERY 1-1-1> limit 0;
  • Add an auto-increment primary key column extra_id to the_temp_table (bigserial rather than serial, because a 4-byte serial runs out at about 2.1 billion values and the dataset has several billion rows)
alter table the_temp_table add column extra_id bigserial primary key;
  • Then run all your queries one by one in the right order
insert into the_temp_table <QUERY 1-1-1>; insert into the_temp_table <QUERY 1-1-2>;
insert into the_temp_table <QUERY 1-1-3>; insert into the_temp_table <QUERY 1-1-4>;

insert into the_temp_table <QUERY 1-2-1>; insert into the_temp_table <QUERY 1-2-2>;
insert into the_temp_table <QUERY 1-2-3>; insert into the_temp_table <QUERY 1-2-4>;

-- continue
  • Finally
select <fields list w/o extra_id> from the_temp_table order by extra_id;

-- the planner can read rows through the primary key index on extra_id, so no full sort should be needed here

In effect, you will be emulating UNION ALL in a controlled manner with an insignificant performance penalty.
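With tens of thousands of leaf queries, the statements would probably be generated rather than typed. The following is a minimal sketch under stated assumptions: build_script and its parameters are hypothetical names, the leaf queries are plain SELECT statements already listed in the desired order, and bigserial is assumed because the question mentions several billion rows. The script only builds SQL text and does not connect to a database.

```python
from typing import Iterable, List

def build_script(queries: Iterable[str], columns: List[str],
                 table: str = "the_temp_table") -> List[str]:
    """Return the SQL statements for the temp-table approach, in execution order."""
    ordered = list(queries)
    if not ordered:
        raise ValueError("at least one query is required")
    script = [
        # Empty table with the same column types as the first query.
        f"CREATE TEMPORARY TABLE {table} AS SELECT * FROM ({ordered[0]}) AS q LIMIT 0;",
        # bigserial is an assumption: a 4-byte serial stops near 2.1 billion.
        f"ALTER TABLE {table} ADD COLUMN extra_id bigserial PRIMARY KEY;",
    ]
    # One INSERT per leaf query; extra_id is filled from the sequence default.
    script += [f"INSERT INTO {table} {q};" for q in ordered]
    # Final read in insertion order.
    script.append(f"SELECT {', '.join(columns)} FROM {table} ORDER BY extra_id;")
    return script

demo = build_script(["SELECT a, b FROM t1", "SELECT a, b FROM t2"], ["a", "b"])
```

If the row order inside a single leaf query also matters, that query would need its own ORDER BY; each one returns at most about 1M rows, so sorting it individually should stay cheap.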

水波映月 2025-01-31 10:32:58

There are two ways of looking at this:

  • The safest alternative is to declare an id column using SERIAL or BIGSERIAL, which gives every inserted row an increasing number; make it the primary key so that it is also indexed. As the records are already ordered, there will be a minimal effect on query speed, and you will be sure that there are no errors in the ordering.

  • If the order is not critical and you don't modify the data at all, it will probably be fetched in the same order as you entered it, but there is no guarantee. How important is the order to your application?
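The first option can be tried out end to end with a small sketch. Note the assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database purely as a stand-in, not PostgreSQL; INTEGER PRIMARY KEY AUTOINCREMENT plays the role of a BIGSERIAL primary key; and the table name, the function load_in_order, and the sample batches are hypothetical.

```python
import sqlite3

def load_in_order(batches):
    """Insert each batch in sequence; the id column records insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (id INTEGER PRIMARY KEY AUTOINCREMENT, val TEXT)"
    )
    for batch in batches:
        # Rows are inserted one by one, so ids increase in list order.
        conn.executemany("INSERT INTO results (val) VALUES (?)", [(v,) for v in batch])
    conn.commit()
    return conn

# Each inner list stands for the already ordered output of one leaf query.
batches = [["b", "a"], ["d", "c"]]
conn = load_in_order(batches)
rows = [v for (v,) in conn.execute("SELECT val FROM results ORDER BY id")]
ids = [i for (i,) in conn.execute("SELECT id FROM results ORDER BY id")]
```

Reading with ORDER BY id returns the rows in the order the batches were loaded, regardless of the values themselves, which is the guarantee the second option lacks.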
