PostgreSQL UNION ALL SELECT: preserving the order of clauses
I have a complex SQL query against PostgreSQL that consists of multiple nested UNION ALL queries, something like this:
(
  (
    (<QUERY 1-1-1> UNION ALL <QUERY 1-1-2>) UNION ALL
    (<QUERY 1-1-3> UNION ALL <QUERY 1-1-4>) UNION ALL
    ...
  ) UNION ALL
  (
    (<QUERY 1-2-1> UNION ALL <QUERY 1-2-2>) UNION ALL
    (<QUERY 1-2-3> UNION ALL <QUERY 1-2-4>) UNION ALL
    ...
  ) UNION ALL
  ...
) UNION ALL
(
  (
    (<QUERY 2-1-1> UNION ALL <QUERY 2-1-2>) UNION ALL
    (<QUERY 2-1-3> UNION ALL <QUERY 2-1-4>) UNION ALL
    ...
  ) UNION ALL
  (
    (<QUERY 2-2-1> UNION ALL <QUERY 2-2-2>) UNION ALL
    (<QUERY 2-2-3> UNION ALL <QUERY 2-2-4>) UNION ALL
    ...
  ) UNION ALL
  ...
) UNION ALL
(
  ...
)
Each <QUERY i-j-k> is a relatively lightweight query that produces about 100K-1M rows and can be sorted in memory without significant performance impact.
The resulting query consists of tens of thousands of multi-level nested UNION ALL queries in a strict, predefined order, like a depth-first tree traversal, so the result is a dataset of several billion rows.
So the question is: since SQL does not guarantee the order of rows produced by UNION ALL, the outer query should contain an ORDER BY clause, but the server hardware cannot sort billions of rows in the required time.
However, the order of the united queries is strictly determined and should be <QUERY 1-1-1>, <QUERY 1-1-2>, and so on, sorted hierarchically. Sorting in the outer query is therefore redundant, since the dataset is already sorted by the structure of the SQL query.
I need to force Postgres to preserve the order of the nested UNION ALL statements. How can I do it? Any plugins, extensions, and even dirty hacks are welcome.
Please avoid answers and comments that call this an XY problem: the question is posed as-is, as a research exercise. The conditions of the question do not allow changing the structure of the database or the dataset. Thanks.
Comments (2)
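Try this: put the queries' results into a temporary table.
Here it is step by step:
1. Create a temporary table, the_temp_table, like the record type of <QUERY 1-1-1>, with an additional column extra_id.
2. Insert the results of each query, one by one in the required order, into the_temp_table.
3. Then select from the_temp_table, ordered by extra_id.
In effect you will be emulating UNION ALL in a controlled manner with an insignificant performance penalty.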
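The steps above could be sketched roughly as follows. This is a minimal illustration and not the answerer's exact code: the names the_temp_table and extra_id come from the answer, while the payload column and the VALUES lists are hypothetical stand-ins for the real column list and for <QUERY 1-1-1>, <QUERY 1-1-2>. It also assumes that PostgreSQL hands out the bigserial values in the order rows arrive from each INSERT ... SELECT ... ORDER BY, which is how it behaves in practice but is worth verifying on your own version.

```sql
-- Temporary table shaped like the record type of <QUERY 1-1-1>,
-- plus an auto-incrementing column that records insertion order.
CREATE TEMPORARY TABLE the_temp_table (
    extra_id bigserial,   -- numbered as rows are inserted
    payload  text         -- hypothetical: replace with the real columns
);

-- Run the branches one by one, in the order the nested UNION ALL lists them.
-- Each branch is small enough to be sorted on its own.
INSERT INTO the_temp_table (payload)
SELECT v
FROM (VALUES ('1-1-1 b'), ('1-1-1 a')) AS q(v)   -- stand-in for <QUERY 1-1-1>
ORDER BY v;

INSERT INTO the_temp_table (payload)
SELECT v
FROM (VALUES ('1-1-2 b'), ('1-1-2 a')) AS q(v)   -- stand-in for <QUERY 1-1-2>
ORDER BY v;

-- ... repeat for the remaining branches ...

-- Read everything back in insertion order.
SELECT payload
FROM the_temp_table
ORDER BY extra_id;
```

The final ORDER BY extra_id is still an ordering step over the whole table. Whether it is cheap depends on the plan PostgreSQL chooses, so it is worth checking with EXPLAIN on realistic data volumes.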
There are two ways of looking at this:
The safest alternative is to declare an id column using SERIAL or BIGSERIAL and index it (for example, as the primary key; SERIAL alone does not create an index). As the records are already ordered, there will be minimal effect on query speed, and you will be sure that there are no errors in the ordering.
If the order is not critical and you don't modify the data at all, it will probably be fetched in the same order as you entered it, but there is no guarantee. How important is the order to your application?
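As a rough illustration of the two options, the sketch below uses a hypothetical table ordered_rows with a hypothetical payload column. It contrasts a read ordered by the id column with a read that relies on physical order, and shows how to inspect the plan instead of assuming the ordered read is cheap.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE ordered_rows (
    id      bigserial PRIMARY KEY,  -- PRIMARY KEY creates the index
    payload text
);

INSERT INTO ordered_rows (payload)
VALUES ('first'), ('second'), ('third');

-- Option 1: deterministic, rows come back in id order.
SELECT payload FROM ordered_rows ORDER BY id;

-- Option 2: no ORDER BY; rows will probably come back in insertion order
-- if the data is never modified, but SQL gives no such guarantee.
SELECT payload FROM ordered_rows;

-- Check how the ordered read is executed (index scan or explicit sort).
EXPLAIN SELECT payload FROM ordered_rows ORDER BY id;
```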