Is there a way to make Impala resolve a subquery first when it is used in a query?

Posted 2025-02-03 20:59:03, 603 characters, 2 views, 0 comments

I am trying to prune some partitions using a subquery in Impala.

In the query below, where I hardcode the date, I get the expected pruning, and Impala reads only the relevant partitions.

select col1, col2
from table1
where
table1PartitionCol >= '2022-05-01';

(Table1) From Explain Plan: partitions=100/5000 files=100 size=10GB

However, when I try to get the date from a subquery (querying a different table), the explain plan shows that it reads every partition first, and only then runs the subquery and applies it as a filter.

select col1, col2
from table1
where
table1PartitionCol >= (select max(partitionValue) from table2);

(Table1) From Explain Plan: partitions=5000/5000 files=5000 size=250GB

Ideally, it would run the subquery first and then read from the main table. Is there a way to force it to do this? Or is there some other way to achieve the same result?
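
One possible workaround, since the hardcoded-date version above does prune as expected, is to resolve the subquery in a separate step on the client and inline its result as a literal in the main query. The sketch below is only an illustration under assumptions: `fetch_pruned_rows` is a hypothetical helper, `cursor` is assumed to be any Python DB-API cursor connected to Impala (for example one obtained through a client library such as impyla), and the partition values are assumed to be `YYYY-MM-DD` date strings as in the example above.

```python
import datetime


def fetch_pruned_rows(cursor):
    """Hypothetical helper: run the subquery first, then query table1
    with the result inlined as a literal so the planner sees a constant."""
    # Step 1: resolve the boundary value on its own (a query against table2 only).
    cursor.execute("select max(partitionValue) from table2")
    (boundary,) = cursor.fetchone()
    if boundary is None:
        # table2 is empty, so there is no boundary to compare against.
        return []

    # Validate before inlining: accept only a YYYY-MM-DD date
    # (assumption: the partition column holds date strings in this format).
    boundary = datetime.date.fromisoformat(str(boundary)).isoformat()

    # Step 2: issue the main query with the value as a literal, which has
    # the same shape as the hardcoded-date query from the question.
    cursor.execute(
        "select col1, col2 from table1 "
        f"where table1PartitionCol >= '{boundary}'"
    )
    return cursor.fetchall()
```

Whether this gives the same pruning as the hardcoded query is worth confirming by running EXPLAIN on the generated statement. The trade-off is two round trips instead of one, and the boundary is fixed at the moment the first query runs.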
