点击室TTL设置,其中有状态

发布于 2025-01-24 18:56:31 字数 2187 浏览 5 评论 0原文

我有一个带有以下DDL的表:

CREATE TABLE default.pricing_rate_latest_state_union_test
(
    `organization_id`   Int32,
    `organization_name` String,
    `scraping_time`     AggregateFunction(max, DateTime),
    `amount_from_raw`   Float64,
    `amount_to_raw`     Float64,
    `key_param_id`      LowCardinality(String),
    `amount_from`       AggregateFunction(argMax, Float64, DateTime),
    `amount_to`         AggregateFunction(argMax, Float64, DateTime),
    `source`            LowCardinality(String)
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/*****/pricing_rate_latest_state_union_test',
 '{replica}')
ORDER BY (key_param_id,
 amount_from_raw,
 amount_to_raw)
SETTINGS index_granularity = 8192;

为此,我想在scraping_time列上添加TTL表达式,以便在一定天数后删除相应的记录。我可以轻松地完成以下操作:

ALTER TABLE default.pricing_rate_latest_state_union_test 
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY;

这很好,但是我的最终目标是添加条件,描述到期后需要删除哪些行。我正在尝试的是:

ALTER TABLE default.pricing_rate_latest_state_union_test 
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY 
DELETE WHERE source = 'some_value';

不幸的是,这导致了以下错误:

 DB::Exception: Exception happened during execution of mutation '0000000000' with part 'all_0_0_0' reason: 'Code: 10, e.displayText() = DB::Exception: Not found column scraping_time in block. There are only columns: organization_id (version 21.8.13.6 (official build))'. This error maybe retryable or not. In case of unretryable error, mutation can be killed with KILL MUTATION query 

根据此文档应该有可能,因为有以下示例:

CREATE TABLE table_with_where
(
    d DateTime,
    a Int
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
TTL d + INTERVAL 1 MONTH DELETE WHERE toDayOfWeek(d) = 1;

我相信我的案子我的案子无法正常工作。复制的杂物getmergetree,因此在这里将不胜感激的任何帮助。

我也对找不到的含义感兴趣。 em>,但请参阅 agrishate_id

I have a table with the following DDL:

CREATE TABLE default.pricing_rate_latest_state_union_test
(
    `organization_id`   Int32,
    `organization_name` String,
    `scraping_time`     AggregateFunction(max, DateTime),
    `amount_from_raw`   Float64,
    `amount_to_raw`     Float64,
    `key_param_id`      LowCardinality(String),
    `amount_from`       AggregateFunction(argMax, Float64, DateTime),
    `amount_to`         AggregateFunction(argMax, Float64, DateTime),
    `source`            LowCardinality(String)
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/*****/pricing_rate_latest_state_union_test',
 '{replica}')
ORDER BY (key_param_id,
 amount_from_raw,
 amount_to_raw)
SETTINGS index_granularity = 8192;

For which I would like to add a TTL expression on scraping_time column, so that after certain number of days the corresponding records are dropped. This I can do easily with the following:

ALTER TABLE default.pricing_rate_latest_state_union_test 
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY;

This is works fine, but my end goal is to add condition describing which rows need to be deleted after expiration. What I'm trying is this:

ALTER TABLE default.pricing_rate_latest_state_union_test 
MODIFY TTL finalizeAggregation(scraping_time) + INTERVAL 30 DAY 
DELETE WHERE source = 'some_value';

Unfortunately this lead to the following error:

 DB::Exception: Exception happened during execution of mutation '0000000000' with part 'all_0_0_0' reason: 'Code: 10, e.displayText() = DB::Exception: Not found column scraping_time in block. There are only columns: organization_id (version 21.8.13.6 (official build))'. This error maybe retryable or not. In case of unretryable error, mutation can be killed with KILL MUTATION query 

According to this documentation it should be possible as there is the following example:

CREATE TABLE table_with_where
(
    d DateTime,
    a Int
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
TTL d + INTERVAL 1 MONTH DELETE WHERE toDayOfWeek(d) = 1;

I believe my case doesn't work that smoothly due to the ReplicatedAggregatingMergeTree, so any help would be greatly appreciated here.

I'm also interested in the meaning of Not found column scraping_time in block. I know what part is in ClickHouse, but what is the block and why it doesn't see the scraping_time, but see the organization_id?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

软甜啾 2025-01-31 18:56:38

它看起来像是一个错误或意外行为,因为TTL并非设计用于使用聚合功能。

您可以在没有哪里构造TTL

ALTER TABLE default.pricing_rate_latest_state_union_test  
  MODIFY TTL if (source = 'some_value', finalizeAggregation(scraping_time) + INTERVAL 30 DAY, toDateTime(0));

It looks like a bug or unexpected behavior because TTL is not designed to work with AggregateFunction.

You can construct TTL without WHERE

ALTER TABLE default.pricing_rate_latest_state_union_test  
  MODIFY TTL if (source = 'some_value', finalizeAggregation(scraping_time) + INTERVAL 30 DAY, toDateTime(0));
铜锣湾横着走 2025-01-31 18:56:36

在调查此案的同时,发现问题没有汇总。问题是,Clickhouse的TTL不支持在表不空时修改TTL语句中的子句。

CREATE TABLE test
(
    `scraping_time` AggregateFunction(max,
 DateTime),
    `source` String
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/test',
 '{replica}')
ORDER BY (source)
TTL finalizeAggregation(scraping_time) + toIntervalDay(if(source='condition1', 30, 40)) where source <> 'condition2'
SETTINGS index_granularity = 8192;

这有效。填充该表之后。然后,如果您尝试删除TTL并再次通过修改添加,则会增加上述错误。

ALTER TABLE test REMOVE TTL;

ALTER TABLE test MODIFY TTL scraping_time + toIntervalDay(if(source='scraper', 30, 40)) where source <> 'stable';

SQL错误[341]:ClickHouse异常,代码:341,主机:
*****************, 港口: ****;代码:341,
e.disPlayText()= DB ::异常:执行期间发生异常
突变'0000000000'与部分'all_0_0_0'原因:'代码:10,
e.displaytext()= db ::异常:找不到列craping_time in
堵塞。只有列:agrishate_id(版本21.8.14.5
(官方构建))''。这个错误可能是否可以重取。如果是
无法取消的错误,可以通过杀死突变查询杀死突变
(版本21.8.14.5(官方构建))

我尝试删除在哪里工作。

ALTER TABLE test MODIFY TTL finalizeAggregation(scraping_time) + toIntervalDay(if(source='condition1', 30, 40));

因此,这里有三个解决方案

1。正如Denny Crane所描述的,不使用您在修改TTL时使用的地方,您可以用Multiif或多个IFS

2替换它。将数据复制到temp表,要添加ttl的截断表,添加ttl(必须与语句一起使用),将数据复制回主表

3。不建议。放置表,然后在创建语句中使用您的TTL重新创建

While investigating this case found that problem is not in AggregateFunction. Problem is that Clickhouse's TTL doesn't support WHERE clause in MODIFY TTL statement when table is not empty.

CREATE TABLE test
(
    `scraping_time` AggregateFunction(max,
 DateTime),
    `source` String
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/test',
 '{replica}')
ORDER BY (source)
TTL finalizeAggregation(scraping_time) + toIntervalDay(if(source='condition1', 30, 40)) where source <> 'condition2'
SETTINGS index_granularity = 8192;

This works. After fill that table. Then if you will try to remove TTL and add again through MODIFY it will raise error described above.

ALTER TABLE test REMOVE TTL;

ALTER TABLE test MODIFY TTL scraping_time + toIntervalDay(if(source='scraper', 30, 40)) where source <> 'stable';

SQL Error [341]: ClickHouse exception, code: 341, host:
*****************, port: ****; Code: 341,
e.displayText() = DB::Exception: Exception happened during execution
of mutation '0000000000' with part 'all_0_0_0' reason: 'Code: 10,
e.displayText() = DB::Exception: Not found column scraping_time in
block. There are only columns: organization_id (version 21.8.14.5
(official build))'. This error maybe retryable or not. In case of
unretryable error, mutation can be killed with KILL MUTATION query
(version 21.8.14.5 (official build))

I've tried to remove WHERE and it works.

ALTER TABLE test MODIFY TTL finalizeAggregation(scraping_time) + toIntervalDay(if(source='condition1', 30, 40));

So here are three SOLUTIONS:

1. As described by Denny Crane to don't use WHERE when you are MODIFY'ing TTL, you can replace it with multiIf or multiple ifs

2. Copy data to temp table, truncate table in which you want to add TTL, add that TTL ( must work with WHERE statement), copy data back to main table.

3. Not recommended. Drop table and then recreate with your TTL in CREATE statement

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文