ClickHouse: how to do a local join and a local insert with distributed tables?
I have three distributed tables: t_user_info_all, t_user_event_all, and t_user_flat_all. They are all on the same cluster and use the same sharding key, user_id.
I want to insert the join result of t_user_event_all and t_user_info_all into t_user_flat_all, with SQL like this: INSERT INTO t_user_flat_all SELECT * FROM t_user_info_all t1 LEFT JOIN t_user_event_all t2 ON t1.user_id = t2.user_id
With the setting distributed_product_mode = 'local', the join runs locally on each shard, but the INSERT statement still goes through the distributed table.
I found the setting parallel_distributed_insert_select = 2, with which SELECT and INSERT are executed on each shard, from/to the underlying table of the Distributed engine. But it only works for queries like INSERT INTO distributed_table_a SELECT ... FROM distributed_table_b; the SELECT query cannot have WHERE conditions or joins.
Alternatively, I can run a local insert on each shard: INSERT INTO t_user_flat_local SELECT * FROM t_user_info_local t1 LEFT JOIN t_user_event_local t2 ON t1.user_id = t2.user_id. But that makes the setup more complicated.
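For reference, the two approaches described above can be written out as follows. This is only a sketch that restates the question's own statements in runnable form; it has not been verified against a cluster. The table names are the ones from the question, and the assumption that the first INSERT is actually pushed down to the shards is exactly what the question reports as not working when the SELECT contains a join.

```sql
-- Approach 1: set both settings for the session, then insert through the
-- distributed tables. Per the question, the JOIN runs locally, but the
-- INSERT is reported to still go through t_user_flat_all.
SET distributed_product_mode = 'local';
SET parallel_distributed_insert_select = 2;

INSERT INTO t_user_flat_all
SELECT *
FROM t_user_info_all AS t1
LEFT JOIN t_user_event_all AS t2 ON t1.user_id = t2.user_id;

-- Approach 2: connect to every shard and run the statement against the
-- local tables. Rows with the same user_id are assumed to live on the
-- same shard, since all three tables share the sharding key.
INSERT INTO t_user_flat_local
SELECT *
FROM t_user_info_local AS t1
LEFT JOIN t_user_event_local AS t2 ON t1.user_id = t2.user_id;
```

The second form keeps all reads and writes on one shard, at the cost of having to issue the statement once per shard.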