Clickhouse杀死查询永远悬挂
具有以下潜在的无限执行查询。为什么将其发行给Clickhouse服务器是没有意义的。查询已经启动并仍在运行:
SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"
好的,尝试查找关联的query_id或已经有一个。例如,query_id ='D02F4BDB-8928-4347-8641-4DA4B9C0F486'
。让我们通过查询来杀死它:
KILL QUERY WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486'
从第一次看,实现的杀戮结果似乎还可以:
┌─kill_status─┬─query_id─────────────────────────────┬─user────┬─query────────────────────────────────────────────────────────────────────────┐
│ waiting │ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ default │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │
└─────────────┴──────────────────────────────────────┴─────────┴──────────────────────────────────────────────────────────────────────────────┘
好吧,让我们等待几秒钟,并确保原始查询已成功终止。让我们通过以下系统信息架构查询进行检查:
SELECT "query_id", "query", "is_cancelled" FROM system.processes WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486';
不幸的是,原始查询仍在某种意义上运行。它变成了“ is_canceld”状态,但仍然悬挂:
┌─query_id─────────────────────────────┬─query────────────────────────────────────────────────────────────────────────┬─is_cancelled─┐
│ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │ 1 │
└──────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴──────────────┘
等待一个小时和更多时间,并仍然取得一些结果。原始查询仍然挂在“ is_canceld”状态下。随后使用相同Query_id的杀戮查询没有任何作用。
最有可能重新启动服务器将有助于解决问题,但我不想这样做。如何在不重新启动服务器的情况下通过卡住查询解决问题?
Having following potential infinite-time execution query. It does not make sense why it had been issued to Clickhouse server. Query is already has been launched and still running:
SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"
Okay, try to find associated query_id or already have one. For instance, query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486'
. Let's kill it via following query:
KILL QUERY WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486'
Achieved kill-query result seems to be okay from first look:
┌─kill_status─┬─query_id─────────────────────────────┬─user────┬─query────────────────────────────────────────────────────────────────────────┐
│ waiting │ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ default │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │
└─────────────┴──────────────────────────────────────┴─────────┴──────────────────────────────────────────────────────────────────────────────┘
Okay, let's wait for several seconds and ensure that original query had been terminated successfully. Let's check it via following system information schema query:
SELECT "query_id", "query", "is_cancelled" FROM system.processes WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486';
Unfortunately original query is still running in a some sense. It turned into "is_cancelled" state and still hangs:
┌─query_id─────────────────────────────┬─query────────────────────────────────────────────────────────────────────────┬─is_cancelled─┐
│ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │ 1 │
└──────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴──────────────┘
Waiting for hour and more time and still getting some results. Original query is still hanged in "is_cancelled" state. Subsequent KILL queries with same query_id does not do nothing.
Most likely, restarting the server will help solve the problem, but I do not want to do this. How to solve the problem with a stuck query without server restarting?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
在睡眠期间,Clickhouse查询无法杀死。
如果您使用的是最近的CH版本(21.12+),则将在处理每个块后检查杀戮标志(在较旧的版本上,可能永远不会检查它)。由于默认块为65536,因此查询将睡65536 * 3秒〜= 54小时,然后检查任何内容。
在将来的CH版本中,无法入睡超过3秒(现在是睡眠的限制,而不是SleepeachRow)。同时,您可以等待或重新启动服务器。
ClickHouse queries can't be killed during the sleep.
If you are using a recent CH release (21.12+), then the KILL flag will be checked after each block is processed (on older releases it might never be checked). Since the default block is 65536, the query will be slept for 65536 * 3 seconds ~= 54 hours before checking anything.
In future releases of CH it will be impossible to sleep for more than 3 seconds (which right now is a limit of sleep but not for sleepEachRow). In the meantime you can either wait or restart the server.