ClickHouse: does a non-primary column store the values of its related primary column in its data file?
In ClickHouse, I have a table
CREATE TABLE MyTable
(
    a UInt8,
    b UInt8
)
ENGINE = MergeTree
PRIMARY KEY (a)
There is only one partition [all] and two column data files [a.bin, b.bin].
Does the data file of the non-primary column b store the values of its related primary column a?
- If not, how is the value of b obtained when executing the SQL select b from MyTable where a = 1?
- If so, how does the query run? Does the query engine have to process the whole b.bin to get the rows with a = 1?
Comments (1)
No, column b is stored in its own separate file, b.bin, but the rows in each part are sorted by the value of a.
When processing select b from MyTable where a = 1, the following happens.
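First, because the table has no PARTITION BY clause, ClickHouse does not read the partition minmax index (stored next to partition.dat in each part directory under /var/lib/clickhouse/data/default/MyTable/).
Instead, it runs a binary search, O(log n), over primary.idx, which is held in memory and contains the value of a at the start of every granule of 8192 rows (the default index_granularity), and finds the numbers of the granules that may contain a = 1.
It then looks up those granule numbers in the b.mrk2 file, which stores two offsets and a row count for each granule: offset1 is the position of the compressed block inside b.bin, offset2 is the position of the granule inside that block once it is decompressed, and the row count is the number of rows in the granule.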
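The primary index lookup described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not ClickHouse's actual code: the function names build_sparse_index and candidate_granules are hypothetical, and the column is modelled as an in-memory sorted list.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 8192  # default index_granularity in ClickHouse


def build_sparse_index(sorted_a, granule_rows=GRANULE_ROWS):
    """Keep only the value of `a` at the first row of every granule."""
    return [sorted_a[i] for i in range(0, len(sorted_a), granule_rows)]


def candidate_granules(index, value):
    """Return the granule numbers that may contain `value`.

    Uses two binary searches over the sparse index, so the cost is
    O(log n) in the number of granules.
    """
    # The granule just before the first index entry >= value may still
    # end with `value`, so it has to be included.
    first = max(bisect_left(index, value) - 1, 0)
    # The last candidate is the last granule whose first key is <= value.
    last = bisect_right(index, value) - 1
    if last < 0:
        return range(0)  # value is smaller than every key in the part
    return range(first, last + 1)


# 30000 rows sorted by a: 10000 zeros, 10000 ones, 10000 twos.
sorted_a = [0] * 10000 + [1] * 10000 + [2] * 10000
index = build_sparse_index(sorted_a)
print(index)                               # [0, 0, 1, 2]
print(list(candidate_granules(index, 1)))  # [1, 2]
```

Only granules 1 and 2 need to be read for a = 1; granules 0 and 3 are skipped without touching a.bin or b.bin.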
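After that, it seeks to offset1 in b.bin and reads the compressed block (going through the kernel page cache), decompresses it, and returns the b values of the granule starting at offset2. Because the index only narrows the search to whole granules, the a values of the same granules are read from a.bin in the same way so that rows with a != 1 can be filtered out of the result set.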
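The role of the two mark offsets can be illustrated with a toy column file. This is a rough sketch and not the real on-disk format: zlib stands in for ClickHouse's default LZ4 compression, the 4-byte length prefix is invented for the example, and write_column and read_granule are hypothetical names.

```python
import zlib

GRANULE_ROWS = 4  # tiny granules for readability; ClickHouse defaults to 8192


def write_column(values, granule_rows=GRANULE_ROWS, granules_per_block=2):
    """Write a UInt8 column as compressed blocks plus one mark per granule.

    Returns (bin_data, marks); each mark is
    (offset_in_compressed_file, offset_in_decompressed_block, rows_in_granule).
    """
    bin_data = bytearray()
    marks = []
    block_rows = granule_rows * granules_per_block
    for block_start in range(0, len(values), block_rows):
        block = bytes(values[block_start:block_start + block_rows])
        block_offset = len(bin_data)  # where this compressed block starts
        for in_block in range(0, len(block), granule_rows):
            rows = min(granule_rows, len(block) - in_block)
            marks.append((block_offset, in_block, rows))
        compressed = zlib.compress(block)
        # Invented length prefix so the reader knows how many bytes to take.
        bin_data += len(compressed).to_bytes(4, "little") + compressed
    return bytes(bin_data), marks


def read_granule(bin_data, marks, granule):
    """Seek to the granule's block, decompress it, and slice out its rows."""
    offset1, offset2, rows = marks[granule]
    size = int.from_bytes(bin_data[offset1:offset1 + 4], "little")
    block = zlib.decompress(bin_data[offset1 + 4:offset1 + 4 + size])
    # UInt8 values are one byte each, so byte offsets equal row offsets here.
    return list(block[offset2:offset2 + rows])


b_values = list(range(10))
b_bin, b_mrk = write_column(b_values)
print(read_granule(b_bin, b_mrk, 1))  # [4, 5, 6, 7]
```

Granules 0 and 1 share the same offset1 because they live in the same compressed block; only offset2 tells them apart. That is why b.bin never has to be scanned from the beginning to reach a given granule.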