Why does the MySQL CHAR data type have no preceding length byte?

Posted on 2025-02-03 15:35:49

The MySQL documentation says that the CHAR data type has no preceding byte holding the string length.

In contrast, the VARCHAR data type has 1 or 2 preceding bytes that store the string length. That is understandable: the database engine needs to know the string length in order to read the value.

How does the database engine know the length of a CHAR value? What logic or algorithm is behind it?

Comments (2)

烟酒忠诚 2025-02-10 15:35:49

From the documentation of the CHAR and VARCHAR data types:

The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255. When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.

So it doesn't need to store the length in the table data. The length is specified in the schema, and the engine uses that to know how many characters to retrieve from the table data.
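To make the idea concrete, here is a minimal Python sketch of a fixed-width record reader. It is not MySQL's actual code: the `SCHEMA` list, the column names, and the single-byte (ASCII) character set are assumptions chosen for illustration. The point it shows is that the widths live in the schema, so the row itself carries no length bytes.

```python
# Hypothetical schema: (column name, declared CHAR width in bytes).
# A single-byte character set is assumed, so 1 character == 1 byte.
SCHEMA = [("code", 3), ("name", 10)]

def encode_row(values):
    """Right-pad each value with spaces to its declared width; no length byte is written."""
    out = b""
    for (name, width), value in zip(SCHEMA, values):
        raw = value.encode("ascii")
        if len(raw) > width:
            raise ValueError(f"{name!r} exceeds CHAR({width})")
        out += raw.ljust(width, b" ")
    return out

def decode_row(row):
    """Slice the row using the widths from the schema, then strip the trailing padding."""
    values, offset = [], 0
    for _name, width in SCHEMA:
        values.append(row[offset:offset + width].rstrip(b" ").decode("ascii"))
        offset += width
    return values

row = encode_row(["AB", "hello"])
print(len(row), decode_row(row))
```

Every encoded row is exactly 13 bytes (3 + 10) regardless of the values, which is why the reader can find each field by offset alone.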

In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
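The length-prefix rule quoted above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the little-endian byte order of the 2-byte prefix is an assumption of the sketch, not something the quoted documentation states.

```python
def prefix_size(max_bytes):
    """1 length byte if the column never needs more than 255 bytes, otherwise 2."""
    return 1 if max_bytes <= 255 else 2

def encode_varchar(value, max_bytes):
    """Store the value as <length prefix><data>, with the length counted in bytes."""
    raw = value.encode("utf-8")
    if len(raw) > max_bytes:
        raise ValueError("value too long for column")
    # Byte order is an assumption made for this sketch.
    return len(raw).to_bytes(prefix_size(max_bytes), "little") + raw

def decode_varchar(buf, offset, max_bytes):
    """Read one value starting at offset; return (value, offset of the next field)."""
    n = prefix_size(max_bytes)
    length = int.from_bytes(buf[offset:offset + n], "little")
    start = offset + n
    return buf[start:start + length].decode("utf-8"), start + length

buf = encode_varchar("abc", 10) + encode_varchar("hello", 1000)
first, next_offset = decode_varchar(buf, 0, 10)
second, _ = decode_varchar(buf, next_offset, 1000)
print(first, second)
```

Note that the prefix size depends on the column's maximum size, not on the individual value, so the reader can determine it from the schema before looking at the data.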

Using CHAR is appropriate if most values will use all the bytes specified in the length; adding an extra length byte will waste space.

VARCHAR is more appropriate when the actual values will be much shorter than the maximum length. For example, if the values can be up to 100 bytes, but most are less than 20, you'll save lots of space by using a single length byte in every row rather than adding 80+ spaces to most rows.
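As a rough worked example of that trade-off, the following Python sketch compares the two layouts for a made-up distribution of value lengths. The row counts and lengths are hypothetical, and a single-byte character set is assumed so that characters and bytes coincide.

```python
def char_bytes(width, n_rows):
    """Fixed-width storage: every row pays the full declared width."""
    return width * n_rows

def varchar_bytes(lengths, max_bytes):
    """Length-prefixed storage: each row pays its prefix plus its actual data."""
    prefix = 1 if max_bytes <= 255 else 2
    return sum(prefix + n for n in lengths)

# Hypothetical data: 900 values of 20 bytes and 100 values of 100 bytes.
lengths = [20] * 900 + [100] * 100

print(char_bytes(100, len(lengths)))   # 100000
print(varchar_bytes(lengths, 100))     # 29000
```

If every value were the full 100 bytes, the comparison would flip: the length-prefixed layout would need 101,000 bytes against 100,000 for the fixed-width one.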

薯片软お妹 2025-02-10 15:35:49

I am using MySQL 5.6.

I was surprised to see that CHAR and VARCHAR fields have the same storage format in *.MYD files (MyISAM). CHAR also has an additional leading byte that specifies the length of the value in each row, so CHAR is stored exactly the same as VARCHAR in *.MYD files.
I tested this with different character sets: utf8 (utf8mb3) and utf8mb4.

DROP TABLE IF EXISTS test.table1;
CREATE TABLE test.table1 (
  char_1 CHAR(1) NOT NULL,
  char_10 CHAR(10) NOT NULL,
  varchar_1 VARCHAR(1) NOT NULL,
  varchar_10 VARCHAR(10) NOT NULL
)
ENGINE = MYISAM,
CHARACTER SET utf8mb4,
CHECKSUM = 0,
COLLATE utf8mb4_unicode_ci;

INSERT INTO test.table1 VALUES ('A', 'ABCDEFGHIJ', 'A', 'ABCDEFGHIJ');

If you open the table1.MYD file in a hex viewer, you will see the following. I marked with red squares the places where the lengths of the CHAR and VARCHAR fields are stored.

[Figure: MyISAM CHAR storage]

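So in this table, each CHAR value is also preceded by 1 byte that stores the length of the string. This appears consistent with the MyISAM dynamic row format, which is used whenever a table contains a variable-length column such as VARCHAR and which stores nonempty strings as a length byte plus the string contents.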
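If no hex viewer is at hand, a small Python helper can produce a similar dump. This is a generic sketch, not a MySQL tool: the `hexdump` function is written for this illustration, and the `sample` bytes are hypothetical, built by hand to mimic a length byte followed by a value. They are not the real contents of table1.MYD.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex column, and printable-ASCII column."""
    pad = width * 3 - 1  # room for `width` two-digit bytes separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(pad)}  {text}")
    return "\n".join(lines)

# Hypothetical bytes: length 1 + "A", then length 10 + "ABCDEFGHIJ".
sample = b"\x01A\x0aABCDEFGHIJ"
print(hexdump(sample))
```

To inspect a real data file, read it in binary mode, for example `open(path, "rb").read()`, and pass the result to `hexdump`; where the file lives depends on your server's data directory.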

EDIT 1. InnoDB CHAR storage

Later I checked how CHAR is stored in InnoDB. There, too, each CHAR value has an additional byte in each row that gives the length of the field.

DROP TABLE IF EXISTS test.table_idb2;
CREATE TABLE test.table_idb2 (
  char_2 CHAR(2) NOT NULL,
  char_4 CHAR(4) NOT NULL,
  varchar_1 VARCHAR(1) NOT NULL,
  varchar_10 VARCHAR(10) NOT NULL
)
ENGINE = INNODB,
CHARACTER SET utf8mb4,
CHECKSUM = 0,
COLLATE utf8mb4_unicode_ci;

INSERT INTO test.table_idb2 VALUES ('A', 'AAAA', 'A', 'ABCDEFGHIJ');
INSERT INTO test.table_idb2 VALUES ('BB', 'DDD', 'B', 'ABCDEFGHIJ');
INSERT INTO test.table_idb2 VALUES ('', 'CC', 'D', 'DDDD');
FLUSH TABLES;

If you open the table_idb2.ibd file in a hex viewer, you will see the following. I marked with squares the places where the lengths of the CHAR values are stored.

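So in InnoDB, with the utf8mb4 character set used here, the CHAR data type also has 1 byte in each row where the length of the string is stored. This matches the InnoDB row format documentation, which says that for variable-length character sets such as utf8mb3 and utf8mb4, CHAR(N) is stored in a variable-length format with trailing spaces trimmed.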

[Figure: InnoDB CHAR storage]
