Why have people historically used 255 rather than 256 for database field sizes?
You often see database fields set to a magnitude of 255 characters. What is the traditional/historic reason for this? I assume it has something to do with paging/memory limits and performance, but the distinction between 255 and 256 has always confused me.
varchar(255)
Considering this is a capacity or magnitude, not an index, why is 255 preferred over 256? Is a byte reserved for some purpose (a terminator, a null marker, or something else)?
Presumably varchar(0) is nonsense (it has zero capacity)? In which case 2^8 of space should surely be 256?
Are there other magnitudes that provide performance benefits? For example, is varchar(512) less performant than varchar(511) or varchar(510)?
Is this value the same for all relational databases, old and new?
Disclaimer: I'm a developer, not a DBA. I use field sizes and types that suit my business logic where that is known, but I'd like to know the historic reason for this preference, even if it's no longer relevant (and even more so if it still is).
Edit:
Thanks for the answers. There seems to be some consensus that a byte is used to store the size, but this doesn't settle the matter definitively in my mind.
If the metadata (string length) is stored in the same contiguous memory/disk as the string, it makes some sense. 1 byte of metadata and 255 bytes of string data would suit each other very nicely and fit into 256 contiguous bytes of storage, which presumably is neat and tidy.
But if the metadata (string length) is stored separately from the actual string data (in a master table, perhaps), then constraining the length of the string data by one byte, just because it's easier to store a 1-byte integer of metadata, seems a bit odd.
In both cases, it would seem to be a subtlety that probably depends on the DB implementation. The practice of using 255 seems pretty widespread, so someone somewhere must have argued a good case for it in the beginning, can anyone remember what that case was/is? Programmers won't adopt any new practice without a reason, and this must have been new once.
Comments (12)
With a maximum length of 255 characters, the DBMS can choose to use a single byte to indicate the length of the data in the field. If the limit were 256 or greater, two bytes would be needed.
A value of length zero is certainly valid for varchar data (unless constrained otherwise). Most systems treat such an empty string as distinct from NULL, but some systems (notably Oracle) treat an empty string identically to NULL. For systems where an empty string is not NULL, an additional bit somewhere in the row would be needed to indicate whether the value should be considered NULL or not.
As you note, this is a historical optimisation and is probably not relevant to most systems today.
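To make the length-prefix argument concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how any particular DBMS lays out its rows; the names prefix_size and encode_field and the big-endian prefix are assumptions made up for the example.

```python
def prefix_size(max_len: int) -> int:
    """Bytes needed for a length prefix that can count from 0 up to max_len."""
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    # At least one byte, otherwise enough bytes to hold max_len's bits.
    return max(1, (max_len.bit_length() + 7) // 8)


def encode_field(data: bytes, max_len: int) -> bytes:
    """Store a value as <length prefix><payload> (hypothetical layout)."""
    if len(data) > max_len:
        raise ValueError("value too long for field")
    return len(data).to_bytes(prefix_size(max_len), "big") + data


if __name__ == "__main__":
    print(prefix_size(255), prefix_size(256))   # prefix widths for the two limits
    print(encode_field(b"abc", 255))            # one length byte, then the payload
    print(encode_field(b"abc", 256))            # two length bytes, then the payload
    print(encode_field(b"", 255))               # an empty string is just length 0
```

Under these assumptions, raising the declared limit from 255 to 256 costs one extra byte on every stored value, which is the whole of the saving being discussed.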
255 was the varchar limit in MySQL 4 and earlier.
Also 255 chars + Null terminator = 256
Or 1 byte length descriptor gives a possible range 0-255 chars
255 is the largest numerical value that can be stored in a single-byte unsigned integer (assuming 8-bit bytes) - hence, applications which store the length of a string for some purpose would prefer 255 over 256 because it means they only have to allocate 1 byte for the "size" variable.
From the MySQL manual:
Understand and make your choice.
255 is the maximum value of an unsigned 8-bit integer: binary 11111111 = 255.
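The arithmetic is easy to check for yourself. A tiny Python sketch (the variable names are just for illustration):

```python
# Eight binary ones are the largest value an unsigned 8-bit integer can hold.
max_byte = int("11111111", 2)
same_value = (1 << 8) - 1          # 2^8 - 1

# 256 needs a ninth bit, so it no longer fits in one byte.
bits_for_256 = (256).bit_length()

print(max_byte, same_value, bits_for_256)
```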
Recalling the fundamentals of bit/byte storage: it takes one byte to store integers below 256 and two bytes for any integer between 256 and 65535.
Hence, it requires the same space (two bytes) to store 511 or 512 or, for that matter, 65535.
Thus it is clear that the argument mentioned in the discussion above does not apply to varchar(512) or varchar(511).
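The point that 511, 512 and 65535 all cost the same can be checked with Python's struct module, where the "H" format code is an unsigned 16-bit integer. This is only a sketch of the arithmetic; the helper name two_byte_prefix is made up for the example, and real engines may encode lengths differently.

```python
import struct


def two_byte_prefix(length: int) -> bytes:
    """Pack a length into an unsigned 16-bit big-endian prefix."""
    return struct.pack(">H", length)


if __name__ == "__main__":
    # Every length from 256 up to 65535 occupies the same two bytes.
    for declared_max in (256, 511, 512, 65535):
        print(declared_max, len(two_byte_prefix(declared_max)))

    # 65536 is the next boundary: it does not fit in two bytes.
    try:
        two_byte_prefix(65536)
    except struct.error as exc:
        print("needs a wider prefix:", exc)
```

So the only size boundaries that matter for this particular argument are 255/256 and 65535/65536, not 511/512.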
A maximum length of 255 allows the database engine to use only 1 byte to store the length of each field. You are correct that 1 byte of space allows you to store 2^8=256 distinct values for the length of the string.
But if you allow the field to store zero-length text strings, you need to be able to store zero in the length. So you can allow 256 distinct length values, starting at zero: 0-255.
It used to be that all strings required a NUL terminator, or "backslash-zero". Updated databases don't have that. It was "255 characters of text" with a "\0" added automatically at the end so the system knew where the string ended. If you said VARCHAR(256), it would end up being 257 and then you'd be in the next register for one character. Wasteful. That's why everything was VARCHAR(255) and VARCHAR(31). Out of habit the 255 seems to have stuck around but the 31's became 32's and the 511's became 512's. That part is weird. It's hard to make myself write VARCHAR(256).
Often varchars are implemented as Pascal strings, holding the actual length in byte #0. The length was therefore bounded by 255. (The value of a byte ranges from 0 to 255.)
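For comparison, here is a rough Python sketch of the two layouts mentioned in this thread, a Pascal-style string with the length in byte 0 and a NUL-terminated string, each packed into a fixed 256-byte slot. The slot size, the zero padding and all function names are assumptions for illustration, not a description of any real storage engine.

```python
RECORD_SIZE = 256  # hypothetical fixed slot size in bytes


def pascal_slot(text: bytes) -> bytes:
    """Byte 0 holds the length; up to 255 payload bytes follow."""
    if len(text) > RECORD_SIZE - 1:
        raise ValueError("at most 255 bytes fit next to a 1-byte length")
    return (bytes([len(text)]) + text).ljust(RECORD_SIZE, b"\x00")


def nul_terminated_slot(text: bytes) -> bytes:
    """Payload followed by a NUL terminator; the payload may not contain NUL."""
    if len(text) > RECORD_SIZE - 1 or b"\x00" in text:
        raise ValueError("at most 255 non-NUL bytes fit before the terminator")
    return (text + b"\x00").ljust(RECORD_SIZE, b"\x00")


def read_pascal(slot: bytes) -> bytes:
    """Read the length from byte 0, then take that many payload bytes."""
    return slot[1:1 + slot[0]]


def read_nul_terminated(slot: bytes) -> bytes:
    """Scan forward until the first NUL byte."""
    return slot[:slot.index(b"\x00")]


if __name__ == "__main__":
    value = b"a" * 255
    print(len(pascal_slot(value)), len(nul_terminated_slot(value)))
    print(read_pascal(pascal_slot(b"hello")))
    print(read_nul_terminated(nul_terminated_slot(b"hello")))
```

Either way, one of the 256 bytes goes to bookkeeping and 255 are left for characters, so both explanations lead to the same number.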
8 bits unsigned = 256 values (0-255)
255 characters + byte 0 for the length = 256 bytes
I think this might answer your question. It looks like 255 was the maximum varchar length in earlier systems. I took this from another Stack Overflow question:
Are there disadvantages to using a generic varchar(255) for all text-based fields?
Data is stored in memory in binary, and 0 and 1 are the binary digits. The largest binary number that fits in 1 byte (8 bits) is 11111111, which is 255 in decimal.