Why specify a length for character varying types

Posted on 2024-12-02 22:48:12

Referring to the Postgres Documentation on Character Types, I am unclear on the point of specifying a length for character varying (varchar) types.

Assumptions:

the length of the string doesn't matter to the application.
you don't care if someone puts a value of the maximum size in the database
you have unlimited hard disk space

It does mention:

The storage requirement for a short string (up to 126 bytes) is 1 byte
plus the actual string, which includes the space padding in the case
of character. Longer strings have 4 bytes of overhead instead of 1.
Long strings are compressed by the system automatically, so the
physical requirement on disk might be less. Very long values are also
stored in background tables so that they do not interfere with rapid
access to shorter column values. In any case, the longest possible
character string that can be stored is about 1 GB. (The maximum value
that will be allowed for n in the data type declaration is less than
that. It wouldn't be useful to change this because with multibyte
character encodings the number of characters and bytes can be quite
different.

This talks about the size of the string, not the size of the field (i.e. it sounds like it will always compress a large string in a large varchar field, but not a small string in a large varchar field?).
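To make the quoted storage rule concrete, here is a rough Python sketch of it. The function name and the constant are made up for illustration, and it models only the header-plus-payload rule from the quote: it ignores compression, out-of-line storage, and any per-row overhead. Note that the declared length n is not an input at all, which is consistent with the documentation describing the size of the string rather than the size of the field.

```python
# Hypothetical helper modelling only the rule quoted from the documentation:
# 1 byte of overhead for strings up to 126 bytes, 4 bytes for longer ones.
SHORT_STRING_MAX_BYTES = 126  # threshold taken from the quoted passage


def estimated_storage_bytes(value: str, encoding: str = "utf-8") -> int:
    """Estimate the uncompressed size of one stored string value.

    The declared column length (the n in varchar(n)) is deliberately
    not a parameter: the quoted rule depends only on the actual value.
    """
    payload = len(value.encode(encoding))  # bytes, not characters
    header = 1 if payload <= SHORT_STRING_MAX_BYTES else 4
    return header + payload


if __name__ == "__main__":
    for sample in ["Paris", "a" * 400, "\u00e9" * 100]:
        chars = len(sample)
        size = estimated_storage_bytes(sample)
        print(f"{chars} characters -> about {size} bytes before compression")
```

The last sample is 100 characters but 200 bytes in UTF-8, which is the character/byte mismatch the quoted passage mentions for multibyte encodings.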

I ask this question as it would be much easier (and lazier) to specify a much larger size so you never have to worry about having a string that is too large. For example, if I specify varchar(50) for a place name I will get locations that have more characters (e.g. Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch), but if I specify varchar(100) or varchar(500), I'm less likely to get that problem.
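As a small illustration of the place-name problem, the following Python sketch mimics a character-count limit like the one varchar(n) imposes. It is a simplification, not the database's actual implementation: `check_varchar`, `fits`, and `ValueTooLongError` are hypothetical names, and edge cases such as values whose excess characters are all spaces are ignored.

```python
class ValueTooLongError(ValueError):
    """Hypothetical stand-in for the error a database raises on overflow."""


def check_varchar(value: str, n: int) -> str:
    """Return value unchanged if it has at most n characters, else raise."""
    if len(value) > n:  # the limit counts characters, not bytes
        raise ValueTooLongError(
            f"value too long for type character varying({n})"
        )
    return value


def fits(value: str, n: int) -> bool:
    """Convenience wrapper: True when value passes the length check."""
    try:
        check_varchar(value, n)
    except ValueTooLongError:
        return False
    return True


PLACE = "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"

if __name__ == "__main__":
    for limit in (50, 100, 500):
        print(f"varchar({limit}): fits = {fits(PLACE, limit)}")
```

With this place name, the 50-character limit rejects the value while 100 and 500 accept it, which is exactly the trade-off described above: a larger n only moves the point at which a real-world value gets rejected.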

So would you get a performance hit between varchar(500) and (arbitrarily) varchar(5000000) or text() if your largest string was, say, 400 characters long?

Also out of interest if anyone has the answer to this AND knows the answer to this for other databases, please add that too.

I have googled, but not found a sufficiently technical explanation.

浮生面具三千个 2024-12-09 22:48:12

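My understanding is that constraints are useful for data integrity, so I use column sizes both to validate data items at the lower layer and to better describe the data model.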

Some links on the matter:

VARCHAR(n) Considered Harmful
CHAR(x) vs. VARCHAR(x) vs. VARCHAR vs. TEXT: http://www.depesz.com/index.php/2010/03/02/charx-vs-varcharx-vs-varchar-vs-text/
In Defense of varchar(x)