MySQL: What VARCHAR size as a PK is as fast as an INT?
What would be a good/safe maximum length for a VARCHAR primary key column so that it is not much slower, if at all, than an INTEGER ID, using MySQL 5 + InnoDB on a 64-bit system? Note that this PK should be assumed to be referenced by other tables, so it will appear in a number of JOINs.
Would a VARCHAR(7) be a good length? 6? 8? 10? More? Less? Why?
It might be hard to answer, but there should at least be an upper limit based on facts, e.g. based on the inner workings of MySQL/InnoDB (index structures, ... ?).
Edit: Assume the ASCII character encoding, case sensitive.
Comments (3)
Using an int field is generally preferable for keys. Two numbers can be compared in a single assembler instruction on pretty much any platform in existence. Comparing two strings will invariably require a loop and extra setup steps in advance, or take multiple cpu cycles even if the cpu has string comparison instructions built in.
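As a rough illustration of that point (not a description of how MySQL implements it), the sketch below spells out the byte-by-byte loop that a binary-collation string comparison amounts to, next to a single integer comparison. The function names are made up for this example, and details such as trailing-space padding are ignored.

```python
def compare_binary(a: bytes, b: bytes) -> int:
    """Return a negative, zero, or positive number, memcmp-style."""
    # Walk both keys one byte at a time until they differ.
    for x, y in zip(a, b):
        if x != y:
            return x - y
    # All shared bytes match, so the shorter key sorts first.
    return len(a) - len(b)


def compare_int(a: int, b: int) -> int:
    """Integer keys need no loop: one subtraction decides the order."""
    return a - b


print(compare_binary(b"ABC1234", b"ABC1235"))  # negative: first key sorts first
print(compare_int(7, 42))                      # negative: 7 sorts first
```

Note that two string keys sharing a long prefix force the loop to run almost to the end before a difference shows up, which is the extra work the answer refers to.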
Anything larger than VARCHAR(4) in ASCII encoding (with the ascii_bin collation; you don't want a case-insensitive collation for relational operations) will be slower than INT, because an INT is 4 bytes long.
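The size argument can be put into numbers with a small sketch. It assumes MySQL's documented storage rule for VARCHAR (the data bytes plus a 1-byte length prefix when the column holds at most 255 bytes, 2 bytes otherwise) and a single-byte character set such as ASCII; the function names are invented for this example, and InnoDB's actual on-page layout has further per-row overhead that is not modeled here.

```python
def int_key_bytes() -> int:
    # MySQL stores an INT in 4 bytes (a BIGINT would take 8).
    return 4


def varchar_key_bytes(chars: int, bytes_per_char: int = 1) -> int:
    """Approximate stored size of a fully used VARCHAR value.

    Assumption: data bytes plus a 1-byte length prefix when the column
    can hold at most 255 bytes, otherwise a 2-byte prefix.
    """
    data = chars * bytes_per_char
    prefix = 1 if data <= 255 else 2
    return data + prefix


for n in (4, 6, 7, 8, 10):
    print(f"VARCHAR({n}): {varchar_key_bytes(n)} bytes vs INT: {int_key_bytes()} bytes")
```

By this estimate a fully used VARCHAR(4) already takes 5 bytes and a VARCHAR(7) takes 8, the same as a BIGINT. The width matters beyond the primary key itself, because InnoDB stores the primary key value in every secondary index entry and the referencing tables carry it in their foreign key columns.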
Not much slower to do what? SELECT? UPDATE? INSERT? My internal users want faster inserts; my web users want faster selects.
In part, performance (whatever that means) depends on your particular database structure, your particular query patterns, and your particular server hardware. What did your own tests show you?
If there were an upper limit, you wouldn't be able to count on it remaining unchanged across even minor version upgrades. And, of course, the query optimizer makes decisions at run time, not at design time. Basing long-term database design decisions on dbms internals as they exist today is not a Best Practice. (That's just an observation. I'm not implying that's what you're doing, but a lot of people who read this are liable to do just that.)
If you want to know how a particular set of tables performs, it makes sense to edit your question and include the DDL. That way we're at least talking about the same thing. As it is now, everybody who answers is probably going to be using a different structure. (If they bother to test at all.) And we might not reveal our private--and sometimes unwarranted--assumptions.
In one specific case--from another SO question--using id numbers and joins took 100 times as long to execute as using a natural key. (32-bit PostgreSQL.) So people can talk all day about how many CPU instructions it takes to compare integers or strings, or the number of bytes in an integer, or the number of bytes in UTF-8 collations, or whatever. Nevertheless, in that specific case, VARCHAR(30) won by a landslide.
When I was in the military, we had a saying. "When your map and the terrain disagree, follow the terrain."
If theory and measurements disagree, follow the measurements. Develop rules of thumb from measurements.
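One way to act on that advice is a small, repeatable harness that builds the same parent/child schema twice, once with an integer key and once with a VARCHAR key, and times the same join on both. The sketch below is only a shape for such a test: it uses SQLite from Python's standard library so that it runs without a server, which means its timings say nothing about MySQL/InnoDB. The table names, row counts, and key format are all invented for this example; for real numbers, run the equivalent DDL and queries against your own MySQL server and data.

```python
import sqlite3
import time


def build(conn, key_type, make_key, rows=2000):
    """Create a parent/child pair keyed by key_type and fill it."""
    conn.execute(f"CREATE TABLE parent (id {key_type} PRIMARY KEY, payload TEXT)")
    conn.execute(
        f"CREATE TABLE child (id INTEGER PRIMARY KEY, "
        f"parent_id {key_type} REFERENCES parent(id))"
    )
    conn.execute("CREATE INDEX child_parent ON child (parent_id)")
    keys = [make_key(i) for i in range(rows)]
    conn.executemany("INSERT INTO parent VALUES (?, ?)", [(k, "x") for k in keys])
    # Three child rows per parent, so the join has real work to do.
    conn.executemany(
        "INSERT INTO child (parent_id) VALUES (?)",
        [(k,) for k in keys for _ in range(3)],
    )
    conn.commit()


def time_join(key_type, make_key, repeats=5):
    """Return (joined row count, best wall-clock time) for one key type."""
    conn = sqlite3.connect(":memory:")
    build(conn, key_type, make_key)
    query = "SELECT COUNT(*) FROM child JOIN parent ON parent.id = child.parent_id"
    best = float("inf")
    count = None
    for _ in range(repeats):
        start = time.perf_counter()
        count = conn.execute(query).fetchone()[0]
        best = min(best, time.perf_counter() - start)
    conn.close()
    return count, best


int_count, int_time = time_join("INTEGER", lambda i: i)
txt_count, txt_time = time_join("VARCHAR(7)", lambda i: f"K{i:06d}")
print(f"INTEGER key:    {int_count} rows joined, best of 5: {int_time:.6f}s")
print(f"VARCHAR(7) key: {txt_count} rows joined, best of 5: {txt_time:.6f}s")
```

Taking the best of several runs reduces noise from caching and scheduling. The same structure extends to timing INSERT and UPDATE separately, since those may matter more to some users than SELECT.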