How do I choose optimized data types for columns [InnoDB-specific]?
I'm learning about the usage of data types in databases.
For example:
- Which is better for email? varchar[100], char[100], or tinyint (joking)
- Which is better for username? Should I use int, bigint, or varchar? Explain. Some of my friends say that it is better to use int, bigint, or another numeric data type (Facebook does it). Like u=123400023 refers to user 123400023, rather than user=thenameoftheuser, since numbers take less time to fetch.
- Which is better for phone numbers? Posts (like in blogs or announcements)? Or maybe dates (I use datetime for that)? Maybe some of you have done research that you would like to share.
- Product price (I use decimal(11,2), don't know about you guys)?
- Or anything else that you have in mind, like, "I use serial datatype for blablabla".
Why do I mention InnoDB specifically?
Unless you are using the InnoDB table types (see Chapter 11, "Advanced MySQL," for more information), CHAR columns are faster to access than VARCHAR.
InnoDB has some difference that I don't know about.
I read that from here.
Brief Summary:
(just my opinions)
- email address: VARCHAR(255)
- username: VARCHAR(100) or VARCHAR(255)
- id_username: INT (unless you plan on over 2 billion users in your system)
- phone numbers: INT or VARCHAR or maybe CHAR (depends on if you want to store formatting)
- posts: TEXT
- dates: DATE or DATETIME (definitely include times for things like posts or emails)
- money: DECIMAL(11,2)
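To make the summary concrete, here is a rough sketch of how those choices might look as MySQL table definitions. The table names (`users`, `products`), the extra columns, and the `VARCHAR(20)` length for phone numbers are assumptions for illustration only, not something the summary prescribes.

```sql
-- Hypothetical tables: names, column lengths, and keys are illustrative assumptions.
CREATE TABLE users (
    -- Signed INT tops out at 2,147,483,647; UNSIGNED raises that to 4,294,967,295.
    id_username INT UNSIGNED NOT NULL AUTO_INCREMENT,
    username    VARCHAR(100) NOT NULL,
    email       VARCHAR(255) NOT NULL,
    -- VARCHAR keeps formatting such as a leading '+' or leading zeros, which INT would drop.
    phone       VARCHAR(20)  NULL,
    created_at  DATETIME     NOT NULL,
    PRIMARY KEY (id_username),
    UNIQUE KEY uq_users_username (username)
) ENGINE=InnoDB;

CREATE TABLE products (
    id_product INT UNSIGNED  NOT NULL AUTO_INCREMENT,
    name       VARCHAR(255)  NOT NULL,
    -- 11 digits in total, 2 of them after the decimal point.
    price      DECIMAL(11,2) NOT NULL,
    PRIMARY KEY (id_product)
) ENGINE=InnoDB;
```

With this layout, a URL such as u=123400023 would look up `users.id_username`, while the readable name stays in `username`.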
As far as using InnoDB because VARCHAR is supposed to be faster, I wouldn't worry about that, or speed in general. Use InnoDB because you need to do transactions and/or you want to use foreign key constraints (FK) for data integrity. Also, InnoDB uses row-level locking whereas MyISAM only uses table-level locking. Therefore, InnoDB can handle higher levels of concurrency better than MyISAM. Use MyISAM to use full-text indexes (InnoDB only gained them in MySQL 5.6) and for somewhat less overhead.
More importantly for speed than the engine type: put indexes on the columns that you need to search on quickly. Always put indexes on your ID/PK columns, such as the id_username that I mentioned.
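As a minimal sketch of the points above (foreign keys, transactions, and indexes on the columns you search on), something like the following could work. The `accounts` and `orders` tables and all of their column names are hypothetical.

```sql
-- Hypothetical tables used only to illustrate FK constraints, indexes, and a transaction.
CREATE TABLE accounts (
    id_username INT UNSIGNED NOT NULL AUTO_INCREMENT,
    username    VARCHAR(100) NOT NULL,
    PRIMARY KEY (id_username)            -- the primary key is always indexed
) ENGINE=InnoDB;

CREATE TABLE orders (
    id_order    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    id_username INT UNSIGNED NOT NULL,
    ordered_at  DATETIME     NOT NULL,
    PRIMARY KEY (id_order),
    KEY idx_orders_user (id_username),   -- index the column you join and search on
    CONSTRAINT fk_orders_user
        FOREIGN KEY (id_username) REFERENCES accounts (id_username)
) ENGINE=InnoDB;

-- Both inserts succeed together or not at all.
START TRANSACTION;
INSERT INTO accounts (username) VALUES ('alice');
INSERT INTO orders (id_username, ordered_at) VALUES (LAST_INSERT_ID(), NOW());
COMMIT;
```

MyISAM would accept the same table definitions but would not enforce the foreign key or roll back a failed transaction, which is the practical reason to pick InnoDB here.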
More details:
Here's a bunch of questions about MySQL datatypes and database design (warning, more than you asked for):
- What DataType should I pick?
- Table design question
- Enum datatype versus table of data in MySQL?
- mysql datatype for telephone number and address
- Best mysql datatype for grams, milligrams, micrograms and kilojoule
- MySQL 5-star rating datatype?
And a couple of questions on when to use the InnoDB engine:
- MyISAM versus InnoDB
- When should you choose to use InnoDB in MySQL?
I just use tinyint for almost everything (seriously).
Edit - How to store "posts":
Below are some links with more details, but here's the short version. For storing "posts," you need room for a long text string. CHAR max length is 255, so that's not an option, and of course CHAR would waste unused characters versus VARCHAR, which is variable-length CHAR.
Prior to MySQL 5.0.3, VARCHAR max length was 255, so you'd be left with TEXT. However, in newer versions of MySQL, you can use VARCHAR or TEXT. The choice comes down to preference, but there are a couple of differences. VARCHAR and TEXT max length is now both 65,535 bytes, but you can set your own max on VARCHAR. Let's say you think your posts will only need to be 2000 characters max: you can set VARCHAR(2000). If you ever run into the limit, you can ALTER your table later and bump it to VARCHAR(3000). On the other hand, TEXT actually stores its data in a BLOB (1). I've heard that there may be performance differences between VARCHAR and TEXT, but I haven't seen any proof, so you may want to look into that more, but you can always change that minor detail in the future.
More importantly, searching this "post" column using a Full-Text Index instead of LIKE would be much faster (2). However, you have to use the MyISAM engine to use a full-text index because InnoDB doesn't support it (this applies to MySQL versions before 5.6; from 5.6 onward InnoDB supports full-text indexes as well). In a MySQL database, you can have a heterogeneous mix of engines for each table, so you would just need to make your "posts" table use MyISAM. However, if you absolutely need "posts" to use InnoDB (for transactions), then set up a trigger to update the MyISAM copy of your "posts" table and use the MyISAM copy for all your full-text searches.
See bottom for some useful quotes.
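A rough sketch of the "posts" advice might look like this. The table name `posts`, its columns, and the search term are made up for illustration, and the engine follows the answer's MyISAM suggestion.

```sql
-- Hypothetical posts table with a size-capped body and a full-text index.
CREATE TABLE posts (
    id_post   INT UNSIGNED  NOT NULL AUTO_INCREMENT,
    title     VARCHAR(255)  NOT NULL,
    body      VARCHAR(2000) NOT NULL,    -- your own maximum; TEXT is the alternative
    posted_at DATETIME      NOT NULL,
    PRIMARY KEY (id_post),
    FULLTEXT KEY ft_posts_body (body)
) ENGINE=MyISAM;                         -- on MySQL 5.6+ ENGINE=InnoDB accepts FULLTEXT too

-- If 2000 characters turns out to be too small, widen the column later.
ALTER TABLE posts MODIFY body VARCHAR(3000) NOT NULL;

-- Full-text search: can use the FULLTEXT index.
SELECT id_post, title
FROM posts
WHERE MATCH(body) AGAINST('datatype');

-- LIKE with a leading wildcard: has to scan every row.
SELECT id_post, title
FROM posts
WHERE body LIKE '%datatype%';
```

Note that the two queries are not strictly equivalent: MATCH ... AGAINST works on whole words and ranks by relevance, while LIKE matches any substring.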
MySQL Data Type Chart (outdated)
MySQL Datatypes (outdated)
Chapter 10. Data Types (better details)
The BLOB and TEXT Types (1)
11.9. Full-Text Search Functions (2): http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
10.4.1. The CHAR and VARCHAR Types (3)
Lastly, here's a great post about the pros and cons of VARCHAR versus TEXT. It also speaks to the performance issue:
There are multiple angles to approach your question.
From a design POV it is always best to choose the datatype which best expresses the quantity you want to model. That is, get the data domain and data size right so that illegal data cannot be stored in the database in the first place. But that is not where MySQL is strong, and especially not with the default sql_mode (http://dev.mysql.com/doc/refman/5.1/en/server-sql-mode.html). If it works for you, try the TRADITIONAL sql_mode, which is a shorthand for many desirable flags.
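As a small illustration of what the TRADITIONAL sql_mode changes, the sketch below sets it for the current session and then tries to store an out-of-range value. The table `stock_levels` and its column are hypothetical.

```sql
-- Enable the strict, "traditional SQL" behaviour for this session only.
SET SESSION sql_mode = 'TRADITIONAL';

-- Show which individual flags the shorthand expanded to.
SELECT @@SESSION.sql_mode;

-- Hypothetical table: TINYINT UNSIGNED accepts 0..255.
CREATE TABLE stock_levels (
    qty TINYINT UNSIGNED NOT NULL
) ENGINE=InnoDB;

-- With TRADITIONAL set, this statement is rejected with an out-of-range error.
-- Without strict mode, MySQL would store 255 and only raise a warning.
INSERT INTO stock_levels (qty) VALUES (300);
```

To make the setting permanent you would put it in the server configuration rather than setting it per session.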
From a performance POV, the question is entirely different. For example, regarding the storage of email bodies, you might want to read http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/ and then think about that.
Removing redundancies and having short keys can be a big win. For example, in a project that I have seen, a log table had been storing HTTP User-Agent information. By simply replacing each user agent string in the log table with the numeric id of that string in a lookup table, data set size was reduced considerably (by more than 60%). By parsing the user agent further and then storing a bunch of ids (operating system, browser type, version index), data set size was reduced to 1% of the original size.
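The lookup-table idea could be sketched roughly as follows. The table and column names (`user_agents`, `access_log`, `agent_md5`) and the use of an MD5 hash as a short unique key are my assumptions, not details of the project described above.

```sql
-- Hypothetical lookup table: each distinct User-Agent string is stored once.
CREATE TABLE user_agents (
    id_user_agent INT UNSIGNED  NOT NULL AUTO_INCREMENT,
    agent_md5     BINARY(16)    NOT NULL,   -- UNHEX(MD5(agent)): short key for uniqueness
    agent         VARCHAR(1024) NOT NULL,
    PRIMARY KEY (id_user_agent),
    UNIQUE KEY uq_user_agents_md5 (agent_md5)
) ENGINE=InnoDB;

-- The log table stores a 4-byte id instead of the full string.
CREATE TABLE access_log (
    id_log        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    logged_at     DATETIME        NOT NULL,
    id_user_agent INT UNSIGNED    NOT NULL,
    PRIMARY KEY (id_log),
    KEY idx_access_log_agent (id_user_agent)
) ENGINE=InnoDB;

-- Register the string once; a repeat insert is skipped because of the unique hash.
INSERT IGNORE INTO user_agents (agent_md5, agent)
VALUES (UNHEX(MD5('Mozilla/5.0 (example)')), 'Mozilla/5.0 (example)');

-- Log a request by id rather than by string.
INSERT INTO access_log (logged_at, id_user_agent)
SELECT NOW(), id_user_agent
FROM user_agents
WHERE agent_md5 = UNHEX(MD5('Mozilla/5.0 (example)'));
```

Splitting the agent further into operating system, browser type, and version ids would follow the same pattern with one small lookup table per attribute.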
Finally, there are a number of rules that can help you spot errors in schema design.
For example, anything that has id in the name and is not an unsigned integer type is probably a bug (especially in the context of InnoDB).
For example, anything that has price or cost in the name and is not unsigned is a potential source of fraud (fraudster creates article with negative price, and buys that).
For example, anything that works on monetary data and is not using the DECIMAL data type of the appropriate size is probably doing math wrong (DECIMAL is doing BCD, decimal paper math with correct precision and rounding, DOUBLE and FLOAT do not).
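These rules of thumb might look like this in practice. The `articles` table is hypothetical, and the final query is only meant to show exact decimal arithmetic next to approximate floating-point arithmetic.

```sql
-- Hypothetical table following the three rules above.
CREATE TABLE articles (
    id_article INT UNSIGNED           NOT NULL AUTO_INCREMENT,  -- ids: unsigned integer
    price      DECIMAL(11,2) UNSIGNED NOT NULL,                 -- money: DECIMAL, never negative
    PRIMARY KEY (id_article)
) ENGINE=InnoDB;

-- In strict mode this is rejected as out of range; without strict mode the value
-- is adjusted to 0 with a warning instead of being stored as a negative price.
INSERT INTO articles (price) VALUES (-5.00);

-- Literals like 0.1 are exact DECIMAL values; literals with an exponent (0.1E0)
-- are approximate DOUBLE values, so the two comparisons are expected to differ.
SELECT 0.1   + 0.2   = 0.3   AS exact_decimal,
       0.1E0 + 0.2E0 = 0.3E0 AS approximate_double;
```

Newer MySQL releases (8.0.17 and later) deprecate UNSIGNED on DECIMAL columns, so a CHECK (price >= 0) constraint may be the better way to express the same rule there.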
SQLyog has a Calculate Optimal Datatype feature, which helps you find the optimal data type based on the records already inserted in a table.
It uses
query to find out optimal datatype