Does MySQL index NULL values?
I have a MySQL table where an indexed INT column is going to be 0 for 90% of the rows. If I change those rows to use NULL instead of 0, will they be left out of the index, making the index about 90% smaller?
Comments (5)
http://dev.mysql.com/doc/refman/5.0/en/is-null-optimization.html
MySQL can perform the same optimization on col_name IS NULL that it can use for col_name = constant_value. For example, MySQL can use indexes and ranges to search for NULL with IS NULL.
It looks like it does index the NULLs too. Reference.
Allowing a column to be NULL adds to the storage requirements of the column (up to a byte, depending on the storage engine). This increases the index size, which is probably not good. That said, if a lot of your queries change to use IS NULL or IS NOT NULL, they might be faster overall than doing value comparisons.
My gut tells me to go with NOT NULL, but there's only one real answer: test!
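One way to run the test suggested above is to build both variants side by side and compare them. This is only a sketch: the table names t_zero and t_null, the seed values, and the row counts are hypothetical, and it assumes a MySQL server where you can create scratch tables. Index sizes are reported in whole pages, so with only a handful of rows both indexes will look identical; grow the data before drawing conclusions.

```sql
-- Hypothetical scratch tables: same layout, one uses 0 and one uses NULL.
CREATE TABLE t_zero (
  id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  val INT NOT NULL DEFAULT 0,
  KEY idx_val (val)
);

CREATE TABLE t_null (
  id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  val INT NULL,
  KEY idx_val (val)
);

-- Seed with 90% zeros (illustrative values only).
INSERT INTO t_zero (val) VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(7);

-- Each run of this statement doubles the row count; repeat until the
-- table is large enough for size differences to show up.
INSERT INTO t_zero (val) SELECT val FROM t_zero;

-- Copy the same rows, mapping 0 to NULL.
INSERT INTO t_null (id, val)
SELECT id, NULLIF(val, 0) FROM t_zero;

-- Refresh statistics, then compare data and index sizes (in bytes).
ANALYZE TABLE t_zero, t_null;

SELECT table_name, table_rows, data_length, index_length
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name IN ('t_zero', 't_null');

-- See whether the optimizer chooses idx_val for each predicate.
EXPLAIN SELECT id FROM t_zero WHERE val = 0;
EXPLAIN SELECT id FROM t_null WHERE val IS NULL;
```

If index_length comes out roughly equal for the two tables, that supports the answers here saying the NULL rows stay in the index.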
No, it will continue to include them, but don't assume too much about the consequences in either case. A lot depends on the range of the other values (search for "cardinality").
MSSQL (SQL Server 2008 and later) has an index type called a "filtered index" for this kind of situation: it includes rows in the index based on a filter. dBASE-type systems used to have a similar capability, and it was pretty handy.
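For illustration, a filtered index in SQL Server might look like the sketch below. Note that this is T-SQL, not MySQL syntax, and the table, column, and index names are hypothetical.

```sql
-- T-SQL (SQL Server), not MySQL. dbo.orders, val, and the index name
-- are made-up names for this example.
-- Only rows that satisfy the WHERE predicate are stored in the index,
-- so the 90% of rows with val = 0 would be left out.
CREATE NONCLUSTERED INDEX ix_orders_val_nonzero
ON dbo.orders (val)
WHERE val <> 0;
```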
Each index has a cardinality, meaning the number of distinct values that are indexed. AFAIK it isn't accurate to say that an index repeats the same value for many rows; rather, the index maps a repeated value to the clustered-index entries of the many rows that share it (here, the rows with NULL in this column). Keeping those clustered-index references means that each row with a NULL in the indexed column still takes up space as large as the PK. For this reason, experts recommend keeping the PK reasonably small if you have a composite PK.
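To look at the cardinality mentioned above, MySQL reports an estimate for each index. A minimal sketch, assuming a hypothetical table named my_table:

```sql
-- my_table is a hypothetical name; substitute your own table.
-- Refresh the index statistics first so the estimates are current.
ANALYZE TABLE my_table;

-- The Cardinality column is an estimate of the number of distinct
-- values in each index; the Null column shows YES when the indexed
-- column may contain NULL.
SHOW INDEX FROM my_table;
```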