Does creating an index on a VARCHAR database field speed up inserts?
I have two tables in my database: page and link. In each one, the URL field is declared UNIQUE because I don't want repeated URLs.
Since the field is UNIQUE, does it automatically have an index? Would creating an index on these fields speed up insertions? What is the most appropriate index type for a VARCHAR field?
Can having a lot of rows slow down inserts because of this UNIQUE field? At the moment, I have 1,200,000 rows.
Comments (3)
Yes, adding a UNIQUE constraint will create an index.
This won't speed up your INSERTs, though; it will actually slow them down.
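One way to see the automatic index for yourself is a quick sketch like the one below. It uses Python's built-in sqlite3 module as a stand-in because it needs no server; the question is about a different setup, so treat the table definition (the `id` column, the `VARCHAR(2048)` length) and the helper `indexes_on` as assumptions made up for illustration. SQLite, like most systems, backs a UNIQUE constraint with a unique index.

```python
import sqlite3


def indexes_on(conn, table):
    """Return (index_name, is_unique) for every index defined on `table`."""
    rows = conn.execute(f"PRAGMA index_list({table})").fetchall()
    # PRAGMA index_list columns start with: seq, name, unique, ...
    return [(row[1], bool(row[2])) for row in rows]


# Hypothetical schema: only the UNIQUE url column comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page (id INTEGER PRIMARY KEY, url VARCHAR(2048) UNIQUE)"
)

# No CREATE INDEX was issued, yet the table already has a unique index.
page_indexes = indexes_on(conn, "page")
print(page_indexes)
```

Since the index already exists, creating a second one on the same column by hand would only add more work per insert.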
Logically speaking, a constraint is one thing, and an index is another. Constraints have to do with data integrity; indexes have to do with speed.
Practically speaking, most dbms implement a unique constraint by building a unique index. A unique index lets the dbms determine more quickly whether the values you're trying to insert are already in the table.
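As a rough illustration of the constraint doing its job through the index, here is a small sketch, again with Python's sqlite3 as a stand-in rather than the asker's actual database. The `link` table layout and the helper `insert_url` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the UNIQUE url column comes from the question.
conn.execute(
    "CREATE TABLE link (id INTEGER PRIMARY KEY, url VARCHAR(2048) UNIQUE)"
)


def insert_url(conn, url):
    """Try to store `url`; return True if inserted, False if it already exists."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute("INSERT INTO link (url) VALUES (?)", (url,))
        return True
    except sqlite3.IntegrityError:
        # The unique index found an existing entry with the same value.
        return False


first = insert_url(conn, "http://example.com/a")
second = insert_url(conn, "http://example.com/a")  # same URL again
count = conn.execute("SELECT COUNT(*) FROM link").fetchone()[0]
print(first, second, count)
```

The duplicate check is an index lookup, which is why it stays cheap even as the table grows; the price is that every successful insert also has to add an entry to that index.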
I suppose an index on a VARCHAR() column might speed up an insert under certain circumstances. But generally an index slows inserts, because the dbms has to update both the table and the index.
A suitable index will speed up updates, because the dbms can find the rows to be updated more quickly. (But it might have to update the index, too, which costs you a little bit.)
PostgreSQL can tell you which indexes it's using. See EXPLAIN.
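To give a feel for what asking the planner looks like, here is a sketch using SQLite's `EXPLAIN QUERY PLAN` from Python. This is a stand-in for PostgreSQL's `EXPLAIN`, whose output format is different; the `title` column, the sample rows, and the helper `plan_for` are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: url is UNIQUE (indexed), title has no index.
conn.execute(
    "CREATE TABLE page (id INTEGER PRIMARY KEY, "
    "url VARCHAR(2048) UNIQUE, title TEXT)"
)
conn.executemany(
    "INSERT INTO page (url, title) VALUES (?, ?)",
    [(f"http://example.com/{i}", f"t{i}") for i in range(1000)],
)


def plan_for(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


# Lookup on the UNIQUE column: the planner can use the automatic index.
by_url = plan_for(
    conn, "SELECT title FROM page WHERE url = ?", ("http://example.com/42",)
)
# Lookup on an unindexed column: the planner has to scan the table.
by_title = plan_for(conn, "SELECT url FROM page WHERE title = ?", ("t42",))

print(by_url)
print(by_title)
```

The first plan names the index it searches, while the second reports a scan of the whole table.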
B-tree/B+tree indexes are the most common kind. Inserts and updates are most likely slower with them, whereas selecting a single row, selecting a range, and ORDER BY (ascending in most cases) are very quick. This is because the index is ordered, so an insert has to find the right position in the index instead of just appending to the end of the table. With a clustered index, inserts and updates are even worse because of page splits.
Being unique would probably make it a bit slower, since each insert also has to check the index for an existing entry to make sure the value is unique.
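If you want to measure the cost rather than guess, a rough sketch like this one times the same bulk insert with and without the UNIQUE constraint. It uses an in-memory SQLite database from Python as a stand-in, so the absolute numbers say nothing about your server; the row count, the URL pattern, and the helper `timed_insert` are all assumptions for illustration.

```python
import sqlite3
import time


def timed_insert(unique, n=20000):
    """Insert n distinct URLs in shuffled order; return (row_count, seconds)."""
    conn = sqlite3.connect(":memory:")
    constraint = "UNIQUE" if unique else ""
    conn.execute(
        f"CREATE TABLE page (id INTEGER PRIMARY KEY, url VARCHAR(2048) {constraint})"
    )
    # 7919 is coprime with n, so this visits every value 0..n-1 exactly once,
    # in a scattered order that makes the index insert at varying positions.
    urls = [(f"http://example.com/{(i * 7919) % n}",) for i in range(n)]
    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO page (url) VALUES (?)", urls)
    elapsed = time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
    conn.close()
    return rows, elapsed


rows_plain, secs_plain = timed_insert(unique=False)
rows_unique, secs_unique = timed_insert(unique=True)
print(f"no index:     {rows_plain} rows in {secs_plain:.3f}s")
print(f"UNIQUE index: {rows_unique} rows in {secs_unique:.3f}s")
```

Expect the UNIQUE run to be somewhat slower in most cases, but the gap depends heavily on the engine, the hardware, and the data, so run it against something close to your real setup before drawing conclusions.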
Also, VARCHAR is generally not the best choice for an indexed column if you are looking for optimal performance; an integer is much faster if one can be used. So there really is no 'best' index for VARCHAR: each index type has its own strengths and weaknesses, and there's always a tradeoff. It really depends on the situation and what you plan to do with it. Do you only need inserts and updates, or do you also need to run selects? These are the things you need to ask.