如何正确索引数据库以提高查询性能
我正在使用 OpenID 开发简单的登录页面:如果用户刚刚注册了 OpenID,那么我需要在数据库中为用户创建一个新条目,否则我只是显示他们的别名和问候语。每次有人使用他们的 Open ID 进行身份验证时,我都必须通过查找哪个用户拥有给定的 OpenID 来找到他们的别名,如果主键是 UserID(并且有数百万用户),那么似乎可能会相当慢。
我使用的是 SQL Server 2008,数据库中有两个表(Users 和 OpenIDs):我计划检查 OpenIDs 表中是否存在 Open ID,然后使用相应的 UserID 从用户表。
Users 表按 UserID 索引,并具有以下列:
- UserID (pk)
- Alias
- OpenID (fk)
OpenIDs 表按 OpenID 索引,并具有以下列:
- OpenID (pk)
- UserID (fk)
或者,我可以索引Users表由UserID和OpenID组成(即有2个索引),并完全删除OpenIDs表。
在这种情况下,改进对具有匹配 OpenID 的用户的查询的推荐方法是什么:使用两个键索引 Users 表或使用 OpenIDs 表查找匹配的 UserID?
I'm working on simple log in page using OpenID: if the user has just registered for an OpenID, then I need to create a new entry in the database for the user, otherwise I just display their alias with a greeting. Every time somebody gets authenticated with their Open ID, I must find their alias by looking up which user has the given OpenID and it seems that it might be fairly slow if the primary key is the UserID (and there are millions of users).
I'm using SQL Server 2008 and I have two tables in my database (Users and OpenIDs): I plan the check if the Open ID exists in the OpenIDs table, then use the corresponding UserID to get the rest of the user information from the Users table.
The Users table is indexed by UserID and has the following columns:
- UserID (pk)
- Alias
- OpenID (fk)
The OpenIDs table is indexed by OpenID and has the following columns:
- OpenID (pk)
- UserID (fk)
Alternately, I can index the Users table by UserID and OpenID (i.e have 2 indexes) and completely drop the OpenIDs table.
What would be the recommended way to improve the query for a user with the matching OpenID in this case: index the Users table with two keys or use the OpenIDs table to find the matching UserID?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
可能是创建数据库索引的最佳实践和“经验法则”有哪些?可以帮助您。
May be the answers to What are some best practises and “rules of thumb” for creating database indexes? can help you.
在不知道您将详细运行哪种查询的情况下,我建议对两个外键列建立索引 -
Users.OpenID
和OpenIDs.UserID
。对外键建立索引通常是一个好主意,有助于 JOIN 条件和其他查询。
但老实说,如果您仅使用 OpenIDs 表来检查 OpenID 是否存在,那么最好只建立索引(可能是唯一索引?)
Users
表中的列并完成操作。您现在拥有的 OpenIDs 表根本没有任何实际用途 - 只是占用了冗余信息的空间。除此之外:您需要观察应用程序的行为方式,对一些使用数据进行采样,然后查看运行最频繁和最长的查询类型,然后开始进行性能调整。不要过度进行提前性能优化 - 太多索引可能比没有索引更糟糕!
事实上,恰恰相反!如果您有一个在数百万行中唯一的值,那么找到该单个值实际上相当快 - 即使对于数百万用户也是如此。只需进行几次(最多 5-6 次)比较,然后爆炸!你的用户是一百万中的一个。如果您在 OpenID 列上有索引,那么速度确实应该相当快。这种高度选择性的索引(一个值选出百万分之一)的工作效率非常高。
Without knowing what kind of queries you'll be running in detail, I would recommend indexing the two foreign key columns -
Users.OpenID
andOpenIDs.UserID
.Indexing the foreign keys is typically a good idea to help with JOIN conditions and other queries.
But quite honestly, if you use the
OpenIDs
table only to check the existance of anOpenID
, you'd be much better off just indexing (possibly a unique index?) that column in theUsers
table and be done with it. ThatOpenIDs
table as you have it now serves no real purpose at all - just takes up space for redundant information.Other than that: you need to observe how your application behaves, samples some usage data, and then see what kind of queries are running the most often, and the longest, and then start doing performance tweaking. Don't over-do the ahead-of-time performance optimizations - too many indices can be worse than having none at all !
Actually, quite the contrary! If you have a value that's unique amongst millions of rows, finding that single value is actually quite quick - even with millions of users. It will take only a handful (max. 5-6) comparisons, and bang! you have your one user out of a million. If you have an index on that
OpenID
column, that should be pretty fast indeed. Such a highly selective index (one value picks out 1 in a million) work very very efficiently.