SQL Server 2008 table structure and Guid performance considerations
I'm in a dilemma right now about whether or not to use a Guid.
I have a non-transactional table called Posts with a bigint as PK.
As far as I understand, using a Guid as the PK would hamper query performance. However, in order to make the query string truly unique, I decided to add a column called specialID with a Guid default value of newid(). This would make all my query strings truly unique, since all I need to do is execute the following query:
SELECT *
FROM Posts p
WHERE p.specialID = '[query-string]'; -- For single record retrieval
As for Joins, bigint PK will come into play as follows:
SELECT p.id, p.specialID, ul.name as Writer
FROM Posts p
JOIN Users ul ON ul.id = p.writer;
My colleague, however, disagreed and said that it would still hamper query performance. Why? And should I continue this way? A truly unique query string isn't necessary, but would be preferred. If it would indeed hamper performance, how can we go about having a truly unique query string?
It shouldn't significantly hamper SELECT queries, especially if you index the column properly. It can potentially affect inserts, but that's less of an issue if the GUID isn't part of a clustered index. It can also affect storage requirements, depending on how much data you're going to store, since a uniqueidentifier takes 16 bytes against 8 bytes for a bigint.
A detailed discussion is available at http://www.sql-server-performance.com/2005/guid-performance/ (it was written for SQL Server 2005, but I believe all the points are still valid).
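As a rough illustration of the layout discussed above, the table could look something like the sketch below. Only the names Posts, id, specialID and writer come from the question; the IDENTITY setting, the type of writer and the constraint names are assumptions made for the example.

```sql
-- Hypothetical definition of the Posts table from the question.
-- IDENTITY, the type of writer and the constraint names are assumed.
CREATE TABLE Posts (
    id        bigint IDENTITY(1,1) NOT NULL,
    -- Each new row gets a random GUID unless one is supplied.
    specialID uniqueidentifier NOT NULL
        CONSTRAINT DF_Posts_specialID DEFAULT NEWID(),
    writer    bigint NOT NULL,  -- joined to Users.id in the question
    -- The narrow, ever-increasing bigint stays the clustered key,
    -- so the GUID does not decide the physical row order.
    CONSTRAINT PK_Posts PRIMARY KEY CLUSTERED (id)
);
```

With this shape, inserts append at the end of the clustered index, and the random GUID values only affect whatever nonclustered index is built on specialID.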
** EDIT: simple index example **
A 'covering index' just means you have an index which includes every column the query in question needs, so the query can be answered from the index alone. A clustered index means that the records are actually stored in the order that the index says; a nonclustered index means that the index holds a pointer to the stored location. Think of the difference between a dictionary and a book's index. A dictionary is sorted in word order and stores all its data in that order (clustered), while a book's index is sorted in word order but only points to page numbers, which aren't in the same order (nonclustered).
So to make an index for your column, you would do:
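A minimal sketch of such a statement, assuming the Posts table and specialID column from the question (the index name is hypothetical):

```sql
-- Index name is made up for illustration; Posts and specialID
-- are the table and column named in the question.
CREATE INDEX IX_Posts_specialID
    ON Posts (specialID);
```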
The default is nonclustered, but you can add the NONCLUSTERED keyword if you want to be explicit.
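To tie the covering and nonclustered ideas together, here is a hedged sketch of an index that would cover a single-record lookup returning id, specialID and writer. It assumes id is the clustered key of Posts (SQL Server stores the clustered key in every nonclustered index, which is why id does not need to be listed). The index name, the UNIQUE option, the included column and the placeholder GUID are all assumptions for illustration.

```sql
-- Hypothetical covering index: specialID is the key column,
-- writer is carried along at the leaf level via INCLUDE.
CREATE UNIQUE NONCLUSTERED INDEX IX_Posts_specialID_covering
    ON Posts (specialID)
    INCLUDE (writer);

-- This lookup can be answered from the index alone, because it
-- only touches specialID, writer and the clustered key id.
SELECT p.id, p.specialID, p.writer
FROM Posts p
WHERE p.specialID = '00000000-0000-0000-0000-000000000000';  -- placeholder GUID
```

An index like this would be used instead of, not in addition to, a plain single-column index on specialID. A SELECT * would still need a lookup into the clustered index for the remaining columns.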