SQL Server - Guid VS。 长的

发布于 2024-07-29 02:38:39 字数 235 浏览 3 评论 0原文

到目前为止,我一直在使用 C#“Guid = Guid.NewGuid();” 方法生成一个唯一的 ID,该 ID 可以使用 Linq to SQL 作为 ID 字段存储在我的一些 SQL Server 数据库表中。 我被告知,出于索引原因,使用 GUID 是一个坏主意,我应该使用自动递增的 Long 来代替。 使用 long 会加速我的数据库事务吗? 如果是这样,我该如何生成 Long 类型的唯一 ID?

问候,

Up until now i've been using the C# "Guid = Guid.NewGuid();" method to generate a unique ID that can be stored as the ID field in some of my SQL Server database tables using Linq to SQL.
I've been informed that for indexing reasons, using a GUID is a bad idea and that I should use an auto-incrementing Long instead. Will using a long speed up my database transactions? If so, how do I go about generating unique ID's that are of type Long?

Regards,

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

遥远的她 2024-08-05 02:38:39

两者都有优点和缺点,这完全取决于你如何使用它们。

如果您需要可以跨多个数据库工作的标识符,那么您就需要 GUID。 Long 有一些技巧(手动为每个数据库分配不同的种子/增量),但这些技巧不能很好地扩展。

就索引而言,如果索引是聚集的(默认情况下主键是聚集的,但这可以针对您的表进行修改),Long 将提供更好的插入性能,因为表不需要在每次插入后重新组织。

然而,就并发插入而言,Long(身份)列将比 GUID 慢 - 身份列生成需要一系列独占锁来确保只有一行获得下一个序列号。 在许多用户始终插入许多行的环境中,这可能会影响性能。 在这种情况下 GUID 生成速度更快。

在存储方面,GUID 占用的空间是 Long 的两倍(8 字节 vs 16 字节)。 然而,如果 8 字节会对一个叶子中容纳的记录数量产生显着影响,从而在平均请求期间从磁盘中提取的叶子数量产生显着差异,则取决于行的总体大小。

Both have pros and cons, it depends entirely on how you use them that matters.

Right off the bat, if you need identifiers that can work across several databases, you need GUIDs. There are some tricks with Long (manually assigning each database a different seed/increment), but these don't scale well.

As far as indexing goes, Long will give much better insert performance if the index is clustered (by default primary keys are clustered, but this can be modified for your table), since the table does not need to be reorganized after every insert.

As far as concurrent inserts are concerned however, Long (identity) columns will be slower then GUID - identity column generation requires a series of exclusive locks to ensure that only one row gets the next sequential number. In an environment with many users inserting many rows all the time, this can be a performance hit. GUID generation in this situation is faster.

Storage wise, a GUID takes up twice the space of a Long (8 bytes vs 16). However it depends on the overall size of your row if 8 bytes is going to make a noticable difference in how many records fit in one leaf, and thus the number of leaves pulled from disk during an average request.

时常饿 2024-08-05 02:38:39

“索引女王” - Kim Tripp - 基本上在她的索引博客文章中说明了一切:

基本上,她的最佳实践是:最佳集群键应该是:

  • 唯一的
  • 稳定(从不) 。
  • 不断增加的

GUID 违反了“小”和“不断增加”的原则,因此不是最佳的

另外:所有聚集键都将添加到每个非聚集索引中的每个条目(作为实际在数据库中查找记录的查找),因此您希望使它们尽可能小(INT = 4 字节与 GUID = 16 字节)。 如果您有数亿行和多个非聚集索引,那么选择 INT 或 BIGINT 而不是 GUID 可能会产生重大影响 - 即使只是在空间方面。

马克

The "Queen of Indexing" - Kim Tripp - basically says it all in her indexing blog posts:

Basically, her best practices are: an optimal clustering key should be:

  • unique
  • small
  • stable (never changing)
  • ever-increasing

GUID's violate the "small" and "ever-increasing" and are thus not optimal.

PLUS: all your clustering keys will be added to each and every single entry in each and every single non-clustered index (as the lookup to actually find the record in the database), thus you want to make them as small as possible (INT = 4 byte vs. GUID = 16 byte). If you have hundreds of millions of rows and several non-clustered indices, choosing an INT or BIGINT over a GUID can make a major difference - even just space-wise.

Marc

少跟Wǒ拽 2024-08-05 02:38:39

long(sql server 中的 big int)是 8 个字节,Guid 是 16 个字节,因此您将 sql server 在查找时必须比较的字节数减半。

要生成 long,请在数据库中创建字段时使用 IDENTITY(1,1)。

因此,要么使用 create table 要么 alter table:

Field_NAME BIGINT NOT NULL PRIMARY KEY IDENTITY(1,1)

请参阅将 Linq 发布到 sql 的注释

A long (big int in sql server) is 8 bytes and a Guid is 16 bytes, so you are halving the number of the bytes sql server has to compare when doing a look up.

For generating a long, use IDENTITY(1,1) when you create the field in the database.

so either using create table or alter table:

Field_NAME BIGINT NOT NULL PRIMARY KEY IDENTITY(1,1)

See comments for posting Linq to sql

苄①跕圉湢 2024-08-05 02:38:39

当您需要考虑导入/导出到多个数据库时,请使用 guid。 在处理多个子关系的数据集时,指南通常比指定 IDENTITY 属性的列更容易使用。 这是因为您可以在与数据库断开连接的状态下在代码中随机生成 guid,然后一次提交所有更改。 当正确生成指南时,它们就很难被偶然复制。 对于标识列,您通常必须在添加子数据之前对父行进行初始插入并查询其新标识。 然后,您必须使用新的父标识更新所有子记录,然后再将它们提交到数据库。 这同样适用于孙子等继承体系中的其他人。 它增加了许多看似不必要且平凡的工作。 您可以通过在没有 IDENTITY 规范的情况下使用随机整数来执行类似于 Guid 的操作,但是随着时间的推移插入更多记录,冲突的可能性会大大增加。 (Guid.NewGuid() 类似于随机 Int128 - 目前还不存在)。

我将 Byte (TinyInt)、Int16 (SmallInt)、Int32/UInt16 (Int)、Int64/UInt32 (BigInt) 用于不更改的小型查找列表或不在多个数据库之间复制的数据。 (权限、应用程序配置、颜色名称等)

我想无论您使用的是 guid 还是 long,索引都需要同样长的时间来查询。 通常,索引表中还有其他大于 128 位的字段(例如用户表中的用户名)。 Guid 和 Integers 之间的区别在于内存中索引的大小,以及填充和重建索引的时间。 大多数数据库事务通常是读取。 写作是最少的。 首先集中精力优化从数据库的读取,因为它们通常由未正确优化的连接表、不正确的分页或丢失索引组成。

与任何事情一样,最好的办法就是证明你的观点。 创建一个包含两个表的测试数据库。 一个具有整数/长整型主键,另一个具有 guid。 用 N 百万行填充每个行。 在 CRUD 操作(创建、读取、更新、删除)期间监视每个操作的性能。 您可能会发现它确实对性能造成了影响,但微不足道。

服务器通常运行在没有调试环境的盒子上,并且其他应用程序占用CPU、内存和硬盘驱动器的I/O(尤其是RAID)。 开发环境只能让您了解性能。

Use guids when you need to consider import/export to multiple databases. Guids are often easier to use than columns specifying the IDENTITY attribute when working with a dataset of multiple child relationships. this is because you can randomly generate guids in the code in a disconnected state from the database, and then submit all changes at once. When guids are generated properly, they are insainely hard to duplicate by chance. With identity columns, you often have to do an intial insert of a parent row and query for it's new identity before adding child data. You then have to update all child records with the new parent identity before committing them to the database. The same goes for grandchildren and so on down the heirarchy. It builds up to a lot of work that seems unnecessary and mundane. You can do something similar to Guids by comming up with random integers without the IDENTITY specification, but the chance of collision is greatly increased as you insert more records over time. (Guid.NewGuid() is similar to a random Int128 - which doesn't exist yet).

I use Byte (TinyInt), Int16 (SmallInt), Int32/UInt16 (Int), Int64/UInt32 (BigInt) for small lookup lists that do not change or data that does not replicate between multiple databases. (Permissions, Application Configuration, Color Names, etc.)

I imagine the indexing takes just as long to query against regardless if you are using a guid or a long. There are usually other fields in tables that are indexed that are larger than 128 bits anyway (user names in a user table for example). The difference between Guids and Integers is the size of the index in memory, as well as time populating and rebuilding indexes. The majority of database transactions is often reading. Writing is minimal. Concentrate on optimizing reading from the database first, as they are usually made of joined tables that were not optimized properly, improper paging, or missing indexes.

As with anything, the best thing to do is to prove your point. create a test database with two tables. One with a primary key of integers/longs, and the other with a guid. Populate each with N-Million rows. Moniter the performance of each during the CRUD operations (create, read, update, delete). You may find out that it does have a performance hit, but insignificant.

Servers often run on boxes without debugging environments and other applications taking up CPU, Memory, and I/O of hard drive (especially with RAID). A development environment only gives you an idea of performance.

以酷 2024-08-05 02:38:39

您可以整天争论 GUID 或身份。 我更喜欢数据库生成具有身份的唯一值。 如果合并来自多个数据库的数据,请添加另一列(用于标识源数据库,可能是tinyint或smallint)并形成复合主键。

如果您确实使用身份,请务必根据将生成的预期键数选择正确的数据类型:

bigint - 8 Bytes - max positive value: 9,223,372,036,854,775,807  
int    - 4 Bytes - max positive value:             2,147,483,647

注意“预期键数”与行数不同。 如果你主要是添加和保留行,你可能会发现一个 INT 就足够了,有超过 20 亿个唯一键。 我敢打赌你的桌子不会那么大。 但是,如果您有一个大容量表,并且不断添加和删除行,则行数可能会很低,但您会快速浏览键。 您应该做一些计算,看看遍历 INT 20 亿个密钥需要多少 log。 如果它不会很快用完它们,请使用 INT,否则将密钥大小加倍并使用 BIGINT。

You can debate GUID or identity all day. I prefer the database to generate the unique value with an identity. If you merge data from multiple databases, add another column (to identify the source database, possibly a tinyint or smallint) and form a composite primary key.

If you do go with an identity, be sure to pick the right datatype, based on number of expected keys you will generate:

bigint - 8 Bytes - max positive value: 9,223,372,036,854,775,807  
int    - 4 Bytes - max positive value:             2,147,483,647

Note "number of expected keys " is different than the number of rows. If you mainly add and keep rows, you may find that an INT is enough with over 2 billion unique keys. I'll bet your table won't get that big. However, if you have a high volume table where you keep adding and removing rows, you row count may be low, but you'll go through keys fast. You should do some calculations to see how log it would take to go through the INTs 2 billion keys. If it won't use them up any time soon go with INT, otherwise double the key size and go with BIGINT.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文