GUIDs as primary keys - offline OLTP

Posted 2024-07-04 06:12:26

We are working on designing an application that is typically OLTP (think: purchasing system). However, this one in particular has the need that some users will be offline, so they need to be able to download the DB to their machine, work on it, and then sync back once they're on the LAN.

I would like to note that I know this has been done before; I just don't have experience with this particular model.

One idea I thought about was using GUIDs as table keys. So for example, a Purchase Order would not have a number (auto-numeric) but a GUID instead, so that every offline client can generate those, and I don't have clashes when I connect back to the DB.

Is this a bad idea for some reason?
Will access to these tables through the GUID key be slow?

Have you had experience with this type of system? How have you solved this problem?

Thanks!
Daniel

Comments (15)

稀香 2024-07-11 06:12:26

If your database is small enough to download to a laptop and work with it offline, you probably don't need to worry too much about the performance differences between ints and Guids. But do not underestimate how useful ints are when developing and troubleshooting a system! You will probably need to come up with some fairly complex import/synch logic regardless of whether or not you are using Guids, so they might not help as much as you think.

孤者何惧 2024-07-11 06:12:26

@Simon,

You raise very good points. I was already thinking about the "temporary" "human-readable" numbers I'd generate while offline, that I'd recreate on sync. But I wanted to avoid doing that with foreign keys, etc.

甜心小果奶 2024-07-11 06:12:26

@Portman By default, PK == clustered index: creating a primary key constraint automatically creates a clustered index, so you need to specify NONCLUSTERED if you don't want it clustered.

舂唻埖巳落 2024-07-11 06:12:26

First thought that comes to mind: Hasn't MS designed the DataSet and DataAdapter model to support scenarios like this?

I believe I read that MS changed their ADO recordset model to the current DataSet model so it works great offline too. There's also Sync Services for ADO.NET.

I believe I have seen code that uses the DataSet model with foreign keys, and they still sync perfectly when using the DataAdapter. I haven't tried Sync Services, though, but I think you might be able to benefit from it too.

Hope this helps.

话少心凉 2024-07-11 06:12:26

@SqlMenace

"There are other problems with GUIDs, you see GUIDs are not sequential, so inserts will be scattered all over the place, this causes page splits and index fragmentation"

Not true. Primary key != clustered index.

If the clustered index is another column ("inserted_on" springs to mind) then the inserts will be sequential and no page splits or excessive fragmentation will occur.

攒一口袋星星 2024-07-11 06:12:26

Using Guids as primary keys is acceptable and is considered a fairly standard practice for the same reasons that you are considering them. They can be overused which can make things a bit tedious to debug and manage, so try to keep them out of code tables and other reference data if at all possible.

The thing that you have to concern yourself with is the human-readable identifier. Guids cannot be exchanged by people: can you imagine trying to confirm your order number over the phone if it is a guid? So in an offline scenario you may still have to generate something, like a publisher (workstation/user) id plus a sequence number, so the order number might be 123-5678.

However, this may not satisfy a business requirement for a sequential number. In fact, regulatory requirements can be an influence: some regulations (SOX, maybe) require that invoice numbers are sequential. In such cases it may be necessary to generate a sort of proforma number which is fixed up later when the systems synchronise. You may end up with tables having OrderId (Guid), OrderNo (int), ProformaOrderNo (varchar), so some complexity may creep in.

At least having guids as primary keys means that you don't have to do a whole lot of cascading updates when the sync does eventually happen - you simply update the human readable number.
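To make the proforma idea concrete, here is a rough Python sketch of how it might look. The class and field names (Order, OfflineClient, sync, proforma_order_no) are made up for illustration and simply mirror the OrderId / OrderNo / ProformaOrderNo columns mentioned above; a real system would do this in the database and the sync layer.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Order:
    # Hypothetical columns mirroring OrderId (Guid), OrderNo (int), ProformaOrderNo (varchar).
    proforma_order_no: str
    order_id: uuid.UUID = field(default_factory=uuid.uuid4)
    order_no: Optional[int] = None  # stays empty until the central server assigns it


class OfflineClient:
    def __init__(self, workstation_id: int) -> None:
        self.workstation_id = workstation_id
        self._next_seq = 1

    def new_order(self) -> Order:
        # Human-readable while offline: "<workstation>-<local sequence>".
        proforma = f"{self.workstation_id}-{self._next_seq}"
        self._next_seq += 1
        return Order(proforma_order_no=proforma)


def sync(orders: List[Order], last_order_no: int) -> int:
    """Assign gap-free order numbers centrally; return the new high-water mark."""
    for order in orders:
        if order.order_no is None:
            last_order_no += 1
            order.order_no = last_order_no
    return last_order_no


client = OfflineClient(workstation_id=123)
pending = [client.new_order(), client.new_order()]
ids_before = [o.order_id for o in pending]
high_water = sync(pending, last_order_no=5677)
```

Note that sync only fills in order_no. The GUID key (and therefore every foreign key pointing at it) is untouched, which is the "no cascading updates" point.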

数理化全能战士 2024-07-11 06:12:26

The backend will be SQL Server 2005
Frontend / Application Logic will be .Net

Besides GUIDs, can you think of other ways to resolve the "merge" that happens when the offline computer syncs the new data back into the central database?
I mean, if the keys are INTs, I'll basically have to renumber everything when importing. GUIDs will spare me that.
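A small Python sketch of why the merge gets simpler with GUID keys. Everything here (new_order, new_line, merge, the dict-based "tables") is a hypothetical stand-in for the real .NET / SQL Server code, just to show the shape of the idea:

```python
import uuid


def new_order(customer: str) -> dict:
    # Each client mints its own key; no round trip to the server is needed.
    return {"order_id": uuid.uuid4(), "customer": customer}


def new_line(order: dict, item: str) -> dict:
    # The child row stores the parent's GUID as its foreign key.
    return {"line_id": uuid.uuid4(), "order_id": order["order_id"], "item": item}


def merge(central: dict, rows: list, key: str) -> None:
    """Copy rows into the central table, refusing to overwrite an existing key."""
    for row in rows:
        if row[key] in central:
            raise KeyError(f"duplicate key {row[key]}")
        central[row[key]] = row


# Two laptops work offline and each create an order with one line.
order_a = new_order("Acme")
line_a = new_line(order_a, "stapler")
order_b = new_order("Globex")
line_b = new_line(order_b, "toner")

central_orders, central_lines = {}, {}
merge(central_orders, [order_a, order_b], key="order_id")
merge(central_lines, [line_a, line_b], key="line_id")

# Foreign keys still resolve after the merge, with no renumbering step.
orphans = [l for l in central_lines.values() if l["order_id"] not in central_orders]
```

With local auto-increment INTs, both laptops would most likely have produced order 1, and merge would raise on the second one unless every row and every reference to it were renumbered first.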

A君 2024-07-11 06:12:26

Guids will certainly be slower (and use more memory) than standard integer keys, but whether or not that is an issue will depend on the type of load your system will see. Depending on your backend DB there may be issues with indexing guid fields.

Using guids simplifies a whole class of problems, but you pay for it partly with performance and partly with debuggability: typing guids into those test queries will get old real fast!

深爱不及久伴 2024-07-11 06:12:26

I'm just going to point you to "What are the performance improvement of Sequential Guid over standard Guid?", which covers the GUID talk.

For human readability, consider assigning machine IDs and then using sequential numbers from those machines as a possibility. This will require managing the assignment of machine IDs, though. Could be done in one or two columns.

I'm personally fond of the SGUID answer, though.

屌丝范 2024-07-11 06:12:26

Make sure to use guid.comb; it takes care of the indexing issues. If you are still dealing with performance issues after that, then you will be, in short order, an expert on scaling.
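For anyone who hasn't seen a COMB GUID, here is a simplified Python sketch of the idea. It is not the exact guid.comb algorithm; it only shows the principle of keeping most of the value random while putting a timestamp in the last 6 bytes, which is the byte group SQL Server is commonly described as comparing first when ordering uniqueidentifier values. The function names and the millisecond-since-epoch encoding are my own assumptions.

```python
import os
import time
import uuid
from typing import Optional


def comb_guid(now_ms: Optional[int] = None) -> uuid.UUID:
    """Random 16-byte value whose last 6 bytes are a big-endian millisecond timestamp."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    random_part = os.urandom(10)                # bytes 0-9: randomness for uniqueness
    time_part = now_ms.to_bytes(8, "big")[-6:]  # bytes 10-15: low 48 bits of the clock
    return uuid.UUID(bytes=random_part + time_part)


def sql_server_sort_key(g: uuid.UUID) -> bytes:
    # Assumption: only the last 6-byte group is modelled here, since that is
    # the group assumed to be compared first.
    return g.bytes[10:]


earlier = comb_guid(now_ms=1_700_000_000_000)
later = comb_guid(now_ms=1_700_000_000_001)
```

Values generated later sort after earlier ones under that key, so new rows tend to land at the end of the index. Values created within the same millisecond on different machines still differ in their random part.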

Another reason to use GUIDs is to enable database refactoring. Say you decide to apply polymorphism or inheritance or whatever to your Customers entity. You now want Customers and Employees to derive from Person and have them share a table. Having really unique identifiers makes data migration simple. There are no sequences or integer identity fields to fight with.

长梦不多时 2024-07-11 06:12:26

I would start by looking at SQL Server Compact Edition for this! It helps with all of your issues.

Data Storage Architecture with SQL Server 2005 Compact Edition

It is specifically designed for:

Field force applications (FFAs). FFAs usually share one or more of the following attributes:

  • They allow the user to perform their job functions while disconnected from the back-end network: on-site at a client location, on the road, in an airport, or from home.

  • FFAs are usually designed for occasional connectivity, meaning that when users are running the client application, they do not need to have a network connection of any kind. FFAs often involve multiple clients that can concurrently access and use data from the back-end database, both in a connected and disconnected mode.

  • FFAs must be able to replicate data from the back-end database to the client databases for offline support. They also need to be able to replicate modified, added, or deleted data records from the client to the server when the application is able to connect to the network.

单调的奢华 2024-07-11 06:12:26

Using GUIDs saved us a lot of work when we had to merge two databases into one.

上课铃就是安魂曲 2024-07-11 06:12:26

You're correct that this is an old problem, and it has two canonical solutions:

  • Use unique identifiers as the primary key. Note that if you're concerned about readability you can roll your own unique identifier instead of using a GUID. A unique identifier will use information about the date and the machine to generate a unique value.

  • Use a composite key of 'Actor' + identifier. Every user gets a numeric actor ID, and the keys of newly inserted rows use the actor ID as well as the next available identifier. So if two actors both insert a new row with ID "100", the primary key constraint will not be violated.

Personally, I prefer the first approach, as I think composite keys are really tedious as foreign keys. I think the human readability complaint is overstated: end-users shouldn't have to know anything about your keys anyway!
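For comparison, the composite-key option from the second bullet might look roughly like this. The sketch uses Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; the table and column names (purchase_order, actor_id, local_id) are invented, and the real thing would live in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE purchase_order (
        actor_id INTEGER NOT NULL,   -- hypothetical: one id per user/laptop
        local_id INTEGER NOT NULL,   -- next free number on that actor
        supplier TEXT NOT NULL,
        PRIMARY KEY (actor_id, local_id)
    )
    """
)

# Two actors each insert "their" row 100 while offline; both survive the merge.
conn.execute("INSERT INTO purchase_order VALUES (1, 100, 'Acme')")
conn.execute("INSERT INTO purchase_order VALUES (2, 100, 'Globex')")

# The same actor reusing an id is still rejected.
try:
    conn.execute("INSERT INTO purchase_order VALUES (1, 100, 'Initech')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM purchase_order").fetchone()[0]
```

The catch is the one already mentioned: every child table has to carry both actor_id and local_id to reference a purchase order, which is where the tedium comes from.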

半葬歌 2024-07-11 06:12:26

"Will access to these tables through the GUID key be slow?"

There are other problems with GUIDs. GUIDs are not sequential, so inserts will be scattered all over the place, which causes page splits and index fragmentation.

In SQL Server 2005, MS introduced NEWSEQUENTIALID() to fix this. The only problem for you might be that you can only use NEWSEQUENTIALID() as a default value in a table.
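To illustrate the page-split point, here is a toy Python model. It is not how SQL Server actually lays out pages; it only assumes fixed-capacity sorted "pages" that split 50/50 when a key lands in the middle of a full page. Seeded random 128-bit integers stand in for random GUID values, and increasing integers stand in for sequential ones.

```python
import bisect
import random


def simulate(keys, page_capacity=50):
    """Insert keys into sorted 'pages'; return (mid-page splits, average fill)."""
    pages = [[]]  # each page is a sorted list of keys
    splits = 0
    for key in keys:
        # Locate the page whose key range should hold this key.
        firsts = [p[0] for p in pages[1:]]
        idx = bisect.bisect_right(firsts, key)
        page = pages[idx]
        if len(page) < page_capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            pages.append([key])  # append at the very end: new page, no split
            continue
        # Full page and the key belongs inside it: split the page in half.
        bisect.insort(page, key)
        half = len(page) // 2
        pages[idx:idx + 1] = [page[:half], page[half:]]
        splits += 1
    fill = sum(len(p) for p in pages) / (len(pages) * page_capacity)
    return splits, fill


rng = random.Random(42)
random_keys = [rng.getrandbits(128) for _ in range(5000)]
sequential_keys = list(range(5000))

rand_splits, rand_fill = simulate(random_keys)
seq_splits, seq_fill = simulate(sequential_keys)
```

In this model the sequential keys never split a page and leave every page full, while the random keys split pages repeatedly and leave them partly empty. That is the fragmentation being described.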

你的笑 2024-07-11 06:12:26

This is a perfectly good use of GUIDs. The only drawbacks would be a slight complexity in working with GUIDs over INTs and the slight size difference (16 bytes vs 4 bytes).

I don't think either of those is a big deal.
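As a rough back-of-the-envelope check on the size difference, counting only the raw key bytes and ignoring row and index overhead (the one-million-row table is a made-up figure):

```python
import struct
import uuid

GUID_BYTES = len(uuid.uuid4().bytes)  # size of a GUID value
INT_BYTES = struct.calcsize("<i")     # size of a 32-bit integer


def key_storage_mib(rows: int, key_bytes: int) -> float:
    """Raw bytes used by the key column alone, in MiB."""
    return rows * key_bytes / (1024 * 1024)


rows = 1_000_000  # hypothetical table size
guid_mib = key_storage_mib(rows, GUID_BYTES)
int_mib = key_storage_mib(rows, INT_BYTES)
```

The GUID column takes four times the space of the INT column, and the same multiplier applies wherever the key is repeated, such as in foreign key columns.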
