Sometimes-offline application architecture question
I have an n-tier WinForms client-server application running against a SQL Server database. I want it to be able to run "offline" sometimes (not connected to the DB) and, on reconnect, reconcile the changes to the main DB. Now I have a tough architecture decision to make: should I use database replication, or manage it myself using queues/scripts etc.? My application is quite complicated - I use a database whose tables contain auto-increment keys and foreign key constraints between tables. Part of my data, like pictures and documents, is not embedded in the DB. I would very much like to hear your opinions and past experience! Thanks, Adi
(Disclaimer: I'm assuming that you've already considered using .NET DataSets and discounted them, given that they're designed to help with just the problem domain that you're describing.)
I used to work for a company that developed a point-of-sale system for its nationwide chain of shops. The master database was stored at head office, while each shop had its own cut-down version of this database stored locally at that site. Effectively, each shop was off-line all the time, so it's not quite the situation that you're describing, however we had to deal with some of the synchronisation/replication issues that I imagine you will need to deal with.
Our data communications happened each night: shops would connect to head office at a pre-determined time, upload a package of data changes, and download a similar package of data changes that were to be applied to that shop's local database. We then had what you might call 'data sync engines' at both sites (head office & shops) which would process these data packets, folding the changes (inserts/updates/deletions) back into the relevant database.
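The 'data sync engine' step described above can be sketched roughly as follows. This is a minimal illustration, not the original system's code: the table schema, packet format, and use of SQLite are all assumptions made for the example.

```python
import sqlite3

# Fold a packet of changes (inserts/updates/deletions) into a local
# database, as a nightly sync engine would. The 'items' table and the
# packet shape are hypothetical.
def apply_packet(conn, packet):
    cur = conn.cursor()
    for change in packet:
        op, row = change["op"], change["row"]
        if op == "insert":
            cur.execute("INSERT INTO items (id, name) VALUES (?, ?)",
                        (row["id"], row["name"]))
        elif op == "update":
            cur.execute("UPDATE items SET name = ? WHERE id = ?",
                        (row["name"], row["id"]))
        elif op == "delete":
            cur.execute("DELETE FROM items WHERE id = ?", (row["id"],))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
apply_packet(conn, [
    {"op": "insert", "row": {"id": "a1", "name": "widget"}},
    {"op": "insert", "row": {"id": "b2", "name": "gadget"}},
    {"op": "update", "row": {"id": "a1", "name": "widget v2"}},
    {"op": "delete", "row": {"id": "b2"}},
])
rows = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
print(rows)  # [('a1', 'widget v2')]
```

In the real system the same engine ran at both ends, so the download and upload packets were processed by identical code.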
When you perform basic data replication like this, there are a number of potential pitfalls as Sergio has mentioned. One is identity, namely how you derive a primary key that uniquely identifies a table row. Another is versioning, and how you handle conflicts between different versions of the same row.
In our case, we made things easy(-ier!) for ourselves by using GUIDs as primary keys rather than using auto-increment columns. Using GUIDs is not without its issues, but in our case it meant that we could assign a primary key to a new data row and not have to worry about anyone else using it.
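The GUID-as-primary-key idea is easy to demonstrate; the snippet below is an illustrative sketch (the row shape is invented), showing why an offline client can mint keys without coordinating with head office.

```python
import uuid

# An offline client assigns GUID primary keys itself; because GUIDs
# are globally unique, uploading these rows later cannot collide with
# keys generated at head office or at any other site.
def new_row(name):
    return {"id": str(uuid.uuid4()), "name": name}

offline_rows = [new_row("sale-1"), new_row("sale-2")]

# No central sequence was consulted, yet every key is distinct.
assert len({r["id"] for r in offline_rows}) == len(offline_rows)
```

The usual caveats apply: GUIDs are wider than int keys and, if used as a clustered index in SQL Server, cause page fragmentation unless you use sequential GUIDs.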
I'm a bit hazy on how we handled the versioning problem (it's been a few years!), but from memory I think we had two timestamps on each table row: one of these recorded the date/time when the row was updated at head office; the other, when it was updated at the shop. Each row also had two 'version numbers' that indicated the version of the row at head office and at the shop. Data reconciliation involved comparing these timestamps and version numbers against each other, with the most recent change 'winning' (assuming the other party hadn't changed the row of course).
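The 'most recent change wins' rule from the paragraph above can be reduced to a small comparison; field names here are assumptions, and a real implementation would also consult the version numbers to detect whether *both* sides changed the row.

```python
from datetime import datetime

# Last-writer-wins reconciliation: compare the timestamps on the two
# copies of a row and keep the more recently updated one.
def reconcile(ho_row, shop_row):
    if ho_row["updated_at"] >= shop_row["updated_at"]:
        return ho_row
    return shop_row

ho = {"id": 1, "name": "HQ price", "updated_at": datetime(2024, 1, 2)}
shop = {"id": 1, "name": "Shop price", "updated_at": datetime(2024, 1, 5)}
winner = reconcile(ho, shop)
print(winner["name"])  # Shop price
```

Note that this simple rule silently discards the losing change, which is exactly the conflict case discussed next.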
As Sergio points out, your biggest problem will be handling data reconciliation conflicts. In our case, this occurred when a shop and head office changed the same data item on the same day. We worked around this by always failing the change at the shop end, and writing a custom data reconciliation application at head office, which involved a user visually comparing and merging two conflicting versions of a data item. In theory I suppose you could automate the merging of different versions using some custom processing rules, but you would need to weigh up the cost of developing something like that versus the likelihood of conflicts arising. From memory, this never proved to be that big a problem for our system, despite there being a large number of shops (a few hundred) making changes to the same set of data. YMMV of course.
I've never done anything like that before, but it looks to me that if you go that way you might get into serious problems...
Technically I don't think that it's really that hard to implement. Basically you will have to set up a copy of the database on each client and synchronise with the server every time the client connects, but I guess you've already got that far.
I would add a bit column and a datestamp on each table at the client so I could check which records have been changed off-line. On the server side, a datestamp column recording the last update to the object will do the trick.
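A sketch of that dirty-bit plus datestamp scheme, using SQLite and an invented `customer` table purely for illustration:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    name TEXT,
    dirty INTEGER DEFAULT 0,   -- set when the row is changed off-line
    changed_at TEXT            -- when the off-line change happened
)""")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ann'), (2, 'Bob')")

# Every off-line edit flags the row and stamps the change time.
def offline_update(conn, row_id, name):
    stamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "UPDATE customer SET name = ?, dirty = 1, changed_at = ? WHERE id = ?",
        (name, stamp, row_id))

offline_update(conn, 1, "Anne")

# On reconnect, only the dirty rows need to be pushed to the server.
pending = conn.execute(
    "SELECT id, name FROM customer WHERE dirty = 1").fetchall()
print(pending)  # [(1, 'Anne')]
```

After a successful push you would clear the dirty flags in the same transaction that confirms the upload.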
As for the primary keys with auto-increment, I would lose them, because you will need to set them yourself to prevent creating two records with the same key (you might need to change them when synchronising).
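If you do keep server-assigned keys, the re-keying on synchronisation looks roughly like this. The convention of negative temporary keys, and the table shapes, are assumptions for the sketch; any foreign keys pointing at a remapped row must be rewritten in the same pass.

```python
# Clients assign temporary negative keys off-line; on sync, the server
# hands out real keys and the client rewrites all references to them.
def remap_keys(rows, children, server_next_id):
    mapping = {}
    for row in rows:
        if row["id"] < 0:                    # temporary off-line key
            mapping[row["id"]] = server_next_id
            row["id"] = server_next_id
            server_next_id += 1
    for child in children:                   # fix up foreign keys too
        if child["parent_id"] in mapping:
            child["parent_id"] = mapping[child["parent_id"]]
    return rows, children

orders = [{"id": -1}, {"id": 7}]
lines = [{"parent_id": -1, "qty": 3}]
orders, lines = remap_keys(orders, lines, server_next_id=100)
print(orders, lines)
# [{'id': 100}, {'id': 7}] [{'parent_id': 100, 'qty': 3}]
```

GUID keys (as in the answer above) avoid this remapping step entirely, which is the main argument for them in occasionally-offline designs.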
This is the easy part... Now is where things get messy... You need to take into account that this will bring you a lot of headaches... All sorts of undesired events will happen; some examples:
- Two users change the same record off-line.
- One user changes a record on-line and another off-line
- One user deletes a record on-line while another is working it off-line
The list of potential problems go on and on, before you start addressing them you must enumerate every single one and document with your clients how they expect the system to handle each case, otherwise when they loose data (and this will happen no matter what you do) it will be your fault instead of theirs.
I recommend that you build a versioning system for every table in your database that can be changed off-line. Users will mess their data and it will be nice for them to perform roll-backs.
I've done this several times now at different places (see Steve Rands' answer below) and I would strongly urge you NOT to use normal replication - especially if there are going to be several databases involved.
The reason I say this is that in my experience replication isn't smart enough to deal with the problems that can arise when you bring a remote site back online (or when you decide to add a new site to the overall network).
Replication is fine for this kind of thing if you only have 2 or 3 different databases but if you are talking about lots of different locations that can be online/offline at any time, and information can be added (or deleted or amended) at any of those locations, it won't take you long to get something into a confused state. It's not a very technically satisfying thing to say, but you will always be able to think of special cases where you wouldn't want the replication to do what it will, by design, want to do.
If you're only dealing with 2 databases then obviously the replication problems become much more straightforward and you will probably find that you can use merge replication for the job (though you have to watch your database design).
I've just bought a second-hand copy of the Apress SQL Server 2005 Replication Bible (not in the office so don't have the author to hand but it's a well-recommended, monster tome) - within the first couple of chapters I began to realise that replication is not a magic bullet solution if you're really changing data at two (or more) ends. :-)
This is usually called the briefcase model; you can use the Microsoft Synchronization Services for ADO.NET.
You should look at the Microsoft Sync Framework.
Building an occasionally offline solution yourself from scratch is a complex undertaking. In my career I have seen many good development teams mess it up. I'm not saying that you would have problems building it yourself, but why not use something that already exists? And if you find it doesn't meet your needs, you will probably have a better understanding of how to code your own solution.
The tradeoff is that you would have to learn the Sync Framework, but there are samples that you could probably leverage immediately.