What is the best way to compare large amounts of data between two different databases?

Posted 2024-10-19 05:52:11

I have a desktop application that receives data from a web service and stores it in a local PostgreSQL database (the web service itself retrieves the data from a SQL Server database). At the end of the process there will be at least 2.5 million entries in a table in my local database, but they will arrive from the web service in batches of about 300 rows at a time, over a time frame of about 15 days.

What I need is a way to make sure that my local database has the exact same information the server's database has.

I'm thinking of creating some sort of checksum for each batch received and then, after all batches have been received, another checksum of the entire table. But I don't know if this is the best solution and, if it is, I don't know where to start.

PS: TCP already handles integrity checking in transit, so I don't even know if this is needed, but it is critical that the data are the same.

Comments (2)

尤怨 2024-10-26 05:52:11

I can see how a checksum could possibly be useful, but the amount of transformation you're doing would probably make it impractical. You'd have to derive the checksum on either the original form of the data or on the transformed form; it wouldn't be valid on both.
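To make the "one form only" point concrete, here is a minimal Python sketch of a batch checksum computed on the transformed form, i.e. on rows as they are stored locally. The function names and the normalization rules (trimming whitespace, a marker for NULL) are my own assumptions, not anything from the question; the real rules depend on how the types map between SQL Server and PostgreSQL.

```python
import hashlib

NULL_MARKER = "\\N"      # assumed placeholder for NULL values
FIELD_SEPARATOR = "\x1f"  # unit separator, unlikely to occur in the data


def canonical_row(row):
    """Serialize one row (a sequence of values) into a canonical string.

    Assumption: str() yields the same text for a value on both sides.
    Dates, decimals and floats usually need explicit formatting instead.
    """
    parts = []
    for value in row:
        if value is None:
            parts.append(NULL_MARKER)
        else:
            # Trim padding such as trailing spaces from fixed-width columns.
            parts.append(str(value).strip())
    return FIELD_SEPARATOR.join(parts)


def batch_checksum(rows):
    """Return a SHA-256 hex digest that does not depend on row order."""
    row_digests = sorted(
        hashlib.sha256(canonical_row(row).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256()
    for digest in row_digests:
        combined.update(digest.encode("ascii"))
    return combined.hexdigest()
```

You would compute `batch_checksum` once on the batch after transformation and again on the same rows read back from PostgreSQL. Comparing it with a checksum produced on the SQL Server side only works if that side applies exactly the same canonical form.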

You have some strange constraints (been there myself), so it's kind of hard to come up with a clear strategy without knowing all the details. Maybe one of the following suggestions would work.

  1. A simple count(*) on the SQL Server side and on the PostgreSQL side after the migration is complete.

  2. Dump out a list of keys from the SQL Server side and from the PostgreSQL side after the migration is complete, and then sort and compare those files.

  3. If 1 and 2 aren't possible because of limited access to SQL Server, maybe dump out the results of the web service calls to a single file location as you go along, and then extract the same data from PostgreSQL at the end, and compare those files.

There are numerous tools available for comparing files if you choose options 2 or 3.
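If you would rather script the comparison than use a diff tool, options 1 and 2 can be sketched in a few lines of Python. This assumes each side has been dumped to a text file with one key per line; `read_keys` and `compare_keys` are hypothetical helper names. At 2.5 million keys, holding both lists in memory should be fine on a desktop machine.

```python
def read_keys(path):
    """Read one key per line from a dump file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def compare_keys(source_keys, local_keys):
    """Compare two key lists and report counts and differences.

    Counts are taken before de-duplication, so duplicated rows on
    either side show up as a count mismatch.
    """
    source_list = list(source_keys)
    local_list = list(local_keys)
    source_set = set(source_list)
    local_set = set(local_list)
    return {
        "source_count": len(source_list),
        "local_count": len(local_list),
        "missing_locally": sorted(source_set - local_set),
        "unexpected_locally": sorted(local_set - source_set),
    }
```

An empty `missing_locally` and `unexpected_locally` together with equal counts tells you the key sets match. It says nothing about the non-key columns, so it complements a content check rather than replacing one.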

那请放手 2024-10-26 05:52:11

Do you have control over the web service and the SQL Server database? If you do, SQL Server Change Tracking should do the trick. Change Tracking (documented on MSDN) tracks every change, or just the changes you care about, on a per-table basis. Each time you synchronize, you pass it your version number and it returns the changeset required to bring you up to date.
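To show the idea of version-based synchronization, here is a toy in-memory model in Python. It is not SQL Server's API: the `ChangeLog` class and `apply_changes` function are made up for illustration. In SQL Server itself the changes would be read with `CHANGETABLE(CHANGES ...)` and the current version with `CHANGE_TRACKING_CURRENT_VERSION()`.

```python
class ChangeLog:
    """Toy stand-in for a change-tracked table (hypothetical, in memory)."""

    def __init__(self):
        self.version = 0
        # key -> (version of the last change, row or None if deleted)
        self._last_change = {}

    def upsert(self, key, row):
        self.version += 1
        self._last_change[key] = (self.version, row)

    def delete(self, key):
        self.version += 1
        self._last_change[key] = (self.version, None)

    def changes_since(self, last_version):
        """Return the current version and the net changes after last_version."""
        changes = {
            key: row
            for key, (version, row) in self._last_change.items()
            if version > last_version
        }
        return self.version, changes


def apply_changes(local_rows, changes):
    """Apply a changeset to the local copy; None means the row was deleted."""
    for key, row in changes.items():
        if row is None:
            local_rows.pop(key, None)
        else:
            local_rows[key] = row
```

The client stores the version returned by each call and passes it back on the next synchronization, so it only ever receives the rows that changed in between.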
