如何实现快速数据库同步。有只读源吗?
我有一个源数据库(Sybase),它是只读的,您可以使用导入文件写入数据库。另一边是我自己的数据库(MSSQL),没有任何限制。
主要问题是第一个数据库上没有时间戳,并且我无权更改源数据库。那么是否有一个引擎/解决方案来实现这种同步。完毕?
I've got a source database (Sybase), which is read-only and you can write to the database with a import file. The other side is my own database (MSSQL) which has no limitations.
The main problem is that there are no timestamps on the first database and I don't have any access to change the source database. So is there a engine/solution to get this sync. done?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
diff 算法可能会起作用,但它不会很快,因为您必须为每次同步扫描整个源数据库。
基本上,您将以商定且稳定的方式进行完整的数据提取(即,两个这样的提取之间没有变化将产生相同的输出。)
然后将其与之前所做的提取进行比较,然后您可以发现所有的变化。需要比纯文本差异稍微更智能的东西,以帮助确定行不仅被删除+插入,而且实际上被更新。
不幸的是,如果没有办法询问源数据库最新的更改是什么,正如您所指出的,通过缺乏时间戳或类似的机制,那么我不知道您如何能获得比完整的更好的结果。每次都提取。
现在,我不太了解 Sybase,但在 MS SQL Server 中,您可能会创建另一个镜像第一个数据库的数据库,并且在第二个数据库中您可以进行所需的任何更改。
但是,如果您可以在 Sybase 中创建这样的数据库,并使用 SQL 同时访问这两个数据库,您也许能够运行产生差异的查询。
例如,类似以下内容:
这将生成插入或删除的行。
要查找发生更改的内容,您需要这个 WHERE 子句:
由于表是连接的,因此 WHERE 子句将过滤掉先前提取和当前状态不同的任何行。
现在,这可能也运行得不快,您必须进行测试才能确定。
A diff algorithm might work, but it wouldn't be fast, in the sense that you would have to scan the whole source database for each synchronization.
Basically you would do a full data extract, in an agreed upon, and stable, manner (ie. two such extracts with no changes between would produce identical output.)
Then you compare that to the previous extract you did, and then you can find all the changes. Something slightly more intelligent than a pure text diff would be needed, to help determine that rows weren't just deleted+inserted, but in fact updated.
Unfortunately, if there is no way to ask the source database what the latest changes are, through, as you've pointed out, lack of timestamps, or similar mechanisms, then I don't see how you can get any better than a full extract each time.
Now, I don't know Sybase that much, but in MS SQL Server you could potentially create another database that mirrors the first, and in this second database you could make whatever changes you need.
However, if you can make such a database in Sybase, and use SQL to access both at the same time, you might be able to run queries that produce the differences.
For instance, something along the lines of:
This would produce rows that are inserted or deleted.
To find those that changed, you would need this WHERE-clause:
Since the tables are joined, the WHERE-clause would filter out any rows where the previous extract and the current state is different.
Now, this might not run fast either, you would have to test to make sure.