Transactions with a drop box style integration pattern
I'm currently trying to design a robust solution using a drop box style integration pattern. Files arrive in a folder on a filesystem and I have to put their contents into an Oracle database. The problem is that the database transaction commits first, and only then is the file moved to another folder to indicate that processing is complete. Suppose the process were killed between the commit and the file move. When the process restarts, the file contents would be inserted into the database again, resulting in duplicate data. I feel there are a couple of approaches to making this robust, but I'm not sure which one to choose. I read Pat Helland's paper Life beyond Distributed Transactions: an Apostate's Opinion, but now I am facing practical implementation issues.
I could make the insertion process idempotent somehow. In other words, a second insert attempt would fail because the data would already be in the database. Unfortunately, the data has no unique identifier other than one I would give it, so I would have to assign the identifier myself: rename the file as soon as it arrives on the system so that the identifier becomes part of the file name. The insertion process would then use that identifier as part of the primary key when inserts are attempted.
I could do some sort of distributed transaction spanning the filesystem and the database, tying the file move to the same transaction as the database commit. I am using Java and I am aware of distributed transactions (XAResource), but I have never used them. This solution might include using JBoss Transactional File I/O, although there are precious few alternatives. XADisk is another off-the-shelf library that might do something similar.
I could use some other alternative I have not thought of.
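To make the first option concrete, here is a minimal sketch of the idempotent approach. It is not tied to Oracle: an in-memory map stands in for a table whose primary key is the file identifier, and with a real database that check would be a PRIMARY KEY or UNIQUE constraint enforced in the same transaction as the data rows. The class name DropBoxLoader, the method names, and the "__" separator in the file name are all hypothetical choices made for illustration, and the sketch assumes incoming file names do not already contain "__".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of an idempotent drop-box loader; every name here is hypothetical. */
final class DropBoxLoader {
    // Stand-in for a database table keyed by the file identifier.
    private final Map<String, List<String>> table = new ConcurrentHashMap<>();

    /** Gives an arriving file a stable identifier by renaming it exactly once. */
    static Path tagWithId(Path file) throws IOException {
        String name = file.getFileName().toString();
        if (name.contains("__")) {
            return file; // already tagged on an earlier run, keep the same id
        }
        Path tagged = file.resolveSibling(UUID.randomUUID() + "__" + name);
        return Files.move(file, tagged, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Extracts the identifier from a file name produced by tagWithId. */
    static String idOf(Path tagged) {
        String name = tagged.getFileName().toString();
        return name.substring(0, name.indexOf("__"));
    }

    /** Inserts the file contents under its id; returns false if already present. */
    boolean insertOnce(Path tagged) throws IOException {
        List<String> lines = Files.readAllLines(tagged);
        return table.putIfAbsent(idOf(tagged), lines) == null;
    }

    /** Inserts (if needed) and then moves the file to the done folder. */
    boolean process(Path tagged, Path doneDir) throws IOException {
        boolean inserted = insertOnce(tagged);
        Files.move(tagged, doneDir.resolve(tagged.getFileName()),
                StandardCopyOption.ATOMIC_MOVE);
        return inserted;
    }

    /** Number of files whose contents have been stored. */
    int rowCount() {
        return table.size();
    }
}
```

If the process dies after insertOnce but before the move, the restarted process calls process again on the same tagged file: the insert is rejected as a duplicate and only the move is carried out, so the data is stored once.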
Any suggestions?
Comments (1)
In case you feel like going with XA, you can give XADisk a try. I have been an active part of the XADisk project, and there are some real life projects which have solved the problem of file<->database consistency, similar to what you wrote, using XADisk.
I believe you would find XA simple, at least from the perspective of an application developer, who doesn't have to write a transaction manager for XA.
Hope that helps.
Nitin (@XADisk)