如何实现文件系统和数据库的协调?
我正在开发一个在线文件管理项目。我们将引用存储在数据库(sql server)上,将文件数据存储在文件系统上。
当我们上传文件以及删除文件时,我们面临着文件系统和数据库之间的协调问题。首先,我们在数据库中创建引用或在文件系统上存储文件。
问题是,如果我首先在数据库中创建引用,然后在文件系统上存储文件,但是在文件系统上存储文件时发生任何类型的错误,则在数据库中创建该文件的引用,但没有文件数据存在于文件系统上。
请给我一些如何处理这种情况的解决方案。我非常需要它。
我们删除文件的时候也会出现这种情况吗?
I am working on a online file management project. We are storing references on the database (sql server) and files data on the on file system.
We are facing a problem of coordination between file system and database while we are uploading a file and also in case of deleting a file. First we create a reference in the database or store files on file system.
The problem is that if I create a reference in the database first and then store a file on file system, but while storing files on the file system any type of error occur, then the reference for that file is created in the database but no file data exist on the file system.
Please give me some solution how to deal with such situation. I am badly in need of it.
This case happens also while we deleting a file?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
对文件系统的访问确实不是事务性的。您需要自己模拟一个全有或全无的分布式事务:如果数据库中的提交失败,请删除文件系统上的文件。相反,如果写入文件失败,则回滚数据库事务(这会更复杂一些,但这是一个粗略的草图)。
请注意,更新文件时可能会变得非常复杂。您需要首先复制它,这样如果您覆盖文件后数据库事务失败,您仍然可以恢复文件的旧版本。您是否要这样做取决于所需的稳健性级别。
尝试强制所有操作都通过您的应用程序(创建、写入、删除文件)。如果您不能做到这一点并且无法阻止文件在文件系统上被直接访问(并且可能被删除),我认为除了定期将数据库与文件系统同步之外别无他法:检查哪个文件被删除并删除数据库中的条目。您可以为此创建一个每 X 分钟运行一次的作业。
我还建议在数据库中存储文件的哈希(例如 MD5)。需要花一些时间来计算它,但这对我检测问题非常有用,例如,文件是否在文件系统上被错误重命名,但在数据库中却没有重命名。这还允许定期运行一些完整性检查,以验证没有任何问题。
如果这种方法还不够(例如,您希望它更加健壮),我认为除了将二进制文件存储在数据库中的 LOB 中之外,没有其他方法。那么它将真正具有交易性和安全性。
Access to the file system is indeed not transactional. You will need to simulate an all-or-nothing distributed transaction yourself: if the commit in database fails, delete the file on file-system. Inversely, if writing file fails, rollback database transaction (That will be a bit more complicated, but that's a rough sketch).
Note that it can get pretty complicated when a file is updated. You need first to copy it, so that if the database transaction fails after you've overwritten the file you can still restore the old version of the file. Whether you want to do this depends on the level of robustness that is desired.
Try to enforce that all manipulations go through your application (create, write, delete of files). If you can not do this and you can not prevent a file from being accessed directly on the file system (and maybe deleted), I see no other way than to periodically synchronize the database with the file system: check which file was removed and delete the entry in database. You could create a job that runs each X minute for that.
I would also suggest storing a hash (e.g. MD5) of the file in database. Take a bit of time to compute it, but that has been immensely useful for me to detect problems, e.g. if the file is renamed on file system by error but not in database. That also allows to run some integrity check periodically, to verify nothing was screwed.
If this approach is not sufficient (e.g. you want it to be more robust), I see no other way than to store the binary in the database in LOB. Then it will be really transactional and safe.
我知道一个老问题,但为了其他读者的利益:
根据您的操作系统,您也许能够使用事务性 TxF
http://msdn.microsoft.com/en-us/magazine/cc163388.aspx
An old question I know, but for the benefit of other readers:
Depending on your operating systems you may be able to use Transactional TxF
http://msdn.microsoft.com/en-us/magazine/cc163388.aspx
将两个事件(管理引用和管理文件)视为单个事务。如果其中一个失败,则退出另一个。那么你应该会发现很难陷入两者不同步的情况。回滚数据库操作比回滚文件系统操作更容易。
Treat the two events (managing the reference, and managing the file) as a single transaction. If either one fails, back the other one out. Then you should find it hard to get into a situation where the two are not in sync. It's easier to rollback database operations than filesystem operations.
SQL Server 2008 中引入了 FILESTREAM 来解决这个问题。
但是,它附带了自己的一组 实施挑战。
FILESTREAM was introduced in SQL Server 2008 to address this exact problem.
However, it comes with its own set of implementation challenges.