Delayed Write Errors
For the past few months, we've been losing data to Delayed Write errors. I've seen the error with both custom code and shrink-wrap applications. For example, the error message below came from Visual Studio 2008 while building a solution:
Windows - Delayed Write Failed : Windows was unable
to save all the data for the file
\Vital\Source\Other\OCHSHP\Done07\LHFTInstaller\Release\LHFAI.CAB. The
data has been lost. This error may be caused by a failure of your
computer hardware or network connection. Please try to save this file
elsewhere.
When it occurs in Adobe, Visual Studio, or Word, for example, no harm is done. The major problem is when it happens to our custom applications (straight C apps that write data in dBase files to a network share).
From the program's perspective, the write succeeds. It deletes the source data, and goes on to the next record. A few minutes later, Windows pops up an error message saying that a delayed write occurred and the data was lost.
My question has two parts. First, what can we do to help our networking/server teams isolate and correct the problem (read: convince them the problem is real, since simply telling them many, many times hasn't convinced them yet)? Second, do you have any suggestions for how we can write our code to avoid the data loss?
Comments (2)
Writes on Windows, as on any modern operating system, are not actually sent to the disk until the OS gets around to it. This is a big performance win, but the problem (as you have found) is that you cannot detect errors at the time of the write.
Every operating system that does asynchronous writes also provides mechanisms for forcing data to disk. On Windows, the FlushFileBuffers or _commit (http://msdn.microsoft.com/en-us/library/17618685) function will do the trick. (One is for HANDLEs, the other for file descriptors.) Note that you must check the return value of every disk write, and the return value of these synchronizing functions, in order to be certain the data made it to disk. Also note that these functions block and wait for the data to reach disk, even if you are writing to a network server, so they can be slow. Do not call them until you really need to push the data to stable storage.
For more, see fsync() Across Platforms.
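To tie this back to the question's "write, then delete the source" pattern, here is a minimal sketch in C. It is not the asker's code: the names `durable_write` and `move_record` and the `fd_*` macros are hypothetical, and one record per destination file is assumed purely for illustration. It uses `_commit` on Windows and falls back to `fsync` elsewhere, and it removes the source only after every write, the flush, and the close have reported success.

```c
#include <fcntl.h>
#include <stdio.h>

#ifdef _WIN32
#include <io.h>
#include <sys/stat.h>
#define fd_open(p)         _open((p), _O_WRONLY | _O_CREAT | _O_TRUNC | _O_BINARY, \
                                 _S_IREAD | _S_IWRITE)
#define fd_write(fd, b, n) _write((fd), (b), (unsigned int)(n))
#define fd_sync(fd)        _commit(fd)   /* flush this descriptor to storage */
#define fd_close(fd)       _close(fd)
#else
#include <unistd.h>
#define fd_open(p)         open((p), O_WRONLY | O_CREAT | O_TRUNC, 0644)
#define fd_write(fd, b, n) write((fd), (b), (n))
#define fd_sync(fd)        fsync(fd)     /* POSIX counterpart of _commit */
#define fd_close(fd)       close(fd)
#endif

/* Hypothetical helper: returns 0 only if every byte was written AND the
 * flush AND the close all succeeded; returns -1 on any failure.
 * Sketch limitation: on Windows a single call is assumed to stay below
 * UINT_MAX bytes because of the cast in fd_write. */
int durable_write(const char *path, const void *buf, size_t len)
{
    const char *p = (const char *)buf;
    int fd = fd_open(path);
    if (fd < 0)
        return -1;

    while (len > 0) {
        long n = (long)fd_write(fd, p, len);
        if (n <= 0) {               /* error, or no progress: give up */
            fd_close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    if (fd_sync(fd) != 0) {         /* a delayed-write failure surfaces here */
        fd_close(fd);
        return -1;
    }
    return fd_close(fd) != 0 ? -1 : 0;
}

/* Hypothetical helper: delete the source only after the destination is
 * known to be flushed. On failure the source is left untouched. */
int move_record(const char *src_path, const char *dst_path,
                const void *rec, size_t len)
{
    if (durable_write(dst_path, rec, len) != 0) {
        fprintf(stderr, "write to %s failed; keeping %s\n", dst_path, src_path);
        return -1;
    }
    return remove(src_path);
}
```

Because the flush blocks until the data has been pushed out, calling it once per record can be slow over a network share. A reasonable variation is to write a batch of records, flush once, and only then delete the whole batch of source records.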
You have a corrupted file system or a hard disk that is failing. The networking/server team should scan the disk to fix the former and detect the latter. Also check the error log to see if it tells you anything. If the error log indicates a failure to write to the hardware, then you need to replace the disk.