Java I/O over an NFS mount

Posted on 2024-08-12 06:23:11

I have a bit of Java code that outputs an XML file to an NFS-mounted filesystem. On another server, which has the same filesystem mounted as a Samba share, a process polls for new XML files every 30 seconds. If a new file is found, it is processed and then renamed as a backup file. 99% of the time, the files are written without an issue. However, every now and then the backup file contains a partially written file.

After some discussion with other people, we guessed that the process running on the external server was interfering with the Java output stream when it read the file. They suggested a common industry practice: first create a file of type .temp, then rename it to .xml once the write is complete. After the change, the rename fails every time.

Some research turned up that Java file I/O is buggy when working with NFS mounted filesystems.

Help me Java gurus! How do I solve this problem?

Here is some relevant information:

  • My process is Java 1.6.0_16 running on Solaris 10
  • Mounted filesystem is a NAS
  • Server with polling process is Windows Server 2003 R2 Standard, Service Pack 2

Here is a sample of my code:

//Write the file
XMLOutputter serializer = new XMLOutputter(Format.getPrettyFormat());
FileOutputStream os = new FileOutputStream(outputDirectory + fileName + ".temp");
serializer.output(doc, os);//doc is a constructed xml document using JDOM
os.flush();
os.close();

//Rename the file
File oldFile = new File(outputDirectory + fileName + ".temp");
File newFile = new File(fileName + ".xml");
boolean success = oldFile.renameTo(newFile);
if (!success) {
    // File was not successfully renamed.
    throw new IOException("The file " + fileName + ".temp could not be renamed.");
}//if

Comments (5)

So要识趣 2024-08-19 06:23:11

You probably have to specify the complete path in the new file name:

File newFile = new File(outputDirectory + fileName + ".xml");
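As a rough sketch of the write-then-rename pattern with both paths built from the same directory, something like the following might work. The class name TempThenRename and the method publish are hypothetical and not from the original post; the sketch also writes raw bytes instead of using JDOM's XMLOutputter, purely to stay self-contained.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper for illustration; not part of the original post.
class TempThenRename {

    /** Writes data to dir/baseName.temp, then renames it to dir/baseName.xml. */
    static File publish(File dir, String baseName, byte[] data) throws IOException {
        // Both files are resolved against the same directory, so the rename
        // never has to cross a directory or filesystem boundary.
        File tempFile = new File(dir, baseName + ".temp");
        File finalFile = new File(dir, baseName + ".xml");

        OutputStream os = new FileOutputStream(tempFile);
        try {
            os.write(data);
            os.flush();
        } finally {
            os.close(); // close before renaming so all bytes are handed to the OS
        }

        if (!tempFile.renameTo(finalFile)) {
            throw new IOException("Could not rename " + tempFile + " to " + finalFile);
        }
        return finalFile;
    }
}
```

Using the File(File parent, String child) constructor also avoids depending on whether outputDirectory ends with a path separator.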
柳絮泡泡 2024-08-19 06:23:11

This looks like a bug to me:

File oldFile = new File(outputDirectory + fileName + ".temp");
File newFile = new File(fileName + ".xml");

I would have expected this:

File oldFile = new File(outputDirectory + fileName + ".temp");
File newFile = new File(outputDirectory + fileName + ".xml");

In general, it sounds like there is a race condition between the writing of the XML file and the read/process/rename task. Can you have the read/process/rename task operate only on files that are more than 1 minute old, or something similar?
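A minimal sketch of such an age filter is shown below, assuming the polling process is written in Java (the post does not say what it is written in) and that the clocks of the poller and the file server are reasonably in sync. AgeFilter and filesOlderThan are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical poller-side helper for illustration.
class AgeFilter {

    /**
     * Returns the .xml files in dir whose last-modified time is at least
     * minAgeMillis before nowMillis. The current time is passed in so the
     * caller decides which clock to trust.
     */
    static List<File> filesOlderThan(File dir, long minAgeMillis, long nowMillis) {
        List<File> ready = new ArrayList<File>();
        File[] candidates = dir.listFiles();
        if (candidates == null) {
            return ready; // dir is not a directory, or an I/O error occurred
        }
        for (File f : candidates) {
            boolean isXml = f.isFile() && f.getName().endsWith(".xml");
            if (isXml && nowMillis - f.lastModified() >= minAgeMillis) {
                ready.add(f);
            }
        }
        return ready;
    }
}
```

A call such as AgeFilter.filesOlderThan(dir, 60000L, System.currentTimeMillis()) would then skip anything modified within the last minute.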

Or, have the Java program write an additional, empty file once it has finished writing the XML file, to signal that the XML file is complete. Only read/process/rename the XML file when the signal file is present. Then delete the signal file.
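The signal-file idea could look roughly like this. The ".done" suffix and the DoneMarker class are assumptions made for illustration; any naming convention that both sides agree on would do.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration; the ".done" suffix is an assumed convention.
class DoneMarker {
    static final String SUFFIX = ".done";

    /** The marker lives next to the XML file, e.g. order.xml -> order.xml.done. */
    static File markerFor(File xmlFile) {
        return new File(xmlFile.getParentFile(), xmlFile.getName() + SUFFIX);
    }

    /** Writer side: call only after the XML output stream has been closed. */
    static void signalComplete(File xmlFile) throws IOException {
        File marker = markerFor(xmlFile);
        // createNewFile returns false if the marker already exists, which is fine.
        if (!marker.createNewFile() && !marker.exists()) {
            throw new IOException("Could not create marker " + marker);
        }
    }

    /** Reader side: process the XML file only when its marker is present. */
    static boolean isReady(File xmlFile) {
        return xmlFile.exists() && markerFor(xmlFile).exists();
    }

    /** Reader side: remove the marker once the XML file has been processed. */
    static boolean clearMarker(File xmlFile) {
        return markerFor(xmlFile).delete();
    }
}
```

The ordering is what matters here: the marker is created strictly after the XML file is closed, so a reader that sees the marker can assume the writer has finished (subject to any client-side caching on the share).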

猥琐帝 2024-08-19 06:23:11

The original bug definitely sounds like an issue with concurrent access to the file -- your solution should have worked, but there are alternate solutions too.

For example, put a timer on your auto-read process so that when a new file is detected it records the file size, sleeps X seconds, and then, if the sizes don't match, restarts the timer. That should avoid problems with partial file transfer.
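One way to sketch that size check, again assuming a Java poller, is below. SizeStability and waitUntilStable are hypothetical names, and the choice to treat an empty file as "not yet written" is an assumption of this sketch.

```java
import java.io.File;

// Hypothetical poller-side helper for illustration.
class SizeStability {

    /**
     * Re-reads the file size every intervalMillis and returns true once two
     * consecutive readings match and are non-zero. Gives up and returns false
     * after maxChecks comparisons.
     */
    static boolean waitUntilStable(File f, long intervalMillis, int maxChecks)
            throws InterruptedException {
        long previous = f.length(); // 0 if the file does not exist
        for (int i = 0; i < maxChecks; i++) {
            Thread.sleep(intervalMillis);
            long current = f.length();
            if (current == previous && current > 0) {
                return true;
            }
            previous = current; // size changed (or still empty): restart the comparison
        }
        return false;
    }
}
```

The interval needs to be longer than any pause the writer might take mid-file, and longer than any attribute caching on the share, otherwise a stalled write can look finished.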

EDIT: or check the timestamps as per above, but make sure the file is old enough that any imprecision in the timestamp doesn't matter (say, 10 seconds to 1 minute since last modified).

Alternately, try this:

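import java.io.File;
import java.io.FileOutputStream;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

File f = new File("foo.xml");
FileOutputStream fos = new FileOutputStream(f);
FileChannel fc = fos.getChannel();
FileLock lock = fc.lock(); // blocks until an exclusive lock on the file is acquired
// ... write the file contents to fos here ...
fos.flush();
lock.release(); // release the lock before closing the stream
fos.close();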

This SHOULD use native OS file locking to prevent concurrent access by other programs (such as your XML reader daemon).

As for NFS glitches: File.renameTo is documented as possibly being unable to move a file from one filesystem to another. Could that be the source of the confusion, since the file is on an NFS filesystem?
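If a rename really does have to cross filesystems, a common workaround is to fall back to copy-and-delete when renameTo returns false. The following is only a sketch under that assumption; MoveWithFallback, move, copy, and the ".part" suffix are hypothetical names, and it uses plain java.io so that it would also compile on Java 6.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration.
class MoveWithFallback {

    /** Tries a plain rename first; if that fails, copies and then deletes the source. */
    static void move(File source, File target) throws IOException {
        if (source.renameTo(target)) {
            return;
        }
        // Copy under a temporary name in the target directory, then rename within
        // that directory, so a poller never sees a half-copied target file.
        File partial = new File(target.getParentFile(), target.getName() + ".part");
        copy(source, partial);
        if (!partial.renameTo(target)) {
            partial.delete();
            throw new IOException("Could not rename " + partial + " to " + target);
        }
        if (!source.delete()) {
            throw new IOException("Copied but could not delete " + source);
        }
    }

    /** Copies all bytes from source to target, overwriting target. */
    static void copy(File source, File target) throws IOException {
        InputStream in = new FileInputStream(source);
        try {
            OutputStream out = new FileOutputStream(target);
            try {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
    }
}
```

In the code from the question, though, the temporary and final files were meant to live in the same directory, so this fallback should not normally be needed once both paths include outputDirectory.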

柒七 2024-08-19 06:23:11

Some general information about NFS. Depending on your NFS settings, locks might not work at all, and a lot of big NFS installations are tuned for read performance, so new data might turn up later than expected because of caching effects.

I have seen cases where a file was created and data was added (and this was visible on another machine), but all data written after that appeared with a 30-second delay.

The best solution, by the way, is a rotating file scheme, in which the most recent file is assumed to be still in progress and the one before it is known to be safely written and can be read. I would not work on a single file and use it as a "pipe".

Alternatively, you can use an empty file that is written after the large file has been written and closed properly. If the small file is there, the big file is definitely done and can be read.
