I/O performance with multiple JVMs (Windows 7 affected, Linux works)
I have a program that creates a file of about 50 MB. During the process, the program frequently rewrites sections of the file and forces the changes to disk (on the order of 100 times). It uses a FileChannel and direct ByteBuffers via fc.read(...), fc.write(...) and fc.force(...).
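For reference, a minimal sketch of that access pattern (file name, buffer size, offsets and payload are invented for illustration):

    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    public class RewriteAndForce {
        public static void main(String[] args) throws Exception {
            RandomAccessFile raf = new RandomAccessFile("test.dat", "rw");
            FileChannel fc = raf.getChannel();
            try {
                ByteBuffer buf = ByteBuffer.allocateDirect(4096);
                for (int i = 0; i < 100; i++) {
                    buf.clear();
                    while (buf.hasRemaining()) buf.putLong(i); // some payload
                    buf.flip();
                    fc.write(buf, i * 4096L); // rewrite a section at a fixed offset
                    fc.force(false);          // force the change to disk (~100 times)
                }
            } finally {
                raf.close(); // closes the channel as well
            }
        }
    }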
New text:
I have a better view on the problem now.
The problem appears to be that I use three different JVMs to modify a file (one creates it, two others (launched from the first) write to it). Every JVM closes the file properly before the next JVM is started.
The problem is that the cost of fc.write() to that file occasionally goes through the roof for the third JVM (on the order of 100 times the normal cost). That is, all write operations are equally slow; it is not just one that hangs for a very long time.
Interestingly, one way to mitigate this is to insert delays (2 seconds) between the launching of the JVMs. Without the delay, writing is always slow; with the delay, the writing is slow only about every second run or so.
I also found this Stack Overflow question: How to unmap a file from memory mapped using FileChannel in java? It describes a problem with mapped files, which I'm not using.
What I suspect might be going on:
Java does not completely release the file handle when I call close(). When the next JVM is started, Java (or Windows) recognizes concurrent access to that file and installs some expensive concurrency handling for that file, which makes writing expensive.
Would that make sense?
The problem occurs on Windows 7 (Java 6 and 7, tested on two machines), but not under Linux (SuSE 11.3 64).
Old text:
The problem:
Starting the program as a JUnit test harness from Eclipse or from the console works fine; it takes around 3 seconds.
Starting the program through an ant task (or through JUnit by kicking off a separate JVM using a ProcessBuilder) slows the program down to 70-80 seconds for the same task (a factor of 20-30).
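For reference, the separate JVM is launched roughly like this (a sketch; the main class name com.example.FileRewriter is a placeholder, and the Thread.sleep(2000) is the workaround described in the new text above):

    import java.io.File;

    public class Launcher {
        public static void main(String[] args) throws Exception {
            String java = System.getProperty("java.home")
                    + File.separator + "bin" + File.separator + "java";
            ProcessBuilder pb = new ProcessBuilder(
                    java, "-cp", System.getProperty("java.class.path"),
                    "com.example.FileRewriter"); // placeholder main class
            pb.inheritIO();                      // Java 7+; forward child output
            Process p = pb.start();
            p.waitFor();
            Thread.sleep(2000); // the 2-second delay that masks the slowdown
        }
    }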
Using -Xprof reveals that the usage of 'force0' and 'pwrite0' goes through the roof, from 34.1% (76+20 ticks) to 97.3% (3587+2913+751 ticks):
Fast run:
27.0% 0 + 76 sun.nio.ch.FileChannelImpl.force0
7.1% 0 + 20 sun.nio.ch.FileDispatcher.pwrite0
[..]
Slow run:
Interpreted + native Method
48.1% 0 + 3587 sun.nio.ch.FileDispatcher.pwrite0
39.1% 0 + 2913 sun.nio.ch.FileChannelImpl.force0
[..]
Stub + native Method
10.1% 0 + 751 sun.nio.ch.FileDispatcher.pwrite0
[..]
GC and compilation are negligible.
More facts:
- No other methods show a significant change in the -Xprof output.
- It's either fast or very slow, never something in-between.
- Memory is not a problem, all test machines have at least 8GB, the process uses <200MB
- Rebooting the machine does not help
- Switching off virus scanners and similar things has no effect
- When the process is slow, there is virtually no CPU usage
- It is never slow when running it from a normal JVM
- It is pretty consistently slow when running it in a JVM that was started from the first JVM (via ProcessBuilder or as an ant task)
- All JVMs are exactly the same. I output System.getProperty("java.home") and the JVM options via RuntimeMXBean (see the cleaned-up snippet after this list)
- I tested it on two machines with Windows 7 64-bit, Java 7u2, Java 6u26 and JRockit; the hardware of the machines differs, but the results are very similar
- I also tested it from outside Eclipse (command-line ant), but there was no difference
- The whole program is written by myself; all it does is read from and write to this file. No other libraries are used, especially no native libraries.
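The snippet referenced in the list above, cleaned up (generics added; otherwise it is exactly the check from the list item):

    import java.lang.management.ManagementFactory;
    import java.lang.management.RuntimeMXBean;
    import java.util.List;

    public class JvmInfo {
        public static void main(String[] args) {
            // Print enough information to verify that parent and nested JVMs match.
            RuntimeMXBean runtimeMxBean = ManagementFactory.getRuntimeMXBean();
            List<String> arguments = runtimeMxBean.getInputArguments();
            System.out.println("java.home = " + System.getProperty("java.home"));
            System.out.println("JVM args  = " + arguments);
        }
    }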
And some scary facts that I just refuse to believe make any sense:
- Removing all class files and rebuilding the project sometimes (rarely) helps. The program (nested version) runs fast one or two times before becoming extremely slow again.
- Installing a new JVM always helps (every single time!) such that the (nested) program runs fast at least once! Installing a JDK counts as two because both the JDK-jre and the JRE-jre work fine at least once. Overinstalling a JVM does not help. Neither does rebooting. I haven't tried deleting/rebooting/reinstalling yet ...
- These are the only two ways I ever managed to get fast program runtimes for the nested program.
Questions:
- What may cause this performance drop for nested JVMs?
- What exactly do these methods do (pwrite0/force0)?
Comments (2)
Are you using local disks for all testing (as opposed to any network share)?
Can you set up Windows with a RAM drive to store the data? When a JVM terminates, its file handles will have been closed by default, but what you might be seeing is the flushing of the data to disk. When you overwrite lots of data, the previous version of the data is discarded and may not cause disk IO. The act of closing the file might make the Windows kernel implicitly flush data to disk. So using a RAM drive would let you confirm that, since disk IO time would be removed from your stats.
Find a tool for Windows that allows you to force the kernel to flush all buffers to disk, use it between JVM runs, and see how long that takes.
But I would guess you are hitting some interaction between the demands of the process and the demands of the kernel as it attempts to manage the disk block buffer cache. On Linux there is a tool like "/sbin/blockdev --flushbufs" that can do this.
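For instance, something like this between the runs (assuming the Sysinternals Sync utility is installed and on the PATH; untested on my side):

    public class FlushBetweenRuns {
        public static void main(String[] args) throws Exception {
            // Ask Windows to flush all file-system buffers to disk,
            // assuming the Sysinternals "sync" tool is on the PATH.
            Process sync = new ProcessBuilder("sync").inheritIO().start();
            System.exit(sync.waitFor());
        }
    }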
FWIW
"pwrite" is a Linux/Unix API for allowing concurrent writing to a file descriptor (which would be the best kernel syscall API to use for the JVM, I think Win32 API already has provision for the same kinds of usage to share a file handle between threads in a process, but since Sun have Unix heritige things get named after the Unix way). Google "pwrite(2)" for more info on this API.
"force" I would guess that is a file system sync, meaning the process is requesting the kernel to flush unwritten data (that is currently in disk block buffer cache) into the file on the disk (such as would be needed before you turned your computer off). This action will happen automatically over time, but transactional systems require to know when the data previously written (with pwrite) has actually hit the physical disk and is stored. Because some other disk IO is dependant on knowing that, such as with transactional checkpointing.
One thing that could help is making sure you explicitly set the FileChannel to null. Then call System.runFinalization() and maybe System.gc() at the end of the program. You may need more than one call. System.runFinalizersOnExit(true) may also help, but it's deprecated, so you will have to deal with the compiler warnings.
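A minimal sketch of that cleanup sequence (only the shutdown tail matters; whether it actually helps in this case is exactly what is in question):

    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;

    public class ExplicitCleanup {
        private static FileChannel fc;

        public static void main(String[] args) throws Exception {
            RandomAccessFile raf = new RandomAccessFile("test.dat", "rw");
            fc = raf.getChannel();
            // ... read/write/force as in the question ...
            fc.close();
            raf.close();
            fc = null;                // drop the last reference explicitly
            System.runFinalization(); // ask the VM to run pending finalizers
            System.gc();              // may need more than one call
        }
    }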