How to avoid caching effects in read benchmarks
I have a read benchmark, and between consecutive runs I have to make sure that the data does not reside in memory, to avoid effects due to caching. So far, what I have done is run a program that writes a large file between consecutive runs of the read benchmark. Something like:
./read_benchmark
./write --size 64G --path /tmp/test.out
./read_benchmark
The write program simply writes an array of size 1G 64 times to file. Since the size of the main memory is 64G, I write a file that is approx. the same size. The problem is that writing takes a long time and I was wondering if there are better ways to do this, i.e. avoid effects seen when data is cached.
Also, what happens if I write data to /dev/null?
./write --size 64G --path /dev/null
This way, the write program exits very fast and no I/O is actually performed, but I am not sure whether it overwrites 64G of main memory, which is what I ultimately want.
Your input is greatly appreciated.
Comments (5)
You can drop all caches by writing to a special file in /proc. That should make sure the cache does not affect the benchmark.
You can just unmount the filesystem and mount it back. Unmounting flushes and drops the cache for the filesystem.
Use
echo 3 > /proc/sys/vm/drop_caches
to flush the page cache, directory entry cache, and inode cache.
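For scripting this step between benchmark runs, the same idea can be sketched in Python. This is only a sketch, assuming Linux and root privileges; the function name drop_caches and its path parameter are made up for illustration. It calls sync first because the kernel only drops clean pages, so dirty data has to be written back before it can be discarded.

```python
import os

# Linux only; writing to this file normally requires root.
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_caches(mode: int = 3, path: str = DROP_CACHES_PATH) -> None:
    """Flush dirty pages, then ask the kernel to drop its clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    The path parameter is a hypothetical hook so the function can be
    pointed at an ordinary file for a dry run.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    os.sync()  # write dirty pages back; drop_caches only frees clean ones
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Calling drop_caches() between two runs of ./read_benchmark would then replace the 64G write.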
You can use fadvise calls with FADV_DONTNEED to tell the kernel to drop specific files from the page cache. You can also use mincore() to verify that the file is not cached. While the drop_caches solution is clearly simpler, this might be better than wiping out the entire cache, as that affects all processes on the box. I don't think you need elevated privileges to use fadvise, while I bet you do for writing to /proc. Here is a good example of how to use fadvise calls for this purpose: http://insights.oetiker.ch/linux/fadvise/
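A minimal Python sketch of the fadvise approach, assuming Linux and Python 3.3 or later, where os.posix_fadvise wraps posix_fadvise(2). The helper name evict_from_page_cache is hypothetical. The call is only advice and the kernel may keep some pages, so checking with mincore() afterwards is still worthwhile.

```python
import os


def evict_from_page_cache(path: str) -> None:
    """Ask the kernel to drop the cached pages of a single file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush the file first.
        os.fsync(fd)
        # offset 0 with length 0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Calling this on each input file of the benchmark before every run leaves the rest of the machine's cache alone.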
One (crude) way that almost never fails is to simply occupy all that excess memory with another program.
Make a trivial program that allocates nearly all the free memory (while leaving enough for your benchmark app). Then memset() the memory to something to ensure that the OS will commit it to physical memory. Finally, do a scanf() to halt the program without terminating it.
Once all the excess memory is "hogged", the OS won't be able to use it as cache. This works on both Linux and Windows. Now you can proceed to do your I/O benchmark.
(Though this might not go well if you're sharing the machine with other users...)
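The memory-hog idea can be sketched in Python instead of C. This is an illustration, not the answer's original program: hog_memory is a made-up name, and writing one byte per page stands in for memset(), since a freshly allocated zeroed buffer may not be backed by physical memory until it is touched.

```python
import mmap


def hog_memory(n_bytes: int) -> bytearray:
    """Allocate n_bytes and touch every page so the OS has to back it with RAM."""
    buf = bytearray(n_bytes)  # zero-filled; pages may not be committed yet
    for offset in range(0, n_bytes, mmap.PAGESIZE):
        buf[offset] = 1  # one write per page forces the page to be committed
    return buf
```

To use it, call something like hog = hog_memory(56 * 2**30) in an interactive session or a script that then waits on input(), run the benchmark in another terminal, and exit the hog afterwards. The 56 GiB figure is only an example for a 64G machine; pick a size that still leaves room for the benchmark itself. If swap is enabled, the OS may swap the hog out instead of shrinking its cache.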