Allocating the largest possible buffer without using swap
In C/C++ under Linux, I need to allocate a large (several gigabyte) block of memory in order to store real-time data from a sensor that is connected to the Ethernet port and streams data at about 110MB/s. I'd like to allocate the largest amount of memory possible, to maximise the length of the data sequence that I can store. However, I also need to make sure that there will be no disk swapping, since the resulting delay and the limited bandwidth of disk access would cause the sensor's (very limited) buffer to overflow.
What is the best way to determine how much memory to allocate? Am I limited to just allocating a slightly smaller block than the reported free memory, or can I interface more directly with the Linux virtual memory manager?
Comments (4)
Well, under Linux you can use mlock()/mlockall() to keep an address range in physical memory and prevent it from being swapped out. The process calling mlock() needs the appropriate privileges to do so (the CAP_IPC_LOCK capability or a sufficiently high RLIMIT_MEMLOCK); "man mlock" has the details. I am not sure about the maximum mlock'able block (it might differ from what seems to be "free"), so a binary search could help: lock a range, and if that fails, reduce the size of the area, and so on.
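The binary search suggested above could be sketched roughly as follows. This is only an illustration: the helper names try_lock() and lock_largest() are made up for this example, and it assumes an anonymous mmap() as the backing allocation. Each probe faults in the whole range, so searching over several gigabytes will be slow, and the result can change between the search and the final lock if other processes allocate memory in the meantime.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `size` bytes of anonymous memory and lock them into RAM.
 * Returns the mapping, or NULL if either step fails. */
void *try_lock(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (mlock(p, size) != 0) {   /* e.g. ENOMEM, EPERM, EAGAIN */
        munmap(p, size);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: binary-search the largest lockable size, in whole
 * pages, that does not exceed max_size. Returns a locked buffer and stores
 * its size in *got, or returns NULL with *got == 0. */
void *lock_largest(size_t max_size, size_t *got)
{
    long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    size_t page = (size_t)page_l;

    size_t lo = 1, hi = max_size / page;   /* search range, in pages */
    size_t best = 0;

    while (lo <= hi) {
        size_t mid = lo + (hi - lo) / 2;
        void *p = try_lock(mid * page);
        if (p != NULL) {
            /* Release the probe so the next, larger one is not
             * counted on top of it against the lock limit. */
            munmap(p, mid * page);
            best = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;   /* mid >= 1 here, so this cannot wrap */
        }
    }

    *got = 0;
    if (best == 0)
        return NULL;

    void *buf = try_lock(best * page);     /* take the final buffer */
    if (buf != NULL)
        *got = best * page;
    return buf;
}
```

A caller would pass an upper bound such as the installed RAM size and release the result with munmap(buf, got) when done.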
On the other hand, 110MB/s is not really a problem for a solid-state drive. A 60GB SSD with a 280MB/s write speed costs about $200 at the corner shop. Just copy the sensor data into a small write buffer and stream that to the SSD.
If the computer system is dedicated to receiving data from your sensor, you can simply disable swap. Then allocate as big a buffer as you can, leaving only enough memory in the system for essential tools.
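As a rough sketch of how the buffer size could be chosen on such a dedicated machine, the snippet below reads the free RAM and the configured swap size with sysinfo(2) and subtracts a reserve. The function names and the idea of a fixed reserve are assumptions made for illustration; the right reserve has to be found by experiment. Note that freeram does not count reclaimable page cache, so the figure is on the conservative side.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/sysinfo.h>

/* Bytes left for the buffer after keeping `reserve` bytes for the rest
 * of the system; 0 if the reserve alone exceeds what is available. */
uint64_t buffer_budget(uint64_t available, uint64_t reserve)
{
    return available > reserve ? available - reserve : 0;
}

/* Hypothetical helper: report free RAM and total swap in bytes.
 * Returns 0 on success, -1 if sysinfo() fails. */
int query_memory(uint64_t *free_ram, uint64_t *total_swap)
{
    struct sysinfo si;
    if (sysinfo(&si) != 0)
        return -1;
    assert(si.mem_unit > 0);
    *free_ram   = (uint64_t)si.freeram   * si.mem_unit;
    *total_swap = (uint64_t)si.totalswap * si.mem_unit;
    return 0;
}
```

After swap has been disabled (for example with swapoff -a), total_swap should read 0, and buffer_budget(free_ram, reserve) gives a starting point for the allocation size.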
If you malloc() the needed amount of memory and write to it at that speed, you'll still get a performance hit due to all the page faults (i.e. mapping each page of virtual memory to physical memory, which may also include swapping out memory of other processes).
In order to avoid that, you could memset() the entire allocated buffer to 0 before you start reading from the sensor, so that all the needed virtual memory is mapped to physical memory.
If you only use the available physical memory, you should suffer no swapping at all. Using more would cause memory of other processes to be swapped to the disk. If these processes are idle, it shouldn't pose any problem. If they're active (i.e. using their memory once in a while), some swapping would occur, probably at a much lower rate than the hard-drive bandwidth. The more memory you use, the more memory of active processes would be swapped out, and the more disk activity would occur. At this point, the maximal amount of memory you could use with decent performance is pretty much a result of trial and error.
By using more than the physical memory available, you'll definitely cause swapping at the rate of memory writes, and there's no way to avoid that.
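A minimal sketch of the pre-faulting step described above, with one deliberate variation: instead of memset(), it writes one byte per page through a volatile pointer. The reason is that some optimizing compilers may merge a malloc() followed by a zeroing memset() into a single calloc(), which can leave the pages untouched. The function name alloc_prefaulted() is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate `size` bytes and write to every page once, so each page is
 * backed by physical memory before the real-time capture starts.
 * Returns NULL if the allocation fails. */
unsigned char *alloc_prefaulted(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    unsigned char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    /* volatile keeps the compiler from dropping the stores */
    volatile unsigned char *p = buf;
    for (size_t off = 0; off < size; off += (size_t)page)
        p[off] = 0;
    if (size > 0)
        p[size - 1] = 0;   /* buffer may not be page-aligned: touch the tail too */

    return buf;
}
```

Pre-faulting only maps the pages; it does not stop the kernel from swapping them out later, which is what mlock() is for.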
Due to how virtual memory is used, and to non-swappable kernel memory, it is nearly impossible to determine how much of the installed memory can be accessed by an application.
The best I can come up with is to allow the user to configure how much memory to use for buffering.
Reported free memory is not really "free physical memory." Unfortunately.
Interfacing more directly with the memory manager can be done by using a custom device driver, allocating memory directly in kernel space and providing access to it via mmap(). This is generally not recommended, yet it would work in specialized cases such as yours.
At the pace of Linux kernel development, knowledge becomes obsolete quite fast, so take what I'm saying here with a grain of salt. You can try to play with the following:
SysV shared memory. It is generally not swapped. See man shmget.
tmpfs, an in-memory file system. The memory was pinned to RAM at least in early 2.6 kernels and thus was not swappable. To use it as memory, create a file on tmpfs, write() something into the file (to force the memory to actually be allocated) and then mmap() the file.
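The tmpfs steps (create a file, write() to it, mmap() it) might look like the sketch below. The function name map_file_buffer() and the path passed to it are hypothetical; on many Linux systems /dev/shm is a tmpfs mount, but that is an assumption to check on the target machine. On current kernels tmpfs pages can be swapped out, so calling mlock() on the returned mapping is the safer assumption if swap is enabled.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a file at `path`, fill it with `size` zero bytes so that the
 * backing memory is actually allocated, and map it read/write.
 * Returns the mapping, or NULL on any failure. */
void *map_file_buffer(const char *path, size_t size)
{
    assert(size > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;

    char zeros[4096];
    memset(zeros, 0, sizeof zeros);

    size_t done = 0;
    while (done < size) {
        size_t left = size - done;
        size_t chunk = left < sizeof zeros ? left : sizeof zeros;
        ssize_t n = write(fd, zeros, chunk);
        if (n <= 0) {              /* treat any short-circuit as failure */
            close(fd);
            unlink(path);
            return NULL;
        }
        done += (size_t)n;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping keeps the file contents alive after close and unlink. */
    close(fd);
    unlink(path);

    return p == MAP_FAILED ? NULL : p;
}
```

For example, map_file_buffer("/dev/shm/sensor.buf", size) would place the buffer on tmpfs where /dev/shm is mounted that way; release it with munmap() when the capture is finished.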