Bursty writes to SD/USB stalling my time-critical apps on embedded Linux
I'm working on an embedded Linux project that interfaces an ARM9 to a hardware video encoder chip, and writes the video out to SD card or USB stick. The software architecture involves a kernel driver that reads data into a pool of buffers, and a userland app that writes the data to a file on the mounted removable device.
I am finding that above a certain data rate (around 750kbyte/sec) I start to see the userland video-writing app stalling for maybe half a second, about every 5 seconds. This is enough to cause the kernel driver to run out of buffers - and even if I could increase the number of buffers, the video data has to be synchronised (ideally within 40ms) with other things that are going on in real time. Between these 5 second "lag spikes", the writes complete well within 40ms (as far as the app is concerned - I appreciate they're buffered by the OS).
I think this lag spike is to do with the way Linux flushes data out to disk - I note that pdflush is designed to wake up every 5s, and my understanding is that this is what does the writing. As soon as the stall is over, the userland app is able to quickly service and write the backlog of buffers (which didn't overflow).
I think the device I'm writing to has reasonable ultimate throughput: copying a 15MB file from a memory fs and waiting for sync to complete (and the usb stick's light to stop flashing) gave me a write speed of around 2.7MBytes/sec.
I'm looking for two kinds of clues:
How can I stop the bursty writing from stalling my app - perhaps process priorities, realtime patches, or tuning the filesystem code to write continuously rather than burstily?
How can I make my app(s) aware of what is going on with the filesystem in terms of write backlog and throughput to the card/stick? I have the ability to change the video bitrate in the hardware codec on the fly which would be much better than dropping frames, or imposing an artificial cap on maximum allowed bitrate.
Some more info: this is a 200MHz ARM9 currently running a Montavista 2.6.10-based kernel.
Updates:
- Mounting the filesystem sync makes throughput much too poor.
- The removable media must be FAT/FAT32 formatted, because the point of the design is that the media can be plugged into any Windows PC and read.
- Regularly calling sync() or fsync(), say every second, causes regular stalls and unacceptably poor throughput.
- I am using write() and open(O_WRONLY | O_CREAT | O_TRUNC) rather than fopen() etc.
- I can't immediately find anything online about the mentioned "Linux realtime filesystems". Links?
I hope this makes sense. First embedded Linux question on stackoverflow? :)
10 Answers
Doing your own flush()ing sounds right to me - you want to be in control, not leave it to the vagaries of the generic buffer layer.
This may be obvious, but make sure you're not calling write() too often - make sure every write() has enough data to be written to make the syscall overhead worth it. Also, in the other direction, don't call it too seldom, or it'll block for long enough to cause a problem.
On a more difficult-to-reimplement track, have you tried switching to asynchronous I/O? Using aio you could fire off a write and hand it one set of buffers while you're pulling video data into the other set, and when the write finishes you switch sets of buffers.
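A minimal sketch of that double-buffered AIO idea, assuming POSIX AIO (<aio.h>, link with -lrt; glibc implements it with threads under the hood, so it is effectively a portable flavour of the writer-thread designs below). The buffer size, output path and fill_with_video() helper are placeholders, not anything from the question:

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <string.h>

    #define BUF_SIZE (256 * 1024)           /* placeholder chunk size */

    static char bufs[2][BUF_SIZE];

    /* Hypothetical: blocks until a buffer's worth of video has arrived. */
    extern size_t fill_with_video(char *buf, size_t len);

    int main(void)
    {
        int fd = open("/mnt/stick/video.dat",
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
        struct aiocb cb;
        const struct aiocb *list[1];
        int filling = 0;                    /* buffer being filled now */
        int pending = 0;                    /* is a write in flight? */
        off_t offset = 0;

        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        list[0] = &cb;

        for (;;) {
            size_t n = fill_with_video(bufs[filling], BUF_SIZE);

            if (pending) {                  /* reclaim the other buffer */
                while (aio_error(&cb) == EINPROGRESS)
                    aio_suspend(list, 1, NULL);
                aio_return(&cb);            /* error handling omitted */
            }

            cb.aio_buf = bufs[filling];     /* fire off this buffer... */
            cb.aio_nbytes = n;
            cb.aio_offset = offset;
            aio_write(&cb);
            pending = 1;

            offset += n;
            filling = 1 - filling;          /* ...and fill the other */
        }
    }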
Obvious first: have you tried explicitly telling the file to flush? I also think there might be some ioctl you could use to do it, but honestly I haven't done much C/POSIX file programming.
Seeing as you're on a Linux kernel, you should be able to tune and rebuild the kernel to something that suits your needs better, e.g. much more frequent, but then also smaller, flushes to the permanent storage.
A quick check of my man pages points at fsync(2) for explicitly flushing a single file's data.
A useful Linux function and alternative to sync or fsync is sync_file_range. This lets you schedule data for writing without waiting for the in-kernel buffer system to get around to it.
To avoid long pauses, make sure your IO queue (for example: /sys/block/hda/queue/nr_requests) is large enough. That queue is where data goes in between being flushed from memory and arriving on disk.
Note that sync_file_range isn't portable, and is only available in kernels 2.6.17 and later.
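For what it's worth, a common pattern with sync_file_range is to start writeback of each chunk as soon as it is written and then wait on the chunk before it, so the dirty backlog stays bounded to roughly two chunks. A sketch (needs _GNU_SOURCE; the 1MB chunk size is an arbitrary choice):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    #define CHUNK (1024 * 1024)             /* arbitrary chunk size */

    /* Call once per CHUNK-sized write; *off tracks the file position. */
    void write_chunk(int fd, const void *buf, off_t *off)
    {
        write(fd, buf, CHUNK);              /* error handling omitted */

        /* Start writeback of this chunk without waiting for it. */
        sync_file_range(fd, *off, CHUNK, SYNC_FILE_RANGE_WRITE);

        /* Block until the previous chunk is on the device, bounding
           the backlog to about two chunks in flight. */
        if (*off >= CHUNK)
            sync_file_range(fd, *off - CHUNK, CHUNK,
                            SYNC_FILE_RANGE_WAIT_BEFORE |
                            SYNC_FILE_RANGE_WRITE |
                            SYNC_FILE_RANGE_WAIT_AFTER);
        *off += CHUNK;
    }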
I've been told that after the host sends a command, MMC and SD cards "must respond within 0 to 8 bytes".
However, the spec allows these cards to respond with "busy" until they have finished the operation, and apparently there is no limit to how long a card can claim to be busy (please, please tell me if there is such a limit).
I see that some low-cost flash chips such as the M25P80 have a guaranteed "maximum single-sector erase time" of 3 seconds, although typically it "only" requires 0.6 seconds.
That 0.6 seconds sounds suspiciously similar to your "stalling for maybe half a second".
I suspect the tradeoff between cheap, slow flash chips and expensive, fast flash chips has something to do with the wide variation in USB flash drive results.
I've heard rumors that every time a flash sector is erased and then re-programmed, it takes a little bit longer than the last time.
So if you have a time-critical application, you may need to (a) test your SD cards and USB sticks to make sure they meet the minimum latency, bandwidth, etc. required by your application, and (b) periodically re-test or pre-emptively replace these memory devices.
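For (a), a rough-and-ready bench could look like this sketch: write a large file in app-sized chunks, track the worst single write() latency, and fsync() at the end so the throughput figure includes device time. The file path and sizes are placeholders:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define CHUNK (32 * 1024)
    #define TOTAL (32 * 1024 * 1024)

    int main(void)
    {
        static char buf[CHUNK];
        int fd = open("/mnt/stick/bench.dat",
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
        long worst_us = 0, total = 0;
        struct timeval t0, t1, start, end;

        memset(buf, 0xA5, sizeof buf);      /* non-trivial test pattern */
        gettimeofday(&start, NULL);
        while (total < TOTAL) {
            gettimeofday(&t0, NULL);
            write(fd, buf, CHUNK);          /* error handling omitted */
            gettimeofday(&t1, NULL);
            long us = (t1.tv_sec - t0.tv_sec) * 1000000L
                    + (t1.tv_usec - t0.tv_usec);
            if (us > worst_us)
                worst_us = us;
            total += CHUNK;
        }
        fsync(fd);          /* count device time in the bandwidth figure */
        gettimeofday(&end, NULL);
        close(fd);

        long total_us = (end.tv_sec - start.tv_sec) * 1000000L
                      + (end.tv_usec - start.tv_usec);
        printf("worst write(): %ld ms, throughput: %.2f MB/s\n",
               worst_us / 1000, (double)total / total_us);
        return 0;
    }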
Without knowing more about your particular circumstances, I can only offer the following guesses:
Try using fsync()/sync() to force the kernel to flush data to the storage device more frequently. It sounds like the kernel buffers all your writes and then ties up the bus or otherwise stalls your system while performing the actual write. With careful calls to fsync() you can try to schedule writes over the system bus in a more fine grained way.
It might make sense to structure the application so that the encoding/capture task (you didn't mention video capture, so I'm making an assumption here - you might want to add more information) runs in its own thread and buffers its output in userland; a second thread can then handle writing to the device. This gives you a smoothing buffer that lets the encoder always finish its writes without blocking.
One thing that sounds suspicious is that you only see this problem at a certain data rate - if this really was a buffering issue, I'd expect the problem to happen less frequently at lower data rates, but I'd still expect to see this issue.
In any case, more information might prove useful. What's your system's architecture? (In very general terms.)
Given the additional information you provided, it sounds like the device's throughput is rather poor for small writes and frequent flushes. If you're sure that you can get sufficient throughput with larger writes (and I'm not sure that's the case; the filesystem might be doing something stupid, like updating the FAT after every write), then having an encoding thread pipe data to a writing thread, with enough buffering in the writing thread to avoid stalls, should do it. I've used shared-memory ring buffers in the past to implement this kind of scheme, but any IPC mechanism that allows the writer to write to the I/O process without stalling unless the buffer is full should do the trick.
As a debugging aid, you could use strace to see which operations are taking the time.
There might be some surprising behaviour with FAT/FAT32.
Do you write into a single file, or into multiple files?
You could make a reading thread that maintains a pool of video buffers ready to be written, in a queue. When a frame is received, it is added to the queue and the writing thread is signalled. A sketch of the shared data, the reading thread, and the writing thread follows below.
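Something along these lines, using pthreads; the frame size, pool depth, and the two helper functions are placeholders:

    #include <pthread.h>

    #define NBUF 10
    #define FRAME_SIZE (256 * 1024)

    /* Shared data: a ring of filled buffers plus a mutex/condvar pair. */
    static char pool[NBUF][FRAME_SIZE];
    static int head = 0, tail = 0, count = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    /* Hypothetical helpers for the driver and the output file. */
    extern void read_frame_from_driver(char *buf);
    extern void write_frame_to_file(const char *buf);

    /* Reading thread: pull frames from the driver, queue them, signal. */
    void *reading_thread(void *arg)
    {
        for (;;) {
            read_frame_from_driver(pool[head]);
            pthread_mutex_lock(&lock);
            head = (head + 1) % NBUF;
            count++;                      /* overflow handling omitted */
            pthread_cond_signal(&nonempty);
            pthread_mutex_unlock(&lock);
        }
    }

    /* Writing thread: wait for queued frames and write them out; only
       this thread ever blocks in write(). */
    void *writing_thread(void *arg)
    {
        for (;;) {
            pthread_mutex_lock(&lock);
            while (count == 0)
                pthread_cond_wait(&nonempty, &lock);
            pthread_mutex_unlock(&lock);

            write_frame_to_file(pool[tail]);

            pthread_mutex_lock(&lock);
            tail = (tail + 1) % NBUF;
            count--;
            pthread_mutex_unlock(&lock);
        }
    }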
If your writing thread is blocked waiting for the kernel, this could work.
However, if you are blocked inside kernel space, then there is not much you can do, except look for a more recent kernel than your 2.6.10.
Sounds like you're looking for Linux realtime filesystems. Be sure to search Google et al. for that.
XFS has a realtime option, though I haven't played with it.
hdparm might let you turn off the caching altogether.
Tuning the filesystem options (turn off all the extra unneeded file attributes) might reduce what you need to flush, thus speeding the flush. I doubt that'd help much, though.
But my suggestion would be to avoid using the stick as a filesystem at all and instead use it as a raw device. Stuff data on it like you would using 'dd'. Then elsewhere read that raw data and write it out after baking.
Of course, I don't know if that's an option for you.
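In case it's useful, the raw-device variant might look something like this sketch; the device node and fill_block() helper are placeholders, and writing to the node destroys any filesystem on the stick:

    #include <fcntl.h>
    #include <unistd.h>

    #define BLOCK (64 * 1024)               /* large blocks suit flash */

    /* Hypothetical: blocks until a block's worth of video is ready. */
    extern void fill_block(char *buf, size_t len);

    void stream_to_raw_device(const char *dev)  /* e.g. "/dev/sda" */
    {
        static char buf[BLOCK];
        int fd = open(dev, O_WRONLY);       /* O_DIRECT is another option */

        for (;;) {
            fill_block(buf, BLOCK);
            if (write(fd, buf, BLOCK) != BLOCK)
                break;                      /* device full or error */
        }
        close(fd);
    }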
Here is some information about tuning pdflush for write-heavy operations.
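The relevant knobs live under /proc/sys/vm. A sketch of setting them from C at startup, to make pdflush wake more often but write less each time; the values here are illustrative guesses, not tuned numbers:

    #include <stdio.h>

    static void set_sysctl(const char *path, const char *val)
    {
        FILE *f = fopen(path, "w");
        if (f) { fputs(val, f); fclose(f); }
    }

    void tune_pdflush(void)
    {
        /* Wake pdflush every 0.5s instead of the default 5s. */
        set_sysctl("/proc/sys/vm/dirty_writeback_centisecs", "50");
        /* Consider dirty data flushable after 1s. */
        set_sysctl("/proc/sys/vm/dirty_expire_centisecs", "100");
        /* Start background writeback at a small fraction of RAM. */
        set_sysctl("/proc/sys/vm/dirty_background_ratio", "2");
    }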
I'll throw out some suggestions, advice is cheap.
- Instead of fopen, fread, fwrite, use the lower level functions open, read, write.
- Try the O_SYNC flag when you open the file; this will cause each write operation to block until written to disk, which will remove the bursty behavior of your writes... with the expense of each write being slower (see the sketch after this list).
- Keep an eye on the copy_to_user calls when transferring video data buffers from kernel space to user space.
Just a couple thoughts, hope this helps.
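A minimal sketch of that O_SYNC open; the path is a placeholder:

    #include <fcntl.h>

    /* Each subsequent write() blocks until the data reaches the device,
       trading throughput for bounded buffering. */
    int open_video_file(const char *path)
    {
        return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    }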
For the record, there turned out to be two main aspects that seem to have eliminated the problem in all but the most extreme cases. This system is still in development and hasn't been thoroughly torture-tested yet but is working fairly well (touch wood).
The big win came from making the userland writer app multi-threaded. It is the calls to write() that block sometimes; other processes and threads still run. So long as I have a thread servicing the device driver and updating frame counts and other data to synchronise with other apps that are running, the data can be buffered and written out a few seconds later without breaking any deadlines. I tried a simple ping-pong double buffer first but that wasn't enough; small buffers would be overwhelmed and big ones just caused bigger pauses while the filesystem digested the writes. A pool of ten 1MB buffers queued between threads is working well now.
The other aspect is keeping an eye on ultimate write throughput to physical media. For this I am watching the Dirty: figure reported by /proc/meminfo. I have some rough-and-ready code to throttle the encoder if Dirty: climbs above a certain threshold, which seems to vaguely work. More testing and tuning is needed later. Fortunately I have lots of RAM (128M) to play with, giving me a few seconds to see my backlog building up and throttle down smoothly.
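Reading that figure could look something like this sketch; the actual thresholding and codec throttling are left out:

    #include <stdio.h>
    #include <string.h>

    /* Return the Dirty: figure from /proc/meminfo in kB, or -1 on error. */
    long read_dirty_kb(void)
    {
        char line[128];
        long kb = -1;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f)
            return -1;
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "Dirty:", 6) == 0) {
                sscanf(line + 6, "%ld", &kb);
                break;
            }
        }
        fclose(f);
        return kb;
    }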
I'll try to remember to pop back and update this answer if I find I need to do anything else to deal with this issue. Thanks to the other answerers.