How can I use an SD card to log 16-bit data at 48 ksamples/s?
Background
My board incorporates an STM32 microcontroller with an SD/MMC card on SPI and samples analogue data at 48 ksamples/s. I am using the Keil Real-time Library RTX kernel, and ELM FatFs.
I have a high priority task that captures analogue data via DMA in blocks of 40 samples (40 x 16 bit). The data is passed via a queue of length 128 (which constitutes about 107 ms of sample buffering) to a second, low priority task that collates sample blocks into a 2560 byte buffer (this being a multiple of both the 512 byte SD sector size and the 80 byte size of a 40 sample block). When this buffer is full (32 blocks, or approx 27 ms), the data is written to the file system.
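To make the buffer arithmetic above concrete, here is a minimal host-side sketch of the collation step. The type and function names (collator_t, collator_push, blocks_to_us) are hypothetical and are not taken from the original firmware; only the sizes (40 samples per block, 2560 byte write buffer, 512 byte sectors) come from the description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SAMPLES_PER_BLOCK 40u                                  /* one DMA block */
#define BLOCK_BYTES  (SAMPLES_PER_BLOCK * sizeof(int16_t))     /* 80 bytes */
#define SECTOR_BYTES 512u                                      /* SD sector */
#define WRITE_BYTES  2560u                                     /* 5 sectors = 32 blocks */

/* Hypothetical collation buffer; name and layout are illustrative only. */
typedef struct {
    uint8_t data[WRITE_BYTES];
    size_t  used;
} collator_t;

void collator_init(collator_t *c)
{
    /* The write buffer must hold a whole number of sectors and of blocks. */
    assert(WRITE_BYTES % SECTOR_BYTES == 0);
    assert(WRITE_BYTES % BLOCK_BYTES == 0);
    c->used = 0;
}

/* Append one DMA block. Returns 1 when the buffer has just become full and
   should be written out (then reset it with collator_init), otherwise 0. */
int collator_push(collator_t *c, const int16_t block[SAMPLES_PER_BLOCK])
{
    assert(c->used + BLOCK_BYTES <= WRITE_BYTES);
    memcpy(&c->data[c->used], block, BLOCK_BYTES);
    c->used += BLOCK_BYTES;
    return c->used == WRITE_BYTES;
}

/* Time spanned by n_blocks at the given sample rate, in microseconds
   (rounded down). */
uint32_t blocks_to_us(uint32_t n_blocks, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return (uint32_t)(((uint64_t)n_blocks * SAMPLES_PER_BLOCK * 1000000u)
                      / sample_rate_hz);
}
```

With these definitions, the 32nd call to collator_push signals a full buffer, and blocks_to_us reproduces the figures quoted above: 32 blocks span about 26.7 ms and the 128 block queue about 106.7 ms at 48 ksamples/s.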
Observation
By instrumenting the code, I can see that every 32 blocks, the data is written and that the write takes about 6 ms. This is sustained until (on FAT16) the file size gets to 1 MB, when the write operation takes 440 ms, by which time the queue fills and logging is aborted. If I format the card as FAT32, the file size before the 'long-write' event is 4 MB.
The fact that the file size at which this occurs changes between FAT16 and FAT32 suggests to me that it is not a limitation of the card but rather something that the file system does at the 1 MB or 4 MB boundaries that takes additional time.
It also appears that my tasks are being scheduled in a timely manner, and that the time is consumed in the ELM FatFs code only at the 1 MB (or 4 MB for FAT32) boundary.
The question
Is there an explanation or a solution? Is it a FAT issue, or rather specific to ELM's FatFs code perhaps?
I have considered using multiple files, but in my experience FAT does not handle large numbers of files in a single directory very well and this would simply fail also. Not using a file system at all and writing to the card raw would be a possibility, but ideally I'd like to read the data on a PC with standard drivers and no special software.
It occurred to me to try compiler optimisations to get the write time down; this seems to have an effect, but the write times became much more variable. At -O2 I did get an 8 MB file, but the results were inconsistent. I am now not sure whether there is a direct correlation between the file size and the point at which it fails; I have seen it fail in this way at various file lengths on no particular boundary. Maybe it is a card performance issue.
I further instrumented the code and applied a divide and conquer approach. This observation probably renders the question obsolete, and all previous observations are either erroneous or red herrings.
I finally narrowed it down to an instance of a multi-sector write (CMD25) where occasionally the "wait ready" polling of the card takes 174 ms for the first three sectors out of a block of 5. The timeout for wait ready is set to 500 ms, so it would happily busy-wait for that long. Using CMD24 (single sector write) iteratively is much slower in the general case: it takes 140 ms per sector every time, rather than just occasionally.
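For anyone wanting to reproduce this kind of measurement, the following is a rough sketch of an instrumented "wait ready" loop. It assumes the usual SPI-mode behaviour where a busy card holds its data output low and a ready card returns 0xFF. The sd_port_t hooks, busy_stats_t and the sim_* functions are hypothetical names invented for illustration; they are not part of ELM's driver, and the simulated card exists only so the loop can be exercised on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform hooks; names are illustrative only. */
typedef struct {
    uint8_t  (*spi_xfer)(uint8_t out);  /* exchange one byte with the card */
    uint32_t (*now_ms)(void);           /* monotonic millisecond tick */
} sd_port_t;

/* Statistics gathered across calls, to spot occasional long busy periods. */
typedef struct {
    uint32_t calls;
    uint32_t worst_ms;
    uint32_t timeouts;
} busy_stats_t;

/* Poll until the card reads back 0xFF (ready) or timeout_ms has elapsed.
   Returns 1 if ready, 0 on timeout, and records the longest wait seen. */
int sd_wait_ready(const sd_port_t *port, uint32_t timeout_ms, busy_stats_t *st)
{
    uint32_t start = port->now_ms();
    uint32_t waited;
    int ready;

    assert(timeout_ms > 0);
    do {
        ready  = (port->spi_xfer(0xFF) == 0xFF);
        waited = port->now_ms() - start;  /* unsigned maths survives tick wrap */
    } while (!ready && waited < timeout_ms);

    st->calls++;
    if (waited > st->worst_ms) st->worst_ms = waited;
    if (!ready) st->timeouts++;
    return ready;
}

/* Simulated card for host-side experiments: each poll costs 1 ms and the
   card stays busy until sim_busy_until. Purely a test double. */
uint32_t sim_clock;
uint32_t sim_busy_until;

uint32_t sim_now(void) { return sim_clock; }

uint8_t sim_xfer(uint8_t out)
{
    (void)out;
    sim_clock++;
    return (sim_clock >= sim_busy_until) ? 0xFF : 0x00;
}
```

On real hardware the two hooks would wrap the SPI peripheral and a system tick. Logging worst_ms per write makes it clear whether a long write is spent inside the card's busy period or elsewhere in the file system code.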
So it seems to be a behaviour of the card after all. I shall endeavour to try a range of SD and MMC cards.
Comments (2)
The first thing to try could be quite easy: increase the queue depth to 640. That would give you about 533 ms of buffering (640 blocks x 40 samples at 48 ksamples/s) and should survive at least this particular file system event.
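As a rough check on that sizing, the sketch below computes the smallest queue depth that rides out a given stall, the buffering time a given depth provides, and the RAM needed for the sample payload alone. The function names are made up for illustration, and the RAM figure ignores any per-entry overhead of the RTOS queue, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest queue depth (in blocks) that covers a stall of stall_ms. */
uint32_t queue_depth_for_stall(uint32_t stall_ms, uint32_t sample_rate_hz,
                               uint32_t samples_per_block)
{
    uint64_t scaled_samples;  /* samples arriving during the stall, x1000 */
    uint64_t scaled_block;    /* samples per block, x1000 */

    assert(samples_per_block > 0);
    scaled_samples = (uint64_t)stall_ms * sample_rate_hz;
    scaled_block   = (uint64_t)samples_per_block * 1000u;
    return (uint32_t)((scaled_samples + scaled_block - 1) / scaled_block);
}

/* Buffering time in milliseconds offered by a given depth (rounded down). */
uint32_t queue_depth_to_ms(uint32_t depth, uint32_t sample_rate_hz,
                           uint32_t samples_per_block)
{
    assert(sample_rate_hz > 0);
    return (uint32_t)(((uint64_t)depth * samples_per_block * 1000u)
                      / sample_rate_hz);
}

/* RAM for the 16-bit sample payload only; queue bookkeeping is not counted. */
uint32_t queue_ram_bytes(uint32_t depth, uint32_t samples_per_block)
{
    return depth * samples_per_block * (uint32_t)sizeof(int16_t);
}
```

For the numbers in the question, a 440 ms stall needs at least 528 blocks, so 640 leaves some margin; 640 blocks of 40 samples is 51200 bytes of payload, compared with 10240 bytes for the existing depth of 128.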
The second thing to look at is the configuration of the ELM FatFs. Many embedded file systems are very stingy with buffer usage by default. I've seen one that used a single 512 byte block buffer for all operations and it crawled for certain file system transactions. We gave it a couple of kilobytes and the thing became orders of magnitude faster.
Both of the above are dependent on whether you have more RAM available, of course.
A third option would be to preallocate a huge file and then just overwrite the data during data collection. That would eliminate a number of expensive cluster allocation and FAT manipulation operations.
Since compiler optimization affected this, you must also consider the possibility that it is a multi-threading issue. Are there other threads running that could disturb the lower priority reader thread? You should also try changing the buffering there to something other than a multiple of the sample size and flash block size in case you're hitting some kind of system resonance.
You (or anyone else reading this question) could try this FAT library: https://github.com/fernando-rodriguez/fat32lib.
On a 40 MIPS Microchip dsPIC33 with a 10 Mbit/s SPI bus it can log at 230 ksamples/s (16-bit) on any card I've tried.