Should I fsck ext3 on an embedded system?
We have a number of embedded systems requiring r/w access to the filesystem which resides on flash storage with block device emulation. Our oldest platform runs on compact flash and these systems have been in use for over 3 years without a single fsck being run during bootup and so far we have no failures attributed to the filesystem or CF.
On our newest platform we used USB-flash for the initial production and are now migrating to Disk-on-Module for r/w storage. A while back we had some issues with the filesystem on a lot of the devices running on USB-storage so I enabled e2fsck in order to see if that would help. As it turned out we had received a shipment of bad flash memories so once those were replaced the problem went away. I have since disabled e2fsck since we had no indication that it made the system any more reliable and historically we have been fine without it.
Now that we have started putting in Disk-on-Module units I've started seeing filesystem errors again. Suddenly the system is unable to read/write certain files and if I try to access the file from the emergency console I just get "Input/output error". I enabled e2fsck again and all the files were corrected.
O'Reilly's "Building Embedded Linux Systems" recommends running e2fsck on ext2 filesystems but does not mention it in relation to ext3, so I'm a bit confused as to whether I should enable it or not.
What are your takes on running fsck on an embedded system? We are considering putting binaries on an r/o partition, with only the files that have to be modified on an r/w partition on the same flash device, so that fsck can never accidentally delete important system binaries. Does anyone have any experience with that kind of setup (good or bad)?
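One way to express that r/o-plus-r/w split is in /etc/fstab; the device names and mount points below are assumptions for illustration:

```
# /etc/fstab — illustrative layout: system binaries read-only, data read-write
# <device>   <mount>  <type>  <options>    <dump>  <pass>
/dev/sda1    /        ext3    ro,noatime   0       0
/dev/sda2    /data    ext3    rw,noatime   0       2
```

The last column (fs_passno) controls boot-time fsck: 0 means the partition is never checked automatically, so only the r/w data partition is ever touched by an automatic repair.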
Comments (2)
Dave,
I always recommend running fsck after a number of reboots, but not every time.
The reason is that ext3 is journaled. So unless you enable writeback mode (data=writeback, which journals metadata but not file data), your metadata/filesystem tables should be in sync with your data (files) most of the time.
But as Jeff mentioned, the journal guarantees nothing about the layers above the filesystem. That means you can still get "corrupted" files, because some records may simply never have been written to the filesystem.
I'm not sure what embedded device you're running on, but how often does it get rebooted?
If it's a controlled reboot, you can always do "sync;sync;sync" before the restart.
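A minimal sketch of that pre-reboot flush (on modern Linux a single sync is sufficient; the mount point below is hypothetical):

```shell
#!/bin/sh
# Flush all dirty pages to the storage medium before a controlled restart.
# One sync(2) is enough on Linux; "sync;sync;sync" is folklore from older Unixes.
sync
# Optionally remount the r/w partition read-only first, so nothing can dirty
# it again before the restart (requires root; path is an assumption):
#   mount -o remount,ro /data
```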
I've been using CF myself for years, and only on very rare occasions have I gotten filesystem errors.
fsck does help in those cases.
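The "fsck after a number of reboots, but not every time" policy can be stored in the filesystem's own superblock with tune2fs; e2fsck honors it when run with -p from the init scripts. Sketched here on a loopback image file rather than a real device (all paths are illustrative; on a real system you would point tune2fs at the r/w block device):

```shell
#!/bin/sh
# Create a small ext3 image to demonstrate on (no root needed for a file).
dd if=/dev/zero of=/tmp/demo.img bs=1M count=8 2>/dev/null
mke2fs -q -F -j /tmp/demo.img

# Check the filesystem every 20 mounts, and disable the time-based interval.
tune2fs -c 20 -i 0 /tmp/demo.img

# The setting is recorded in the superblock:
tune2fs -l /tmp/demo.img | grep "Maximum mount count"
```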
As for separating your partitions, I doubt the advantage. For every data file on the filesystem, there is metadata associated with it. Most of the time, if you don't change the files (e.g. binaries/system files), that metadata shouldn't change either. Unless you have faulty hardware, such as cross-talk between writes and reads, those read-only files should be safe.
Most problems arise when you have something writable, and regardless of where you put it, it can cause trouble if the application doesn't handle it well.
Hope that helps.
I think the answer to your question relates more to what types of coherency requirements your application has relative to its data. That is, what has to be guaranteed if power is lost without a formal shutdown of the system? In general, none of the desktop-OS-style filesystems handle this all that well unless the application itself closes/syncs files and flushes the disk caches at key transaction points, to ensure that what you need to maintain is in fact committed to the media.
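A common shell-level pattern for those key transaction points is write-to-temp, flush, then rename: a power cut leaves either the old or the new file, never a half-written one. File names here are hypothetical; `sync FILE` requires GNU coreutils 8.24+ (on BusyBox, a plain `sync` flushes everything instead):

```shell
#!/bin/sh
# Commit a new version of a data file at a transaction point.
printf 'important-record\n' > /tmp/state.tmp
sync /tmp/state.tmp           # flush the file's data to the medium
mv /tmp/state.tmp /tmp/state  # rename is atomic within one filesystem
sync                          # flush the directory entry as well
```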
Running fsck fixes the filesystem, but without the above care there are no guarantees that the changes you made will actually be kept. That is, exactly what you lose as a result of a power failure is not deterministic.
I agree that putting your binaries or other important read-only data on a separate read-only partition does help ensure that they can't erroneously get tossed by an fsck correction to the filesystem structures. At a minimum, putting them in a different subdirectory off the root than where the r/w data is held will help. But in both cases, if you support software updates, you still need a scheme for writing to the "read-only" areas anyway.
In our application, we actually maintain a pair of directories for things like binaries, and the system is set up to boot from either of the two areas. During a software update, we update the first directory, sync everything to the media, and verify the MD5 checksums on disk before moving on to the second copy's update. During boot, a copy is only used if its MD5 checksums are good. This ensures that you are always booting a coherent image.
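The dual-area scheme above can be sketched with md5sum manifests; the directory names and files are assumptions for illustration:

```shell
#!/bin/sh
# Simulate one of the two boot areas with a manifest of checksums.
mkdir -p /tmp/slot_a/bin
printf 'fake binary\n' > /tmp/slot_a/bin/app

# Update step: after syncing files to the medium, record their checksums.
( cd /tmp/slot_a && md5sum bin/* > MD5SUMS && sync )

# Boot step: only use this slot if every checksum verifies
# (--quiet suppresses the per-file "OK" lines).
if ( cd /tmp/slot_a && md5sum -c --quiet MD5SUMS ); then
    echo "slot_a OK"                              # prints "slot_a OK"
else
    echo "slot_a corrupt, fall back to slot_b"
fi
```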