What is a good coding style for reading different bits of data from a binary file in C?
I'm a novice programmer and am writing a simple WAV player in C as a pet project. Part of the file loading process requires reading specific data (sampling rate, number of channels, ...) from the file header.
Currently what I'm doing is similar to this:
- Scan for a sequence of bytes and skip past it
- Read 2 bytes into variable a
- Check value and return on error
- Skip 4 bytes
- Read 4 bytes into variable b
- Check value and return on error
...and so on. (For the code, see: https://github.com/qgi/Player/blob/master/Importer.c)
I've written a number of helper functions to do the scanning/skipping/reading bit. Still, I'm repeating the reading, checking, and skipping steps several times, which seems neither very efficient nor very smart. It's not a real issue for my project, but as this seems to be quite a common task when handling binary files, I was wondering:
Is there some kind of pattern for doing this more effectively with cleaner code?
Comments (3)
Most often, people define structs (often with something like `#pragma pack(1)` to guard against padding) that match the file's structures. They then read data into an instance of that struct with something like `fread`, and use the values from the struct.
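As a rough illustration of the struct approach, here is a minimal sketch for the fixed 16-byte body of a WAV "fmt " chunk. The names `WavFmt` and `read_wav_fmt` are made up for this example, and it assumes a little-endian host and a compiler that understands `#pragma pack(push, 1)`; on a big-endian machine the fields would need byte-swapping after the read.

```c
#include <stdint.h>
#include <stdio.h>

/* Layout of the fixed part of a WAV "fmt " chunk body (16 bytes).
 * WavFmt is a hypothetical name used only for this sketch. */
#pragma pack(push, 1)
typedef struct {
    uint16_t audio_format;     /* 1 means uncompressed PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
} WavFmt;
#pragma pack(pop)

/* Reads one WavFmt from the current file position.
 * Assumes the file is already positioned at the start of the chunk body
 * and that the host is little-endian, as WAV data is.
 * Returns 0 on success, -1 on a short read or an implausible value. */
int read_wav_fmt(FILE *fp, WavFmt *out)
{
    if (fread(out, sizeof *out, 1, fp) != 1)
        return -1;
    if (out->num_channels == 0 || out->sample_rate == 0)
        return -1;
    return 0;
}
```

The read, check, and skip steps collapse into one `fread` followed by whatever field checks are needed; fields that are not interesting are simply ignored instead of skipped.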
The cleanest option that I've come across is the `scanf`-like function `unpack` presented by Kernighan & Pike on page 219 of The Practice of Programming, which unpacks fields from a byte buffer according to a format string.
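To give an idea of what a format-driven unpacker can look like, here is a small sketch in that spirit. It is not the book's code: the name `unpack_le`, the `x` skip code, and the little-endian byte order are assumptions chosen here because WAV headers are little-endian. It also does no bounds checking, so the caller has to make sure the buffer is long enough for the format.

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical scanf-like unpacker for little-endian data.
 * Format codes (chosen for this sketch):
 *   c  one byte      -> uint8_t *
 *   s  two bytes     -> uint16_t *
 *   l  four bytes    -> uint32_t *
 *   x  skip one byte (consumes no argument)
 * Returns the number of bytes consumed, or -1 on an unknown code. */
int unpack_le(const unsigned char *buf, const char *fmt, ...)
{
    const unsigned char *bp = buf;
    va_list args;

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'c': {
            uint8_t *out = va_arg(args, uint8_t *);
            *out = bp[0];
            bp += 1;
            break;
        }
        case 's': {
            uint16_t *out = va_arg(args, uint16_t *);
            *out = (uint16_t)(bp[0] | (bp[1] << 8));
            bp += 2;
            break;
        }
        case 'l': {
            uint32_t *out = va_arg(args, uint32_t *);
            *out = (uint32_t)bp[0] | ((uint32_t)bp[1] << 8) |
                   ((uint32_t)bp[2] << 16) | ((uint32_t)bp[3] << 24);
            bp += 4;
            break;
        }
        case 'x':
            bp += 1;
            break;
        default:
            va_end(args);
            return -1;
        }
    }
    va_end(args);
    return (int)(bp - buf);
}
```

With something like this, a 16-byte "fmt " chunk body could be decoded in one call such as `unpack_le(body, "sslxxxxxxs", &format, &channels, &rate, &bits)`, where the six `x` codes skip the byte rate and block alignment fields.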
For efficiency, read into a buffer of, say, 4096 bytes and then do your parsing on the data in the buffer. And of course, parsing in a single scan where you only ever move forward is the most efficient.
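A sketch of how that could look for a WAV file: read the start of the file into a buffer once, then walk the RIFF chunks with a cursor that only moves forward. The function names `find_chunk` and `locate_chunk` are invented for this example, and it assumes the chunk of interest lies entirely within the buffered bytes.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode a 32-bit little-endian value without relying on host byte order. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walks the chunks of a buffered RIFF/WAVE header in one forward pass.
 * On success returns the offset of the chunk body within buf and stores
 * the body size in *size_out. Returns -1 if the header is not RIFF/WAVE,
 * the chunk is not found, or a chunk runs past the buffered data. */
long find_chunk(const unsigned char *buf, size_t len,
                const char *id, uint32_t *size_out)
{
    size_t pos = 12;    /* skip "RIFF", the file size, and "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (len - pos >= 8) {
        uint32_t size = le32(buf + pos + 4);
        size_t avail = len - pos - 8;   /* bytes buffered after this chunk header */
        size_t step;

        if (size > avail)
            return -1;
        if (memcmp(buf + pos, id, 4) == 0) {
            *size_out = size;
            return (long)(pos + 8);
        }
        step = (size_t)size + (size & 1u);   /* odd-sized bodies have a pad byte */
        if (step > avail)
            return -1;
        pos += 8 + step;
    }
    return -1;
}

/* Fills buf with a single fread and then parses from memory only. */
long locate_chunk(FILE *fp, unsigned char *buf, size_t cap,
                  const char *id, uint32_t *size_out)
{
    size_t len = fread(buf, 1, cap, fp);
    return find_chunk(buf, len, id, size_out);
}
```

Calling `locate_chunk(fp, buf, 4096, "fmt ", &size)` with a 4096-byte `buf` costs one read; after that, every check and skip is plain pointer arithmetic on the buffer rather than another call into the file API.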