What type of data structure is typically used to hold a large amount of file information extracted from a firmware/compressed image?
This is my first post and it may be a naive question, but I couldn't find what I wanted to know on the internet.
I want to extract files from a firmware file and keep various kinds of information associated with each of them, such as file name, header, file body, sections, and offsets. This data is a mix of binary and text. The files are also interlinked: some properties of one file (sections, offsets, section offsets, and so on) depend on properties of other files, so I don't know of a design that completes this process, whether by reading and processing each file sequentially or simultaneously :(
I have tried creating some classes to hold this information, but I want to know what the standard approach is for this kind of file processing. It would be great if someone could offer suggestions, links, docs, or example code in C++.
Comments (1)
This is a hard problem because it is very general. The optimal strategy depends heavily on the nature of the data, how it is structured, and what you plan to do with it.
There is nothing fundamentally wrong with just reading one piece of data at a time and assembling it into the right in-memory structure as you go. Of course, lots of little reads may be slow, and the structure of the data being read may be obscured by the sheer mass of code.
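As a rough illustration of the piece-at-a-time approach, here is a minimal sketch. The record layout (a 32-bit name length, the name bytes, then a 32-bit offset and a 32-bit size, all little-endian) is an assumption made up for this example, and the names FileEntry, read_exact, read_u32_le, read_entry and read_table are hypothetical. Your firmware format will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record layout (an assumption for illustration only):
//   u32 name length, name bytes, u32 body offset, u32 body size,
//   with every integer stored little-endian.
struct FileEntry {
    std::string name;
    std::uint32_t offset = 0;  // where the file body starts in the image
    std::uint32_t size = 0;    // length of the file body in bytes
};

// Read exactly n bytes or throw, so truncation is handled in one place.
inline void read_exact(std::istream& in, char* dst, std::size_t n) {
    assert(dst != nullptr || n == 0);  // caller must supply a buffer
    in.read(dst, static_cast<std::streamsize>(n));
    if (in.gcount() != static_cast<std::streamsize>(n))
        throw std::runtime_error("unexpected end of image");
}

// Decode a little-endian u32 byte by byte, independent of host byte order.
inline std::uint32_t read_u32_le(std::istream& in) {
    unsigned char b[4];
    read_exact(in, reinterpret_cast<char*>(b), sizeof b);
    return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
           (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// Assemble one entry from several small reads.
inline FileEntry read_entry(std::istream& in) {
    FileEntry e;
    const std::uint32_t name_len = read_u32_le(in);
    if (name_len > 4096)  // arbitrary sanity limit chosen for this sketch
        throw std::runtime_error("implausible name length");
    e.name.resize(name_len);
    if (name_len > 0) read_exact(in, &e.name[0], name_len);
    e.offset = read_u32_le(in);
    e.size = read_u32_le(in);
    return e;
}

// Read `count` consecutive entries into an in-memory table.
inline std::vector<FileEntry> read_table(std::istream& in, std::size_t count) {
    std::vector<FileEntry> table;
    table.reserve(count);
    for (std::size_t i = 0; i < count; ++i) table.push_back(read_entry(in));
    return table;
}
```

Because every integer is decoded byte by byte, this variant does not care about struct packing or host endianness. The cost is the one the paragraph above mentions: many small reads, and more code between you and the shape of the data.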
Defining a pure-data object (i.e. a struct in C, or a struct or class with no methods in C++) for each cohesive bit of data on disk and sucking them in in one go is faster and more easily comprehended, but you will have to deal with getting the in-memory packing and endianness to match (I know you said that you're reading on the same machine that you wrote on, but still). However, this is not very "object oriented".
Or you could define a bunch of classes that know how to suck their contents off the disk. Nice and object oriented, and while the code may obscure the logic of the read, it does so in little chunks, with the object documentation taking up most of the slack.
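The two options above might look roughly like this. It is only a sketch under assumptions: the 16-byte header layout, its field names, and the little-endian on-disk byte order are invented for illustration, and RawHeader, load_header and FirmwareHeader are hypothetical names, not part of any real firmware format or library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size header, assumed to be little-endian on disk.
// Fields are naturally aligned, so no packing pragma should be needed;
// the static_asserts fail the build if that assumption does not hold.
struct RawHeader {
    char magic[4];
    std::uint32_t body_offset;
    std::uint32_t body_size;
    std::uint16_t section_count;
    std::uint16_t flags;
};
static_assert(sizeof(RawHeader) == 16, "unexpected padding in RawHeader");
static_assert(std::is_trivially_copyable<RawHeader>::value,
              "RawHeader must be safe to fill with memcpy");

// True when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

inline std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

inline std::uint16_t bswap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

// Pure-data approach: copy the whole header in one go, then fix byte order.
inline RawHeader load_header(const std::vector<unsigned char>& image,
                             std::size_t pos) {
    if (pos > image.size() || image.size() - pos < sizeof(RawHeader))
        throw std::out_of_range("header extends past end of image");
    assert(image.data() != nullptr);  // follows from the size check above
    RawHeader h;
    // memcpy rather than a pointer cast, to avoid alignment and aliasing issues.
    std::memcpy(&h, image.data() + pos, sizeof h);
    if (!host_is_little_endian()) {
        h.body_offset = bswap32(h.body_offset);
        h.body_size = bswap32(h.body_size);
        h.section_count = bswap16(h.section_count);
        h.flags = bswap16(h.flags);
    }
    return h;
}

// Object-oriented approach: a class that knows how to load itself
// and exposes derived values instead of raw fields.
class FirmwareHeader {
public:
    static FirmwareHeader from_image(const std::vector<unsigned char>& image,
                                     std::size_t pos) {
        return FirmwareHeader(load_header(image, pos));
    }
    // One past the last byte of the body; 64-bit so the sum cannot wrap.
    std::uint64_t body_end() const {
        return std::uint64_t(raw_.body_offset) + raw_.body_size;
    }
    std::uint16_t section_count() const { return raw_.section_count; }

private:
    explicit FirmwareHeader(const RawHeader& r) : raw_(r) {}
    RawHeader raw_;
};
```

In this sketch the class simply wraps the plain struct, which is one way to combine the two styles: the struct documents the on-disk layout, and the class hides it from the rest of the code.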
Endless variations are possible.