Efficient and flexible binary data parsing

Posted on 2024-08-03 21:59:39

I have an external device that sends UDP packets of binary data, and software running on an embedded system that needs to read this data stream, parse it, and do something useful with it. The binary data also gets logged to a file. I would like to write a parser that can take its input from either the UDP stream or a file, parse the data into a specific format, and then direct the output either to a file (e.g. a MATLAB dat file) or to another process that will do some real-time processing. Are there any resources that would help with this, and what is the best way to go about it? I think it might make sense to use C++ streams, but I'm not familiar with creating custom output streams. Does this seem like a good approach, or is there a better way?

Thanks.

Comments (2)

久光 2024-08-10 21:59:39

The beauty of binary data is that it generally has a very fixed format.
A typical method of parsing it is to declare a structure that maps onto the received packets, and then to use type-casts to read the fields as structure elements.

The beauty is that this requires no parsing.
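
As a rough illustration of the structure-mapping idea, here is a minimal sketch. The packet layout (SensorPacket and its fields) and the helper read_packet are hypothetical, since the thread does not give the device format. The sketch uses memcpy rather than a pointer type-cast, which is a swapped-in variant that sidesteps alignment and strict-aliasing problems on some targets; fields are left in whatever byte order the device sent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical packet layout, for illustration only; the real device
// format is not given in the thread.
#pragma pack(push, 1)
struct SensorPacket {
    std::uint8_t  type;
    std::uint8_t  flags;
    std::uint16_t sequence;
    std::uint32_t value;
};
#pragma pack(pop)

// Copy the received bytes into the struct. memcpy is used instead of
// casting the buffer pointer, which avoids alignment and aliasing issues.
// Multi-byte fields keep the byte order they had on the wire.
inline bool read_packet(const std::uint8_t* buf, std::size_t len,
                        SensorPacket& out) {
    assert(buf != nullptr);
    if (len < sizeof(SensorPacket)) {
        return false;  // too short to contain a full packet
    }
    std::memcpy(&out, buf, sizeof(SensorPacket));
    return true;
}
```

The same function works whether the bytes came from a UDP receive call or were read back from the log file, since it only sees a byte buffer.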

You have to be careful about structure packing rules and endianness so that the structure maps onto the data exactly as the device lays it out. The C "offsetof" macro and "sizeof" operator are useful for emitting some debug info to check that your structure really maps the way you think it does.
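
One possible way to do that check, sketched under the assumption of a hypothetical 8-byte packet (StatusPacket and describe_layout are illustrative names, not from the thread). The static_assert lines catch a layout mismatch at compile time, and describe_layout produces the kind of debug line that can be logged at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical packet layout, for illustration only.
#pragma pack(push, 1)
struct StatusPacket {
    std::uint8_t  type;
    std::uint8_t  flags;
    std::uint16_t sequence;
    std::uint32_t value;
};
#pragma pack(pop)

// Compile-time checks: the build fails if the compiler lays the struct
// out differently from the assumed wire format.
static_assert(sizeof(StatusPacket) == 8, "unexpected packet size");
static_assert(offsetof(StatusPacket, sequence) == 2, "unexpected offset");
static_assert(offsetof(StatusPacket, value) == 4, "unexpected offset");

// Run-time debug info: format the size and field offsets into one line.
inline std::string describe_layout() {
    char buf[128];
    int n = std::snprintf(buf, sizeof buf,
                          "size=%zu type@%zu flags@%zu sequence@%zu value@%zu",
                          sizeof(StatusPacket),
                          offsetof(StatusPacket, type),
                          offsetof(StatusPacket, flags),
                          offsetof(StatusPacket, sequence),
                          offsetof(StatusPacket, value));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```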

Packing rules can typically be altered either by directives (such as #pragma directives) or by command-line options. Endianness you are stuck with. If it's different from what your embedded system uses, declare all the fields as bytes, or use something like the "ntohs"/"ntohl" functions to do the byte swapping.
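
A minimal sketch of the "treat the fields as bytes" option, assuming (hypothetically) that the device sends multi-byte fields in big-endian order and uses the illustrative 8-byte layout below. The helper names load_be16, load_be32 and decode are made up for this example. Assembling values with shifts gives the same result on any host byte order; on POSIX systems "ntohs"/"ntohl" are an alternative for network-order data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assemble big-endian values from individual bytes. Because the bytes are
// combined arithmetically, the result does not depend on host endianness.
inline std::uint16_t load_be16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

inline std::uint32_t load_be32(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical decoded form of an 8-byte packet, for illustration only.
struct DecodedPacket {
    std::uint8_t  type;
    std::uint8_t  flags;
    std::uint16_t sequence;
    std::uint32_t value;
};

// Decode field by field from the raw bytes (assumed big-endian on the wire).
inline DecodedPacket decode(const std::uint8_t* buf, std::size_t len) {
    assert(buf != nullptr && len >= 8);
    DecodedPacket p;
    p.type     = buf[0];
    p.flags    = buf[1];
    p.sequence = load_be16(buf + 2);
    p.value    = load_be32(buf + 4);
    return p;
}
```

This costs a few shifts per field but removes any dependence on packing pragmas, since no struct is ever overlaid on the buffer.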

椒妓 2024-08-10 21:59:39

The New Jersey Machine Code Toolkit is a scheme for decoding arbitrary binary patterns. It was originally designed for decoding instruction sets, but it ought to work just as well for decoding message formats. You provide a description of the binary format, and it synthesizes code to access the fields of that format (when valid). Thus you can refer to message fields using generated function calls rather than thinking about where a field is or how it is encoded.
