How to read binary C++ protobuf data using Python protobuf?

Posted 2024-08-13 23:33:19

The Python version of Google protobuf gives us only:

SerializeAsString()

whereas the C++ version gives us both:

SerializeToArray(...)
SerializeAsString()

Our C++ program writes its output file in binary format, and we'd like to keep it that way. That said, is there a way of reading the binary data into Python and parsing it as if it were a string?

Is this the correct way of doing it?

binary = get_binary_data()
binary_size = get_binary_size()

# Copy the buffer into a byte string one byte at a time
string = b''
for i in range(binary_size):
    string += binary[i:i + 1]

message = MyMessage()
message.ParseFromString(string)

Update:

Here's a new example, and a problem:

message_length = 512

# Read the file in fixed-size 512-byte records and parse each one
with open('foobars.bin', 'rb') as file:
    while True:
        data = file.read(message_length)
        if not data:
            break  # end of file

        foo_bar = FooBar()
        foo_bar.ParseFromString(data)

When execution reaches the foo_bar.ParseFromString(data) line, I get this error:

Exception Type: DecodeError
Exception Value: Too many bytes when decoding varint.

Update 2:

It turns out that the padding on the binary data was throwing protobuf off: as the message suggests, too many bytes were being passed in (in this case, the padding bytes).
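To see why padding can produce this particular message, here is a minimal sketch of a base-128 varint decoder. It is not the protobuf library's own code; the function name `decode_varint` and the use of `ValueError` are assumptions made for illustration. The point is that the padding byte 0xCC has its high (continuation) bit set, so a run of padding looks like a varint that never ends.

```python
from typing import Tuple


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one base-128 varint from buf starting at pos.

    Returns (value, next_pos). Raises ValueError if the varint is truncated
    or longer than 10 bytes (the maximum for a 64-bit value).
    """
    result = 0
    for count in range(10):
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * count)
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
    raise ValueError("too many bytes when decoding varint")


# 300 encodes as the two bytes 0xAC 0x02
print(decode_varint(b"\xac\x02"))

# A run of 0xCC padding never clears the continuation bit
try:
    decode_varint(b"\xcc" * 16)
except ValueError as exc:
    print("error:", exc)
```
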

This padding comes from using the C++ protobuf function SerializeToArray on a fixed-length buffer. To eliminate it, I have used this temporary code:

message_length = 512

with open('foobars.bin', 'rb') as file:
    while True:
        data = file.read(message_length)
        if not data:
            break  # end of file

        # Strip every 0xCC padding byte from the record (yuck!)
        string = data.replace(b'\xcc', b'')

        foo_bar = FooBar()
        foo_bar.ParseFromString(string)

I think there is a design flaw here. I will re-implement my C++ code so that it writes variable-length arrays to the binary file. As advised by the protobuf documentation, I will prefix each message with its binary size so that I know how much to read when I open the file with Python.


Comments (2)

空心空情空意 2024-08-20 23:33:19

I'm not an expert with Python, but you can pass the result of a file.read() operation into message.ParseFromString(...) without having to build a new string type or anything.

老娘不死你永远是小三 2024-08-20 23:33:19

Python strings can contain any character, i.e. they are capable of holding "binary" data directly. There should be no need to convert from string to "binary".
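As a small check of the point made in the comments, the sketch below writes every possible byte value to a temporary file and reads it back. It assumes Python 3, where the "string" returned by reading a file opened in 'rb' mode is a `bytes` object; the file name `sample.bin` is hypothetical.

```python
import os
import tempfile

# Every possible byte value, including NUL and the 0xCC padding byte
original = bytes(range(256))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(original)
    with open(path, "rb") as f:  # 'rb' disables any text decoding
        data = f.read()

print(type(data).__name__, len(data), data == original)
```

Because the bytes come back unchanged, the result of `file.read()` can be passed straight to `ParseFromString` with no intermediate conversion, provided it contains exactly one serialized message.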
