Using protobuf-net, I suddenly got an exception about an unknown wire-type

Posted on 2024-08-19 10:13:55

(this is a re-post of a question that I saw in my RSS, but which was deleted by the OP. I've re-added it because I've seen this question asked several times in different places; wiki for "good form")

Suddenly, I receive a ProtoException when deserializing and the message is: unknown wire-type 6

  • What is a wire-type?
  • What are the different wire-type values and their description?
  • I suspect a field is causing the problem, how to debug this?

作妖 2024-08-26 10:13:55

First thing to check:

IS THE INPUT DATA PROTOBUF DATA? If you try and parse another format (json, xml, csv, binary-formatter), or simply broken data (an "internal server error" html placeholder text page, for example), then it won't work.


What is a wire-type?

It is a 3-bit flag that tells it (in broad terms; it is only 3 bits after all) what the next data looks like.

Each field in protocol buffers is prefixed by a header that tells it which field (number) it represents,
and what type of data is coming next; this "what type of data" is essential to support the case where
unanticipated data is in the stream (for example, you've added fields to the data-type at one end), as
it lets the serializer know how to read past that data (or store it for round-trip if required).
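
As a rough illustration of the header described above, the sketch below splits an already-decoded header value into its field number and wire-type. The class and method names are made up for this example and are not part of protobuf-net; real headers are varints and may span several bytes once the field number exceeds 15.

```csharp
using System;

public static class FieldHeader
{
    // Hypothetical helper: the low 3 bits are the wire-type,
    // the remaining bits are the field number.
    public static (int FieldNumber, int WireType) Decode(uint header)
    {
        int wireType = (int)(header & 7);
        int fieldNumber = (int)(header >> 3);
        return (fieldNumber, wireType);
    }

    public static void Main()
    {
        // 0x08: field 1, wire-type 0, a typical header for an int member
        Console.WriteLine(FieldHeader.Decode(0x08));
        // 0x6E is the ASCII code for 'n'; read as a header it gives field 13, wire-type 6
        Console.WriteLine(FieldHeader.Decode(0x6E));
    }
}
```

The second call hints at why stray text or leftover bytes so often surface as a wire-type error: almost any byte can be read as a header, but not every 3-bit value is a defined wire-type.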

What are the different wire-type values and their description?

  • 0: variable-length integer, "varint" (up to 64 bits) - base-128 encoded with the MSB indicating continuation (used as the default for integer types, including enums)
  • 1: 64-bit - 8 bytes of data (used for double, or electively for long/ulong)
  • 2: length-prefixed - first read an integer using variable-length encoding; this tells you how many bytes of data follow (used for strings, byte[], "packed" arrays, and as the default for child objects properties / lists)
  • 3: "start group" - an alternative mechanism for encoding child objects that uses start/end tags - largely deprecated by Google; it is more expensive to skip an entire child-object field, since you can't just "seek" past an unexpected object
  • 4: "end group" - twinned with 3
  • 5: 32-bit - 4 bytes of data (used for float, or electively for int/uint and other small integer types)
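
Values 6 and 7 do not appear in the list above, which is why a header that decodes to 6 is reported as an unknown wire-type. To make wire-type 0 concrete, here is a minimal sketch of a base-128 varint reader using only the .NET base class library; the `Varint` class is a hypothetical name for illustration, not protobuf-net's own reader.

```csharp
using System;
using System.IO;

public static class Varint
{
    // Reads one base-128 varint: 7 payload bits per byte, least-significant
    // group first; a set MSB means another byte follows.
    public static ulong Read(Stream source)
    {
        ulong value = 0;
        for (int shift = 0; shift < 64; shift += 7)
        {
            int b = source.ReadByte();
            if (b < 0) throw new EndOfStreamException("Stream ended inside a varint.");
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value; // MSB clear: last byte
        }
        throw new InvalidDataException("Varint is longer than 10 bytes.");
    }

    public static void Main()
    {
        // 0x96 0x01 is the two-byte varint encoding of 150
        var stream = new MemoryStream(new byte[] { 0x96, 0x01 });
        Console.WriteLine(Varint.Read(stream));
    }
}
```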

I suspect a field is causing the problem, how to debug this?

Are you serializing to a file? The most likely cause (in my experience) is that you have overwritten an existing file, but have not truncated it; i.e. it was 200 bytes, and you have re-written it with only 182 bytes. There are now 18 bytes of garbage on the end of your stream that are tripping it up. Files must be truncated when re-writing protocol buffers. You can do this with FileMode:

using(var file = new FileStream(path, FileMode.Truncate)) {
    // write
}

or alternatively by SetLength after writing your data:

file.SetLength(file.Position);
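
The 200-versus-182-byte scenario can be reproduced without protobuf-net at all. The sketch below (the `TruncateDemo` class and its `Rewrite` helper are hypothetical names) writes plain zero bytes and compares `FileMode.OpenOrCreate`, which is also what `File.OpenWrite` uses, against `FileMode.Create`. Unlike `FileMode.Truncate`, `FileMode.Create` also works when the file does not exist yet.

```csharp
using System;
using System.IO;

public static class TruncateDemo
{
    // Overwrites the file from position 0 with newLength bytes
    // and returns the resulting file length.
    public static long Rewrite(string path, FileMode mode, int newLength)
    {
        using (var file = new FileStream(path, mode, FileAccess.Write))
        {
            file.Write(new byte[newLength], 0, newLength);
        }
        return new FileInfo(path).Length;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[200]);
            // Still 200 bytes: the last 18 bytes of the old content survive
            Console.WriteLine(Rewrite(path, FileMode.OpenOrCreate, 182));
            // 182 bytes: the old content is discarded before writing
            Console.WriteLine(Rewrite(path, FileMode.Create, 182));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```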

Other possible cause

You are (accidentally) deserializing a stream into a different type than what was serialized. It's worth double-checking both sides of the conversation to ensure this is not happening.

萧瑟寒风 2024-08-26 10:13:55

Since the stack trace references this StackOverflow question, I thought I'd point out that you can also receive this exception if you (accidentally) deserialize a stream into a different type than what was serialized. So it's worth double-checking both sides of the conversation to ensure this is not happening.

那伤。 2024-08-26 10:13:55

This can also be caused by an attempt to write more than one protobuf message to a single stream. The solution is to use SerializeWithLengthPrefix and DeserializeWithLengthPrefix.


Why this happens:

The protobuf specification supports a fairly small number of wire-types (the binary storage formats) and data-types (the .NET etc. data-types). Additionally, this is not 1:1, nor is it 1:many or many:1 - a single wire-type can be used for multiple data-types, and a single data-type can be encoded via any of multiple wire-types. As a consequence, you cannot fully understand a protobuf fragment unless you already know the schema, so you know how to interpret each value. When you are, say, reading an Int32 data-type, the supported wire-types might be "varint", "fixed32" and "fixed64", whereas when reading a String data-type, the only supported wire-type is "string".

If there is no compatible map between the data-type and wire-type, then the data cannot be read, and this error is raised.

Now let's look at why this occurs in the scenario here:

using System;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class Data1
{
    [ProtoMember(1, IsRequired=true)]
    public int A { get; set; }
}

[ProtoContract]
public class Data2
{
    [ProtoMember(1, IsRequired = true)]
    public string B { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var d1 = new Data1 { A = 1};
        var d2 = new Data2 { B = "Hello" };
        var ms = new MemoryStream();
        Serializer.Serialize(ms, d1); 
        Serializer.Serialize(ms, d2);
        ms.Position = 0;
        var d3 = Serializer.Deserialize<Data1>(ms); // This will fail
        var d4 = Serializer.Deserialize<Data2>(ms);
        Console.WriteLine("{0} {1}", d3, d4);
    }
}

In the above, two messages are written directly after each other. The complication is: protobuf is an appendable format, with append meaning "merge". A protobuf message does not know its own length, so the default way of reading a message is: read until EOF. However, here we have appended two different types. If we read this back, it does not know when we have finished reading the first message, so it keeps reading. When it gets to data from the second message, we find ourselves reading a "string" wire-type, but we are still trying to populate a Data1 instance, for which member 1 is an Int32. There is no map between "string" and Int32, so it explodes.

The *WithLengthPrefix methods allow the serializer to know where each message finishes; so, if we serialize a Data1 and Data2 using the *WithLengthPrefix, then deserialize a Data1 and a Data2 using the *WithLengthPrefix methods, then it correctly splits the incoming data between the two instances, only reading the right value into the right object.
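
A minimal sketch of that round trip, assuming the protobuf-net package is referenced and reusing the Data1 and Data2 contracts from the failing example; the `LengthPrefixDemo` class name is only for illustration:

```csharp
using System;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class Data1
{
    [ProtoMember(1, IsRequired = true)]
    public int A { get; set; }
}

[ProtoContract]
public class Data2
{
    [ProtoMember(1, IsRequired = true)]
    public string B { get; set; }
}

public static class LengthPrefixDemo
{
    public static void Main()
    {
        using (var ms = new MemoryStream())
        {
            // Each message is preceded by its own length
            Serializer.SerializeWithLengthPrefix(ms, new Data1 { A = 1 }, PrefixStyle.Base128);
            Serializer.SerializeWithLengthPrefix(ms, new Data2 { B = "Hello" }, PrefixStyle.Base128);
            ms.Position = 0;

            // Each read stops at the end of its own message
            var d1 = Serializer.DeserializeWithLengthPrefix<Data1>(ms, PrefixStyle.Base128);
            var d2 = Serializer.DeserializeWithLengthPrefix<Data2>(ms, PrefixStyle.Base128);
            Console.WriteLine("{0} {1}", d1.A, d2.B);
        }
    }
}
```

The prefix style used for reading has to match the one used for writing, otherwise the reader misjudges where each message ends.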

Additionally, when storing heterogeneous data like this, you might want to assign (via *WithLengthPrefix) a different field-number to each class; this gives greater visibility of which type is being deserialized. There is also a method in Serializer.NonGeneric which can then be used to deserialize the data without needing to know in advance what we are deserializing:

// Data1 is "1", Data2 is "2"
Serializer.SerializeWithLengthPrefix(ms, d1, PrefixStyle.Base128, 1);
Serializer.SerializeWithLengthPrefix(ms, d2, PrefixStyle.Base128, 2);
ms.Position = 0;

var lookup = new Dictionary<int,Type> { {1, typeof(Data1)}, {2,typeof(Data2)}};
object obj;
while (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(ms,
    PrefixStyle.Base128, fieldNum => lookup[fieldNum], out obj))
{
    Console.WriteLine(obj); // writes Data1 on the first iteration,
                            // and Data2 on the second iteration
}

无悔心 2024-08-26 10:13:55

Previous answers already explain the problem better than I can. I just want to add an even simpler way to reproduce the exception.

This error will also occur simply if the type of a serialized ProtoMember is different from the expected type during deserialization.

For instance if the client sends the following message:

[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Foo{ get; set; }
}

But the server deserializes the message into the following class:

[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public string Foo{ get; set; }
}

This results in an error message that is slightly misleading in this case:

ProtoBuf.ProtoException: Invalid wire-type; this usually means you have over-written a file without truncating or setting the length

It will even occur if the property name changed. Let's say the client sent the following instead:

[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Bar{ get; set; }
}

Members are matched by field number, not by name, so the server will still try to deserialize the int Bar into the string Foo, which causes the same ProtoBuf.ProtoException.

I hope this helps somebody debugging their application.

软糯酥胸 2024-08-26 10:13:55

Also check the obvious: that all your subclasses have the [ProtoContract] attribute. It is easy to miss one when you have a rich DTO.

画▽骨i 2024-08-26 10:13:55

I've seen this issue when using the wrong Encoding to convert the bytes to and from strings.

Need to use Encoding.Default and not Encoding.UTF8.

using (var ms = new MemoryStream())
{
    Serializer.Serialize(ms, obj);
    var bytes = ms.ToArray();
    str = Encoding.Default.GetString(bytes);
}
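
One caveat worth checking before relying on this: Encoding.Default is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later, so the snippet above may behave differently depending on the runtime. The sketch below uses only the base class library (the `TextRoundTrip` class is a hypothetical name) to show why a text encoding can corrupt a protobuf payload, and that Base64 keeps the bytes intact.

```csharp
using System;
using System.Linq;
using System.Text;

public static class TextRoundTrip
{
    // True if bytes -> UTF-8 string -> bytes returns the original payload.
    public static bool SurvivesUtf8(byte[] payload) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(payload)).SequenceEqual(payload);

    // True if bytes -> Base64 string -> bytes returns the original payload.
    public static bool SurvivesBase64(byte[] payload) =>
        Convert.FromBase64String(Convert.ToBase64String(payload)).SequenceEqual(payload);

    public static void Main()
    {
        // 08 96 01 encodes "field 1 = 150"; a lone 0x96 is not valid UTF-8,
        // so decoding replaces it and the bytes change on the way back.
        byte[] payload = { 0x08, 0x96, 0x01 };
        Console.WriteLine(SurvivesUtf8(payload));   // False
        Console.WriteLine(SurvivesBase64(payload)); // True
    }
}
```

If the serialized data has to travel as a string, Base64 is therefore the safer choice in this sketch, at the cost of roughly a third more characters.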

尸血腥色 2024-08-26 10:13:55

If you are using SerializeWithLengthPrefix, note that casting the instance to object breaks the deserialization code and causes ProtoBuf.ProtoException : Invalid wire-type.

using (var ms = new MemoryStream())
{
    var msg = new Message();
    Serializer.SerializeWithLengthPrefix(ms, (object)msg, PrefixStyle.Base128); // Casting msg to object breaks the deserialization code.
    ms.Position = 0;
    Serializer.DeserializeWithLengthPrefix<Message>(ms, PrefixStyle.Base128);
}

电影里的梦 2024-08-26 10:13:55

This happened in my case because I had something like this:

var ms = new MemoryStream();
Serializer.Serialize(ms, batch);

_queue.Add(Convert.ToBase64String(ms.ToArray()));

So basically I was putting a Base64 string into a queue and then, on the consumer side, I had:

var stream = new MemoryStream(Encoding.UTF8.GetBytes(myQueueItem));
var batch = Serializer.Deserialize<List<EventData>>(stream);

So although the type of each myQueueItem was correct, I forgot that the string was Base64-encoded; reading it back as UTF-8 bytes does not reproduce the original protobuf bytes. The solution was to decode the Base64 first:

var bytes = Convert.FromBase64String(myQueueItem);
var stream = new MemoryStream(bytes);
var batch = Serializer.Deserialize<List<EventData>>(stream);

生寂 2024-08-26 10:13:55

It can also happen when you serialize with version 1 of a ProtoContract and then
deserialize the result with version 2 of that contract.

E.g. you have changed a ProtoMember's number or order, or perhaps its type.
You can, however, add ProtoMembers with new numbers without breaking deserialization.
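
As a sketch of what that looks like in practice (the Customer classes are invented for this example and assume the protobuf-net package is referenced), the second version below only adds a new number, while the third changes the type behind an existing number:

```csharp
using ProtoBuf;

// Version 1 of the contract, as used by the writer.
[ProtoContract]
public class CustomerV1
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

// Compatible change: numbers 1 and 2 keep their types, number 3 is new.
// Data written as CustomerV1 simply leaves Email unset.
[ProtoContract]
public class CustomerV2
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
    [ProtoMember(3)] public string Email { get; set; }
}

// Breaking change: number 1 switches from int to string, so data written
// as CustomerV1 carries a wire-type that cannot be read into this member.
[ProtoContract]
public class CustomerBroken
{
    [ProtoMember(1)] public string Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}
```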

北凤男飞 2024-08-26 10:13:55

using (var fs = File.OpenRead(source))
using (var brotliStream = new BrotliStream(fs, CompressionMode.Decompress))
{
    // fs is passed here rather than brotliStream, so the serializer sees the raw file bytes
    var index = Serializer.Deserialize<List<Files>>(fs);  // <--- Error
}

In my case the Brotli stream was using the source stream. The exception was fixed once I placed the BrotliStream after deserialization.
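
One possible arrangement, offered as a sketch rather than as the poster's actual fix, is to hand the decompressing wrapper to the serializer instead of the underlying file stream. This assumes the file was written through a BrotliStream in the first place and that the protobuf-net package is referenced; `CompressedIndex` and its methods are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using ProtoBuf;

public static class CompressedIndex
{
    // Serializes the list through a compressing wrapper around the file.
    public static void Save<T>(string path, List<T> items)
    {
        using (var fs = File.Create(path))
        using (var brotli = new BrotliStream(fs, CompressionMode.Compress))
        {
            Serializer.Serialize(brotli, items);
        }
    }

    // Deserializes from the decompressing wrapper, not from fs:
    // fs holds Brotli bytes, which are not valid protobuf data.
    public static List<T> Load<T>(string path)
    {
        using (var fs = File.OpenRead(path))
        using (var brotli = new BrotliStream(fs, CompressionMode.Decompress))
        {
            return Serializer.Deserialize<List<T>>(brotli);
        }
    }
}
```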
