Parsing variable-length messages

Posted 2024-07-24 13:05:01, 770 words, 13 views, 0 comments

I am implementing the BitTorrent protocol in Java, following this spec. In the messages section, all messages are fixed length except two. One of them, the piece message, is the only variable-length message after the handshake, so I can check for the others and assume it is a piece message when nothing else matches. But then there is the following message:

bitfield: <len=0001+X><id=5><bitfield>

The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.

The bitfield message is variable length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicated a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.

A bitfield of the wrong length is considered an error. Clients should drop the connection if they receive bitfields that are not of the correct size, or if the bitfield has any of the spare bits set.
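
As a rough illustration of the length and spare-bit rules quoted above, here is a minimal Java sketch. The class name `BitfieldCheck` and its methods are hypothetical, and it assumes the piece count (`numPieces`) is already known from the torrent metadata.

```java
/** Hypothetical helper illustrating the bitfield rules quoted from the spec. */
final class BitfieldCheck {
    private BitfieldCheck() {}

    /** Bytes needed to hold one bit per piece: ceil(numPieces / 8). */
    static int expectedLength(int numPieces) {
        return (numPieces + 7) / 8;
    }

    /** True if the payload has the right size and all spare bits are zero. */
    static boolean isValid(byte[] payload, int numPieces) {
        if (payload.length != expectedLength(numPieces)) {
            return false; // wrong size: the spec says to drop the connection
        }
        int spare = payload.length * 8 - numPieces; // unused low bits of the last byte
        if (spare == 0) {
            return true;
        }
        int spareMask = (1 << spare) - 1;
        return (payload[payload.length - 1] & spareMask) == 0;
    }

    /** The high bit of the first byte corresponds to piece index 0. */
    static boolean hasPiece(byte[] payload, int index) {
        return (payload[index / 8] & (0x80 >>> (index % 8))) != 0;
    }
}
```

With 10 pieces, for example, the payload must be exactly 2 bytes, and the low 6 bits of the second byte must be zero.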

I can't come up with a way to parse it if I do not know the length. How am I supposed to locate the id in a stream of bytes?

Edit: The payload of the bitfield message holds a 0 or 1 for each piece in the torrent file, so the length of the message changes with the size of the torrent content. So I don't think I can assume that the number of pieces will always fit in a 5-byte number.

Comments (4)

死开点丶别碍眼 2024-07-31 13:05:02

The id field will always be the 5th byte of a message, after the four bytes for the len field. You can do something like the following:

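DataInputStream stream;

// ...

int length = stream.readInt();              // four-byte big-endian length prefix
if (length > 0) {                           // length 0 is a keep-alive: no id, no payload
    byte   id      = stream.readByte();     // message id (5 for bitfield)
    byte[] payload = new byte[length - 1];  // length counts the id byte too

    stream.readFully(payload);              // blocks until the whole payload has arrived
}

That should work for any message, actually, since they all have the same len+id header. The one exception is keep-alive, which is just a zero length with no id.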

Edit: "So i don't think i can assume that the number of pieces will always fit in a 5 byte number."

A four-byte length field can describe up to 2^32-1 bytes. One of those bytes is the id, so the bitfield itself can be up to 2^32-2 bytes, and with 8 bits per byte that gives you room for 34,359,738,352 pieces. That should be plenty! :-)

意犹 2024-07-31 13:05:02

I can't come up with a way to parse it
if i do not know the length;

Judging from the description, the length is given in the first 4 bytes of the message.

how am I supposed to locate id in a
stream of bytes?

It looks as though the id is the 5th byte in each message, right after the length field. So you just have to look at the first 5 bytes after you're finished parsing the previous message.
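
To make the "length, then id, then payload" idea concrete, here is a minimal sketch of reading one message at a time from the stream. The names `PeerWire`, `Message`, `KEEP_ALIVE` and `MAX_LENGTH` are hypothetical, and the 1 MiB cap is an assumed sanity limit, not something taken from the spec.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical reader for length-prefixed peer messages. */
final class PeerWire {
    static final int KEEP_ALIVE = -1;       // marker for a message with no id
    static final int MAX_LENGTH = 1 << 20;  // assumed cap to reject garbage prefixes

    static final class Message {
        final int id;
        final byte[] payload;

        Message(int id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /** Reads exactly one message; the stream is then positioned at the next length prefix. */
    static Message readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();                  // 4-byte big-endian length prefix
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("implausible length prefix: " + length);
        }
        if (length == 0) {
            return new Message(KEEP_ALIVE, new byte[0]); // keep-alive has no id byte
        }
        int id = in.readUnsignedByte();             // 5th byte of the message
        byte[] payload = new byte[length - 1];      // length includes the id byte
        in.readFully(payload);
        return new Message(id, payload);
    }
}
```

Calling `readMessage` in a loop handles fixed-length and variable-length messages the same way, because the prefix always says how many bytes belong to the current message.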

高速公鹿 2024-07-31 13:05:02

Earlier in the spec you referenced, I read: 'The length prefix is a four byte big-endian value.' I take that to mean: read the next four bytes, convert them to an int, and that is your length. If you are unfamiliar with the bytes-to-int conversion process, I've used something similar to this.
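
For reference, one way to do a big-endian bytes-to-int conversion in Java is sketched below. The class name `BigEndian` and both method names are hypothetical. The first method spells out the shifts, and the second relies on `java.nio.ByteBuffer`, which is big-endian by default.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: decode a 4-byte big-endian integer. */
final class BigEndian {
    private BigEndian() {}

    /** Manual version: the first byte is the most significant. */
    static int toInt(byte[] b, int offset) {
        return ((b[offset]     & 0xFF) << 24)
             | ((b[offset + 1] & 0xFF) << 16)
             | ((b[offset + 2] & 0xFF) << 8)
             |  (b[offset + 3] & 0xFF);
    }

    /** Same result using ByteBuffer, whose default byte order is big-endian. */
    static int toIntViaBuffer(byte[] b, int offset) {
        return ByteBuffer.wrap(b, offset, 4).getInt();
    }
}
```

`DataInputStream.readInt()` performs the same big-endian decoding, so when reading straight from a stream no manual conversion is needed.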

眼中杀气 2024-07-31 13:05:02

I've not read the spec in detail but without either explicitly knowing the length of a variable length field or some termination delimiter, I don't see how you can process it either. Does the bitfield=<len=0001+X> not perhaps indicate that you will be told of the (variable) length up-front?
