Why does the data I receive from a peer not match the expected output?
In my spare time, I have been working on implementing a BitTorrent client in C. Currently it communicates with the tracker, connects to the swarm, requests pieces of the torrent file from peers, and receives pieces of the torrent file. However, when it comes to verifying that the received piece is correct (by taking a SHA1 hash and comparing it to the hash provided in the .torrent metadata), it always fails.
To debug this, I downloaded a torrent with a known-working BitTorrent client, and then modified my own BitTorrent implementation to request and download only the very beginning of the torrent (the first piece). I then compared the two files with Emacs' hexl-mode.
Known good:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
...
00008000: 0143 4430 3031 0100 4c49 4e55 5820 2020 .CD001..LINUX
00008010: 2020 2020 2020 2020 2020 2020 2020 2020
00008020: 2020 2020 2020 2020 5562 756e 7475 2031 Ubuntu 1
00008030: 312e 3034 2069 3338 3620 2020 2020 2020 1.04 i386
My implementation:
00000000: a616 f132 7f00 0080 5066 0000 0000 0080 ...2....Pf......
00000010: 5066 0000 0000 0060 3b62 0000 0000 0098 Pf.....`;b......
00000020: 3b62 0000 0000 00d0 3b62 0000 0000 0008 ;b......;b......
00000030: 3c62 0000 0000 0040 3c62 0000 0000 0078 <b.....@<b.....x
00000040: 3c62 0000 0000 00b0 3c62 0000 0000 00e8 <b......<b......
00000050: 3c62 0000 0000 0020 3d62 0000 0000 0058 <b..... =b.....X
00000060: 3d62 0000 0000 0090 3d62 0000 0000 00c8 =b......=b......
00000070: 3d62 0000 0000 0000 3e62 0000 0000 0038 =b......>b.....8
...
0000d000: 0243 4430 3031 0100 004c 0049 004e 0055 .CD001...L.I.N.U
0000d010: 0058 0020 0020 0020 0020 0020 0020 0020 .X. . . . . . .
0000d020: 0020 0020 0020 0020 0055 0062 0075 006e . . . . .U.b.u.n
0000d030: 0074 0075 0020 0031 0031 002e 0030 0034 .t.u. .1.1...0.4
0000d040: 0020 0069 0033 0038 0000 0000 0000 0000 . .i.3.8........
I figured, then, that I must be writing the received piece to the wrong offset, so that the correct data ends up at the wrong location in the file. To verify this, I fired up gdb and inspected the very beginning of the first piece after receiving it from a peer, expecting it to contain all zeroes, like the beginning of the known-good file.
(gdb) break network.c:40
Breakpoint 1 at 0x402fe7: file network.c, line 40.
(gdb) run
Starting program: /home/robb/slug/slug
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffcb58d700 (LWP 12936)]
[Thread 0x7fffcb58d700 (LWP 12936) exited]
ANNOUNCE: 50 peers.
CONNECTED: 62.245.41.28
CONNECTED: 89.178.142.45
CONNECTED: 66.65.166.17
...
UNCHOKE: 95.26.0.1
Requested piece 0 from peer 95.26.0.1.
UNCHOKE: 202.231.116.163
PIECE: #0 from 95.26.0.1
Breakpoint 1, handle_piece (p=0x42d7e0) at network.c:41
41 memcpy(p->torrent->mmap + length, &p->message[9], REQUEST_LENGTH);
(gdb) p off
$1 = 0
(gdb) p index
$2 = 0
(gdb) p p->message[9]
$3 = 46 '.'
(gdb) p p->message[10]
$4 = 67 'C'
(gdb) p p->message[11]
$5 = 0 '\000'
(gdb) p p->message[12]
$6 = 0 '\000'
(gdb) p p->message[13]
$7 = 0 '\000'
(gdb) p p->message[14]
$8 = 0 '\000'
(gdb) p p->message[15]
$9 = 0 '\000'
(gdb) p p->message[16]
$10 = 128 '\200'
(gdb) p p->message[17]
$11 = 46 '.'
(gdb) p p->message[18]
$12 = 67 'C'
As you can see, the data I received from the peer doesn't contain all zeroes like the beginning of the known-good file. Why?
The full source of my program is available at https://github.com/robertseaton/slug.
Comments (2)
This fails to take into account that bufferevent_read may fail and return a negative amount:
Replace with:
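As an illustration of the general point (this is not the answerer's original snippet, and it does not use libevent's API), here is a minimal sketch of reading a message defensively: the return value of every read is checked before it is used as a length, and the handler only reports success once the whole message has arrived. The names `read_fn`, `msg_buf`, `fill_message`, `chunk_src` and `chunk_read` are all hypothetical, and `chunk_read` is only a stand-in for a socket that delivers data in short chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader: returns bytes read, 0 if nothing is available yet,
 * or a negative value on error. This is NOT libevent's signature. */
typedef long (*read_fn)(void *ctx, unsigned char *dst, size_t cap);

struct msg_buf {
    unsigned char data[64]; /* small fixed buffer, enough for the demo */
    size_t have;            /* bytes received so far */
    size_t want;            /* total bytes expected for this message */
};

/* Returns 1 when the message is complete, 0 when more data is needed,
 * and -1 on a read error or an inconsistent length. */
static int fill_message(struct msg_buf *m, read_fn rd, void *ctx)
{
    assert(m != NULL && rd != NULL);
    if (m->want > sizeof m->data)
        return -1;
    while (m->have < m->want) {
        size_t room = m->want - m->have;
        long n = rd(ctx, m->data + m->have, room);
        if (n < 0)
            return -1;          /* never add a negative count to an offset */
        if (n == 0)
            return 0;           /* partial message: wait for the next event */
        if ((size_t)n > room)
            return -1;          /* reader claimed more than we asked for */
        m->have += (size_t)n;
    }
    return 1;
}

/* Stand-in data source that hands out at most max_chunk bytes per call. */
struct chunk_src {
    const unsigned char *bytes;
    size_t len, pos, max_chunk;
    int fail;                   /* when nonzero, every read reports an error */
};

static long chunk_read(void *ctx, unsigned char *dst, size_t cap)
{
    struct chunk_src *s = ctx;
    if (s->fail)
        return -1;
    size_t n = s->len - s->pos;
    if (n > s->max_chunk) n = s->max_chunk;
    if (n > cap) n = cap;
    memcpy(dst, s->bytes + s->pos, n);
    s->pos += n;
    return (long)n;
}
```

If the piece handler runs as soon as any bytes arrive, rather than after `fill_message` reports a complete message, the tail of the buffer will still hold whatever was there before, which would be one possible explanation for non-zero garbage at the start of the file.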
Reading the source I found this in network.c:
I think the last two lines are intended to be:
BTW REQUEST_LENGTH = 16K.
More likely, this "length-thing" should be p->message_length, or (p->message_length - 9).
The other bug is probably a strlen()+1 type of bug.
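To make the suggestion concrete, here is a minimal sketch of how a received block could be placed in the mapped file. It assumes the layout implied by `&p->message[9]` in the question: the 4-byte length prefix has already been stripped, byte 0 is the message id (7 for a piece message), bytes 1-4 are the piece index, bytes 5-8 are the begin offset within the piece, and the block data follows. The names `store_block`, `read_be32` and `PIECE_HEADER_LEN` are hypothetical and do not come from the slug source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PIECE_MSG_ID = 7, PIECE_HEADER_LEN = 9 };

/* Decode a 32-bit big-endian (network byte order) integer. */
static uint32_t read_be32(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Copy the block carried by a piece message into file_map.
 * Returns the number of bytes copied, or 0 if the message is malformed
 * or the block would fall outside the file. */
static size_t store_block(unsigned char *file_map, size_t file_len,
                          size_t piece_len,
                          const unsigned char *msg, size_t msg_len)
{
    assert(file_map != NULL && msg != NULL);
    if (msg_len <= PIECE_HEADER_LEN || msg[0] != PIECE_MSG_ID)
        return 0;

    size_t index = read_be32(msg + 1);
    size_t begin = read_be32(msg + 5);
    /* The block length comes from the message, not from a constant. */
    size_t block_len = msg_len - PIECE_HEADER_LEN;

    if (piece_len == 0 || begin >= piece_len || index > file_len / piece_len)
        return 0;
    size_t dest = index * piece_len + begin; /* offset within the file */
    if (begin > file_len - index * piece_len || block_len > file_len - dest)
        return 0;

    memcpy(file_map + dest, msg + PIECE_HEADER_LEN, block_len);
    return block_len;
}
```

The two points the sketch tries to capture are that the destination offset is derived from the index and begin fields of the message, and that the number of bytes copied is the message length minus the 9-byte header, so a short final block is not over-copied.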