Parsing a pcap file with dpkt (Python)
I'm trying to parse a previously-captured trace for HTTP headers using the dpkt module:
import dpkt
import sys

f=file(sys.argv[1],"rb")
pcap=dpkt.pcap.Reader(f)

for ts, buf in pcap:
    eth=dpkt.ethernet.Ethernet(buf)
    ip=eth.data
    tcp=ip.data

    if tcp.dport==80 and len(tcp.data)>0:
        try:
            http=dpkt.http.Request(tcp.data)
            print http.uri
        except:
            print 'issue'
            continue

f.close()
While it parses most of the packets correctly, I'm receiving a NeedData("premature end of headers") exception on some of them. They appear to be valid packets in Wireshark, so I'm a bit confused as to why the exceptions are being thrown.
Some output:
/ec/fd/ls/GlinkPing.aspx?IG=4a06eefebcc1495f8f4de7cb41f0ce5c&CID=2265e1228f3451ff8011dcbe5e0cdff7&ID=API.YAds%2C5037.1&1307036510547
issue
issue #misses one packet here, two exceptions
/?ld=4vyO5h1FkjCNjBpThUTGnzF50sB7QUGL0Ok8YefDTWNmO6RXghgDqHXtcp1OqeXATbCAHliIkglLj95-VEwG6ZJN3fblgd3Lh5NvTp4mZPcBGXUyKqXn9FViBAsmt1T96oumpCL5gm7gZ3qlZqSdLNUWjpML_9I8FvB2TLKPSYcJmb_VwwvJhiHpiUIvrjRdzqdVVnuQZVjQmZIIlfaMq0LOmgew_plopjt7hYvOSzBi3VJl4bqOBVk3zdhIvgZK0SfJp3kEWTXAr2_UU_q9KHBpSTnvuhY2W1xo3K2BOHKGk1VAlMiWtWC_nUaJdZmhzzWfb6yRAmY3M9YkUzFGs9z10-70OszkkNpVMSS3-p7xsNXQnC3Zpaxks
Help is appreciated; perhaps an alternative library recommendation is needed.
Comments (3)
I've added an example to dpkt that parses and displays HTTP headers. The docs can be found here: http://dpkt.readthedocs.io/en/latest/print_http_requests.html and the example code can be found in dpkt/examples/print_http_requests.py
Example Output
I encountered the same problem while working with HTTP requests and dpkt.
The problem is that dpkt's HTTP header parser uses the wrong logic. This exception is raised when the HTTP data doesn't end with \r\n\r\n. (And, as you say, there are a lot of good packets with no \r\n\r\n at the end.) Here is the bug report for your problem.
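One way to work around segments that stop before \r\n\r\n is to buffer payloads per connection until the header terminator shows up, and only then parse. The following is a minimal sketch of that idea, not dpkt's own behaviour and not the fix from the bug report. The class name RequestBuffer and the flow tuple are hypothetical, and it assumes segments arrive in order with no retransmissions. Anything after the header terminator is simply discarded here.

```python
HEADER_END = b"\r\n\r\n"


class RequestBuffer:
    """Hypothetical helper: collects TCP payloads per flow until the HTTP headers are complete."""

    def __init__(self):
        # Maps a flow key (any hashable value) to the bytes seen so far.
        self._pending = {}

    def feed(self, flow, payload):
        """Add one payload; return the full header block, or None if more data is needed."""
        data = self._pending.get(flow, b"") + payload
        end = data.find(HEADER_END)
        if end == -1:
            # Headers are still incomplete: remember the data and wait.
            self._pending[flow] = data
            return None
        # Headers are complete: forget the flow and return them with the terminator.
        self._pending.pop(flow, None)
        return data[: end + len(HEADER_END)]


buffers = RequestBuffer()
flow = ("10.0.0.1", 1234, "10.0.0.2", 80)  # assumed key: (src, sport, dst, dport)
first = buffers.feed(flow, b"GET / HTTP/1.1\r\nHost: ex")
second = buffers.feed(flow, b"ample.com\r\n\r\n")
```

In the loop from the question, the key could be built from ip.src, tcp.sport, ip.dst and tcp.dport. Once feed returns a header block for a request without a body, it could be handed to dpkt.http.Request instead of the raw tcp.data.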
In your Python code, before the assignment ip=eth.data, check whether the Ethernet type is IP. If it is not, do nothing with that packet. Also check whether the IP protocol is TCP.
I modified your program code accordingly.
With regards,
Irengbam Tilokchan Singh
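A minimal sketch of the two checks described in this answer, written so that it works on any object with the same attributes as dpkt's parsed frames (type and data on the Ethernet frame, p and data on the IP packet). The function name tcp_from_frame is hypothetical. The two constants are written out as plain numbers and have the same values as dpkt.ethernet.ETH_TYPE_IP and dpkt.ip.IP_PROTO_TCP. The SimpleNamespace objects are stand-ins for parsed frames, used only to exercise the function.

```python
from types import SimpleNamespace

ETH_TYPE_IP = 0x0800  # EtherType for IPv4 (same value as dpkt.ethernet.ETH_TYPE_IP)
IP_PROTO_TCP = 6      # IP protocol number for TCP (same value as dpkt.ip.IP_PROTO_TCP)


def tcp_from_frame(eth):
    """Return the TCP segment carried by a parsed Ethernet frame, or None."""
    # Skip frames that do not carry IPv4 (ARP, IPv6, ...).
    if eth.type != ETH_TYPE_IP:
        return None
    ip = eth.data
    # Skip IP packets that do not carry TCP (UDP, ICMP, ...).
    if ip.p != IP_PROTO_TCP:
        return None
    return ip.data


# Stand-in frames with the same attribute layout as parsed dpkt objects.
tcp_frame = SimpleNamespace(type=0x0800, data=SimpleNamespace(p=6, data="tcp segment"))
arp_frame = SimpleNamespace(type=0x0806, data=b"")
udp_frame = SimpleNamespace(type=0x0800, data=SimpleNamespace(p=17, data="udp datagram"))
```

In the loop from the question, this would be called as tcp = tcp_from_frame(dpkt.ethernet.Ethernet(buf)), followed by continue when the result is None, before tcp.dport is ever touched.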