Reconstructing data from a PCAP sniff
I am trying to sniff HTTP data through libpcap and get the full HTTP content (headers + payload) after processing the TCP payload.
As per my discussion at Writing an http sniffer (or any other application level sniffer), I am running into problems due to fragmentation: I need to reconstruct the whole stream (or defragment it) to get a complete HTTP packet, and this is where I need some help.
Thanks in anticipation!
Comments (5)
It's really pretty simple. Take the Ethernet frames that you get from pcap and extract the IP packets from them, reassembling any that were fragmented. Then reorder the TCP segments carried in the IP packets according to their sequence numbers, taking care to discard any duplicate data. Then process the result as an HTTP stream. Of course, HTTP doesn't come in packets; it is an application-layer protocol, but I'm sure this will be obvious once you've done all the other work. As you do all this, verify the checksums of the IP headers and TCP segments to ensure that your data is correct. Also, if pcap happens to miss any packets, make sure you deal with that appropriately.
To help you along, the Linux TCP stack should provide a concise reference for this process as it occurs in the kernel.
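The reordering and de-duplication step described above can be sketched roughly as follows. This is only an illustration, not what the kernel does: the function name reassemble and its (seq, payload) input format are made up for the example, and it assumes you have already extracted the TCP payloads for one direction of one connection and know the sequence number of the first payload byte (the SYN's sequence number plus one). It also assumes the stream is shorter than 4 GiB.

```python
def reassemble(segments, initial_seq):
    """Rebuild one direction of a TCP stream (illustrative sketch).

    segments    -- iterable of (seq, payload) pairs, in any order,
                   possibly containing retransmissions and overlaps
    initial_seq -- sequence number of the first payload byte
    Returns (data, gap): the contiguous bytes recovered from the start
    of the stream, and True if a missing segment stopped reassembly.
    """
    mod = 1 << 32  # TCP sequence numbers wrap around at 2**32

    # Turn absolute sequence numbers into offsets from the stream start,
    # which also takes care of wraparound, then sort by offset.
    relative = sorted(
        ((seq - initial_seq) % mod, payload)
        for seq, payload in segments
        if payload
    )

    stream = bytearray()
    gap = False
    for offset, payload in relative:
        if offset > len(stream):
            # Bytes are missing before this segment (a lost or
            # uncaptured packet), so stop at the hole.
            gap = True
            break
        # Number of leading bytes of this segment we already have.
        overlap = len(stream) - offset
        if overlap < len(payload):
            stream += payload[overlap:]
        # Otherwise the segment is a pure duplicate and is skipped.
    return bytes(stream), gap
```

Called with out-of-order and retransmitted segments, it returns the bytes in stream order; when a packet was missed, the second return value tells you the data is only a prefix of the stream, which is the case the answer warns you to handle.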
Rather than reassembling the streams yourself, you can use tcptrace to reassemble the pcap file. I believe -e will do it. Once you have the application-layer data in one piece, you can apply simple HTTP header parsing, perhaps using a library such as http://github.com/ry/http-parser
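To give an idea of what "simple HTTP header parsing" on the reassembled bytes might look like, here is a rough sketch. It does not use http-parser; parse_http_message is a hypothetical helper written for this example, and it assumes an HTTP/1.x message whose body length is given by Content-Length (no chunked transfer encoding, no compression).

```python
def parse_http_message(stream):
    """Parse one HTTP/1.x message from the start of a byte stream.

    Returns (start_line, headers, body, rest), or None when the
    stream does not yet contain a complete message.
    Illustrative only: chunked bodies are not handled.
    """
    # Headers end at the first blank line.
    head, sep, remainder = stream.partition(b"\r\n\r\n")
    if not sep:
        return None  # header block is still incomplete

    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()

    # Assume the body length is announced by Content-Length.
    length = int(headers.get("content-length", "0"))
    if len(remainder) < length:
        return None  # body is still incomplete

    return start_line, headers, remainder[:length], remainder[length:]
```

The returned rest is whatever follows the message in the stream, so calling the function again on it walks through several requests or responses sent over one connection.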
To reconstruct the data contained in a pcap file, a wonderful tool is Xplico: http://www.xplico.org
The best tool to reconstruct HTTP content from pcap files is justniffer. It uses portions of the Linux kernel for IP fragmentation handling and TCP packet reordering.
PcapPlusPlus includes an example console program, TCPReassembly, which sniffs traffic and writes each stream to a separate text file. Among its many options, you can specify which stream to listen to.
The documentation also mentions a Linux app, tcpflow, with even more options.