How to capture HTTP packets using SharpPcap
I would like to capture all incoming HTTP packets on my machine. To do that, I'm using SharpPcap, which is a WinPcap wrapper.
SharpPcap works very well, but it captures TCP packets, which is too low-level for what I want. Does anyone know how I can easily get full HTTP requests/responses from all these TCP packets?
Thanks
Comments (3)
SharpPcap can already capture packets the same way Wireshark does (just in code rather than a GUI). You can either parse them directly or dump them to disk in the common .pcap file format.
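The steps to parse a capture are: pick an interface, open it (in promiscuous mode if needed), start capturing, and parse each raw packet as it arrives.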
If you're reading .pcap dump files, the process is almost the same, except that you call an offline capture reader, don't need to pick an interface, and don't need to set promiscuous mode. SharpPcap supports all of the standard capture filters used by Wireshark, tcpdump, and most other pcap frameworks. For a reference, see the tcpdump man page.
Currently there is no support for parsing HTTP directly but parsing TCP packets is really easy.
When you receive the raw packet (non parsed) do this:
The Packet.Net parser (a separate component that is included with SharpPcap) can pull out the TCP portion directly, even if the communication is encapsulated by VPN, PPPoE, or PPP.
Once you have the TCPPacket parsed, grab packet.PayloadBytes to get the payload as a byte array. For HTTP traffic it should contain the HTTP header as raw bytes, which can be converted to text (I'm not really sure whether HTTP headers use UTF-8 or ASCII encoding at that level). There are plenty of freely available tools/libraries for parsing HTTP headers.
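As a rough illustration of turning payload bytes into header text, here is a minimal Python sketch. It is language-neutral logic, not SharpPcap or Packet.Net code. It assumes the payload holds a complete HTTP/1.x header block terminated by a blank line; parse_http_head is a hypothetical helper name. HTTP/1.x header lines are ASCII-compatible, so the sketch decodes them as Latin-1, which never fails and leaves ASCII bytes unchanged.

```python
def parse_http_head(payload: bytes):
    """Split raw TCP payload bytes into a start line and a header dict.

    Returns None when the payload does not contain a complete header block.
    """
    head, sep, _body = payload.partition(b"\r\n\r\n")
    if not sep:
        return None  # header block is not complete in this payload
    # Latin-1 maps every byte to a character, so decoding cannot raise.
    lines = head.decode("latin-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers


sample = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
start_line, headers = parse_http_head(sample)
print(start_line)       # GET /index.html HTTP/1.1
print(headers["host"])  # example.com
```

A single packet only yields a usable result when the whole header block fits in it; otherwise the payloads have to be joined first.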
To extract the HTTP packet from TCP:
You need to collect the TCP packets of the connection as they come in, and if the data is split across several segments (for example, because it exceeds the usual 1500-byte Ethernet MTU), you need to reassemble the parts in memory. To work out which parts go in what order, you need to carefully track the sequence/acknowledgement numbers.
This is a non-trivial thing to accomplish with SharpPcap because you're working with a much lower part of the stack and re-assembling the connection manually.
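To make the sequence-number bookkeeping concrete, here is a small Python sketch of reassembling one direction of a connection. It is a simplification under stated assumptions, not how SharpPcap or Wireshark do it: the caller already knows the sequence number of the first payload byte (the SYN's sequence number plus one), all segments belong to the same connection and direction, and the stream is shorter than 4 GiB. The name reassemble is hypothetical.

```python
def reassemble(segments, initial_seq):
    """Rebuild one direction of a TCP stream.

    segments: iterable of (sequence_number, payload_bytes) in any order.
    initial_seq: sequence number of the first payload byte.
    Returns the contiguous bytes available from initial_seq onward.
    """
    modulus = 1 << 32  # TCP sequence numbers wrap around at 2**32
    pending = {}
    for seq, data in segments:
        if not data:
            continue  # pure ACKs carry no payload
        offset = (seq - initial_seq) % modulus
        # On retransmission, keep the longest payload seen at this offset.
        if len(data) > len(pending.get(offset, b"")):
            pending[offset] = data

    stream = bytearray()
    for offset in sorted(pending):
        if offset > len(stream):
            break  # gap: a segment has not arrived yet
        data = pending[offset]
        # Append only the bytes that extend the stream (handles overlap).
        stream += data[len(stream) - offset:]
    return bytes(stream)


segments = [
    (1017, b"Host: a\r\n\r\n"),       # arrives first, but belongs second
    (1001, b"GET / HTTP/1.1\r\n"),
    (1017, b"Host: a\r\n\r\n"),       # retransmission
]
print(reassemble(segments, 1001))
```

The sketch stops at the first gap, so it can be called again as more segments arrive; a real implementation would also track connections by address and port pairs and handle connection teardown.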
Wireshark has an interesting article on how to accomplish this in C.
As of right now, SharpPcap doesn't support TCP payload parsing.
If you're looking for easy-to-follow examples of how to use SharpPcap, download the source tree and look at the included example projects. There is also a SharpPcap tutorial on CodeProject.
If you have more questions and/or you want to make any feature requests to the project, feel free to post on the SourceForge project. It is far from dead and continues to be under active development.
Note: Chris Morgan is the project lead and I'm one of the developers for SharpPcap/Packet.Net.
Update: The tutorial project on CodeProject is now up to date and matches the current API.
Decoding a TCP stream into HTTP request/response pairs is non-trivial. Tools like Wireshark do this with considerable effort.
I wrote a Wireshark wrapper for Ruby (not that that will help you), but before I wrote it I tried using tshark (the command-line version of Wireshark). That didn't solve my problem, but it may work for you. Here's how:
Capture the packets and write them to a pcap file (SharpPcap probably has a way to do this). At some point, close the capture file and start another one, then run tshark on the old one with a filter for HTTP traffic and a flag requesting output in the PDML format. PDML is an XML format, easily parsed with the System.Xml tools, that contains the value of every HTTP field in a variety of formats. You can write C# code to spawn tshark and pipe its StdOut stream into an XML reader, so you get the packets out of tshark as they emerge. I don't recommend using the DOM parser, as the PDML output for a large capture file can get very large very quickly.
Unless your requirements are complex (as mine were), this may be all you need.
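The tshark approach can be sketched as follows, in Python rather than C# so it stays self-contained. The assumptions: tshark is on the PATH, a reasonably recent tshark where -Y selects a display filter (older releases used -R) and -T pdml requests PDML output, and the function names are hypothetical. The parser is a streaming one (iterparse), in line with the advice to avoid a DOM parser.

```python
import io
import subprocess
import xml.etree.ElementTree as ET


def iter_http_fields(pdml_stream):
    """Yield one {field name: show value} dict per packet with an HTTP layer."""
    for _event, elem in ET.iterparse(pdml_stream, events=("end",)):
        if elem.tag != "packet":
            continue
        for proto in elem.iter("proto"):
            if proto.get("name") == "http":
                yield {f.get("name"): f.get("show") for f in proto.iter("field")}
        elem.clear()  # drop the finished packet's subtree to limit memory use


def http_fields_from_capture(pcap_path):
    """Spawn tshark on a capture file and stream its PDML output."""
    proc = subprocess.Popen(
        ["tshark", "-r", pcap_path, "-Y", "http", "-T", "pdml"],
        stdout=subprocess.PIPE,
    )
    try:
        yield from iter_http_fields(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()


# Hand-written PDML fragment standing in for real tshark output.
sample = io.BytesIO(
    b'<pdml><packet>'
    b'<proto name="tcp"><field name="tcp.dstport" show="80"/></proto>'
    b'<proto name="http">'
    b'<field name="http.request.method" show="GET"/>'
    b'<field name="http.host" show="example.com"/>'
    b'</proto></packet></pdml>'
)
for fields in iter_http_fields(sample):
    print(fields["http.request.method"], fields["http.host"])
```

In C#, the equivalent would be a forward-only XML reader over the process's standard output, which is the same design choice: packets are handled one at a time instead of loading the whole document.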
I think you are close to the solution: if you have the TCP packets from the HTTP traffic, you only have to extract the TCP payload in order to rebuild the HTTP request/response. See this SO entry on a possible way to do it.