Writing an HTTP sniffer
I would like to write a program that extracts the URLs of the websites visited by a system (an IP address) through packet capture. I think the URL will appear in the data section of the packet, i.e. not in any of the headers (ethernet / IP / TCP-UDP). Such programs are sometimes referred to as HTTP sniffers, and I'm not supposed to use any existing tool. As a beginner, I've just gone through the basic sniffer program sniffex.c. Can anyone tell me in which direction I should proceed?
6 Answers
Note: In the info below, assume that GET also includes POST and the other HTTP methods too.
It's definitely going to be a lot more work than looking at one packet, but if you capture the entire stream you should be able to get it from the HTTP headers sent out.
Try looking at the Host header, if one is provided, and also at what is actually requested in the GET line. The request target in the GET can be either a full URL (as in proxy-style requests) or just a path on the server.
Also note that this has nothing to do with getting a domain name from an IP address. If you want the domain name, you have to dig into the data.
Quick example on my machine, from Wireshark:
Another example, not from a browser, and with only a path in the GET:
In the second example, the actual URL is http://example.com/ccnet/XmlStatusReport.aspx
No, there is not enough information. A single IP can correspond to any number of domain names, and each of those domains could have literally an infinite number of URLs.
However, look at gethostbyaddr(3) to see how to do a reverse DNS lookup on the IP to at least get the canonical name for that IP.
Update: as you've edited the question, @aehiilrs has a much better answer.
What you might want is a reverse DNS lookup. Call gethostbyaddr for that.
If you are using Linux, you can add an iptables rule that matches packets containing HTTP GET requests and hands them to a program that extracts the URL.
The rule would work like this:
For each packet going out on port 80 from localhost -> check whether the packet contains a GET request -> retrieve the URL and save it.
Note that this will not work for HTTPS: the request, including all of its headers, is encrypted, so the GET line never appears on the wire.
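A sketch of such rules using the iptables string match module (requires root; these rules only match or queue packets containing "GET " — actually extracting the URL still means parsing the payload, e.g. in a userspace program fed by NFQUEUE):

```shell
# Log outgoing port-80 packets whose payload contains "GET "
# (Boyer-Moore search via the string match module).
iptables -A OUTPUT -p tcp --dport 80 \
    -m string --string "GET " --algo bm \
    -j LOG --log-prefix "HTTP GET: "

# Alternatively, hand matching packets to a userspace program
# (e.g. one built on libnetfilter_queue) for URL extraction:
iptables -A OUTPUT -p tcp --dport 80 \
    -m string --string "GET " --algo bm \
    -j NFQUEUE --queue-num 0
```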
Have a look at PasTmon. http://pastmon.sourceforge.net
I was researching something similar and came across this.
If you are using Linux, justniffer could be a good starting point:
http://justniffer.sourceforge.net/
There is also a nice Python script for grabbing HTTP traffic that would help if you are looking to extract information from HTTP requests.