How to parse the payload of a TCP packet
I'm using pcap to capture TCP packets for which I would like to parse the payload. My strategy is as follows:
- Get the Ethernet header and check that it has type ETHERTYPE_IP (IP packet).
- Check that the IP packet has protocol IPPROTO_TCP (TCP packet).
- Check for payload size > 0 (size = ntohs(ip_header->total_length - ip->header_length*4 - sizeof(struct tcp_header))).
- Parse the payload (grab the host URL).
I haven't begun parsing the payload yet because I am getting discrepancies. Below is a printout of the payload of 10 captured TCP packets, using the filter "host = www.google.com".
packet number: 3 : TCP Packet: Source Port: 80 Dest Port: 58723
No Data in packet
packet number: 4 : TCP Packet: Source Port: 58723 Dest Port: 80
No Data in packet
packet number: 5 : TCP Packet: Source Port: 58723 Dest Port: 80 Payload :
GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,/;q=0.5
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Cookie: THICNT=25; SID=DQAAAKIAAAB2ktMrEftADifGm05WkZmlHQsiy1Z2v-
Connection: keep-alive
packet number: 6 : TCP Packet: Source Port: 80 Dest Port: 58723
No Data in packet
packet number: 7 : TCP Packet: Source Port: 80 Dest Port: 58723 Payload:
\272نu\243\255\375\375}\336H\221\227\206\312~\322\317N\236\255A\343#\226\370֤\245[\327`\306ըnE\263\204\313\356\3268 )p\344\301_Y\255\267\240\222x\364
packet number: 8 : TCP Packet: Source Port: 58723 Dest Port: 80
No Data in packet
packet number: 9 : TCP Packet: Source Port: 80 Dest Port: 58723 Payload:
HTTP/1.1 200 OK
Date: Mon, 29 Nov 2010 10:11:36 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: gws
Content-Length: 8806
X-XSS-Protection: 1; mode=block
\213
Why is there a discrepancy in the payloads and the ports? Ideally I would like to only parse packets like packet 5. How do I ignore packets like 7 and 9?
Comments (3)
Only by analyzing the content. Nothing in the IP or TCP header marks a packet as an "HTTP request". Even "first data packet in the connection" would not work, because persistent connections carry many requests.
Also, to be completely sure of catching all URIs, you need to reassemble the TCP stream and parse the HTTP request: a URI can be split across two or more packets.
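As a rough illustration of "analyzing the content", here is a minimal sketch in C. The function names looks_like_http_request and http_host are hypothetical and not part of any library. It assumes the whole request line and Host header sit in a single TCP segment and that the header is spelled exactly "Host:" (real header names are case-insensitive); a request split across segments still needs stream reassembly, as noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the payload starts with a common
 * HTTP/1.x request method followed by a space, 0 otherwise. */
int looks_like_http_request(const char *p, size_t len)
{
    static const char *const methods[] = {
        "GET ", "POST ", "HEAD ", "PUT ", "DELETE ", "OPTIONS "
    };
    assert(p != NULL);
    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        size_t n = strlen(methods[i]);
        if (len >= n && memcmp(p, methods[i], n) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: copies the value of the first "Host:" header into
 * out (NUL-terminated). Returns 1 on success, 0 if the header is missing,
 * is cut off at the end of the payload, or does not fit in out. */
int http_host(const char *p, size_t len, char *out, size_t outsz)
{
    static const char key[] = "\r\nHost:";
    const size_t klen = sizeof key - 1;

    assert(p != NULL && out != NULL);
    for (size_t i = 0; i + klen <= len; i++) {
        if (memcmp(p + i, key, klen) != 0)
            continue;
        size_t s = i + klen;
        while (s < len && (p[s] == ' ' || p[s] == '\t'))
            s++;                          /* skip whitespace after the colon */
        size_t e = s;
        while (e < len && p[e] != '\r' && p[e] != '\n')
            e++;                          /* find the end of the header line */
        if (e == len || e - s >= outsz)
            return 0;                     /* line continues in a later segment, or too long */
        memcpy(out, p + s, e - s);
        out[e - s] = '\0';
        return 1;
    }
    return 0;
}
```

With these, a capture callback could skip any payload for which looks_like_http_request returns 0, which would drop packets like 7 and 9 in the printout while keeping packet 5.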
Like the IP header, the TCP header is variable-length as well. You are not taking that into account. Rather than blindly subtracting sizeof(struct tcp_header) from the total packet size, you need to locate the TCP header within the IP data, then use its length field (which needs to be multiplied by 4, just like the IP header length field) to know where the actual data payload is located.
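The header-length arithmetic can be sketched as follows. This is only an illustration working on raw bytes: tcp_payload is a hypothetical helper, not a libpcap function, and it assumes an Ethernet II frame without a VLAN tag carrying IPv4. It uses the IP total length rather than the captured length, so Ethernet padding on short frames is not mistaken for data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a pointer to the TCP payload inside a raw
 * Ethernet II / IPv4 frame and stores its length in *len. Returns NULL if
 * the frame is not IPv4/TCP, is truncated, or has inconsistent lengths. */
const uint8_t *tcp_payload(const uint8_t *frame, size_t caplen, size_t *len)
{
    const size_t eth_len = 14;                        /* Ethernet II header size */

    assert(frame != NULL && len != NULL);
    if (caplen < eth_len + 20)
        return NULL;                                  /* no room for a minimal IPv4 header */
    if (frame[12] != 0x08 || frame[13] != 0x00)
        return NULL;                                  /* EtherType is not 0x0800 (IPv4) */

    const uint8_t *ip = frame + eth_len;
    if ((ip[0] >> 4) != 4)
        return NULL;                                  /* not IP version 4 */
    size_t ip_hlen  = (size_t)(ip[0] & 0x0F) * 4;     /* IHL counts 32-bit words */
    size_t ip_total = ((size_t)ip[2] << 8) | ip[3];   /* total length, big-endian */
    if (ip_hlen < 20 || ip[9] != 6)
        return NULL;                                  /* bad IHL, or protocol is not TCP (6) */
    if (ip_total < ip_hlen + 20 || caplen < eth_len + ip_total)
        return NULL;                                  /* no full TCP header, or truncated capture */

    const uint8_t *tcp = ip + ip_hlen;
    size_t tcp_hlen = (size_t)(tcp[12] >> 4) * 4;     /* data offset counts 32-bit words */
    if (tcp_hlen < 20 || ip_hlen + tcp_hlen > ip_total)
        return NULL;                                  /* inconsistent header lengths */

    *len = ip_total - ip_hlen - tcp_hlen;             /* 0 for pure ACKs, SYNs, etc. */
    return tcp + tcp_hlen;
}
```

In a pcap callback, frame and caplen would be the packet pointer and the caplen field of the pcap_pkthdr; a returned length of 0 corresponds to the "No Data in packet" lines in the question.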
Your size calculation is incorrect: you can't do the subtraction in network byte order. You have to convert each multi-byte field to host byte order first:
size = ntohs(ip_header->total_length) - ip->header_length*4 - sizeof(struct tcp_header);
However, as Remy Lebeau points out, you actually need to examine the offset field in the TCP header to know where the payload starts.
The difference between packet 5 and packet 7 is that the former is going from the client to the server, and the latter is a response from the server to the client. This is why the ports are switched around; the source and destination addresses will be switched as well.
If you want to only look at packets coming from the client, check that the source address is equal to the client's address.
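A direction check along these lines might look like the sketch below. The helpers from_client and tcp_dst_port are hypothetical names. The sketch assumes you already have pointers to the start of the IPv4 and TCP headers, and that you know the client's IPv4 address as four bytes in network order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the IPv4 source address (bytes 12..15
 * of the IP header) equals client_addr, given in network byte order. */
int from_client(const uint8_t *ip, const uint8_t client_addr[4])
{
    assert(ip != NULL && client_addr != NULL);
    return memcmp(ip + 12, client_addr, 4) == 0;
}

/* Hypothetical helper: reads the TCP destination port (bytes 2..3 of the
 * TCP header, big-endian) and returns it in host byte order. */
uint16_t tcp_dst_port(const uint8_t *tcp)
{
    assert(tcp != NULL);
    return (uint16_t)((tcp[2] << 8) | tcp[3]);
}
```

Checking tcp_dst_port(tcp) == 80 is an alternative when the client address is not known in advance; it relies on the server listening on port 80, as in the capture shown in the question.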