C++: How to get the hostname from an HTTP GET packet

Posted 2024-12-17 05:25:05

I want to open a TCP connection to a web server using information taken from a packet like the one shown below. To do this I need the hostname and port number from the packet, so that I can obtain an address to use with the connect function.

I am using C++.
Can I assume that the port used to request HTML pages from a server will be 80?
How do I get the hostname from the packet, assuming the packet is held in a char array? I currently extract the string bits.wikimedia.org and use it as the hostname. Is that correct?
Once I have the hostname, I assume I pass it to getaddrinfo, along with a structure that I fill in, to obtain a struct containing information that the connect function can use. Is this assumption correct?

GET http://bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&lang=en&modules=site&only=scripts&skin=vector&* HTTP/1.1
Host: bits.wikimedia.org
Proxy-Connection: close
User-Agent: Mozilla/5.0 (compatible; Konqueror/4.6; Linux) KHTML/4.6.5 (like Gecko) Fedora/4.6.5-7.fc15
Referer: http://en.wikipedia.org/wiki/Firewall_(computing)
Accept: */*
Accept-Encoding: x-gzip, x-deflate, gzip, deflate
Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5
Accept-Language: en-US,en;q=0.9

Comments (2)

初吻给了烟 2024-12-24 05:25:05

If you are given an offline HTTP packet as a string or char array, the URL in the request line is what you have to work from. HTTP URLs have the form http://hostname[:port]/resource..., where the port number is optional and defaults to 80, the HTTP port, when omitted. Parse the URL to extract the hostname and the port number (assume port 80 if none is given), then attempt a socket connection. DNS must be configured and reachable from your program so that the hostname can be resolved to its IP address; without it you will not be able to make the connection.
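
As a rough illustration of the URL parsing described here, the following is a minimal sketch and not a complete URL parser. The names HostPort and host_from_request_line are hypothetical helpers invented for this example. It assumes C++17, a lowercase http:// scheme, and a request target in absolute form as in the packet from the question; it does not handle IPv6 literals or user info in the URL.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper type for this sketch; not part of any library.
struct HostPort {
    std::string host;
    std::string port;  // kept as text so it can be handed to getaddrinfo()
};

// Extracts host and port from the request line of an HTTP request whose
// target is an absolute http:// URL. Returns std::nullopt otherwise.
std::optional<HostPort> host_from_request_line(std::string_view request) {
    // The request line ends at the first line break (tolerate a bare LF).
    std::string_view line = request.substr(0, request.find('\n'));
    if (!line.empty() && line.back() == '\r') line.remove_suffix(1);

    // Request line layout: METHOD SP request-target SP HTTP-version
    const auto first_space = line.find(' ');
    if (first_space == std::string_view::npos) return std::nullopt;
    const auto second_space = line.find(' ', first_space + 1);
    if (second_space == std::string_view::npos) return std::nullopt;
    std::string_view target =
        line.substr(first_space + 1, second_space - first_space - 1);

    // Simplification: only a lowercase "http://" prefix is accepted.
    constexpr std::string_view scheme = "http://";
    if (target.substr(0, scheme.size()) != scheme) return std::nullopt;
    target.remove_prefix(scheme.size());

    // The authority part (host[:port]) ends at the first '/', '?' or '#'.
    const std::string_view authority =
        target.substr(0, target.find_first_of("/?#"));

    HostPort result{std::string(authority), "80"};  // 80 is the HTTP default
    const auto colon = authority.rfind(':');
    if (colon != std::string_view::npos) {
        result.host = std::string(authority.substr(0, colon));
        const std::string_view port = authority.substr(colon + 1);
        if (!port.empty()) result.port = std::string(port);
    }
    if (result.host.empty()) return std::nullopt;
    return result;
}
```

For the request in the question this should yield the host bits.wikimedia.org with the default port 80. A request line with a relative target such as GET /index.html HTTP/1.1 gives no result, in which case the Host: header is the place to look.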

ま柒月 2024-12-24 05:25:05

You should be able to rely on the Host: header field to have the host name in it.

Look at the link to see how this is formatted. Read the headers line by line, find the "Host:" line, and extract the string that follows it, along with the port number if one is given (host:port).
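
The line-by-line scan described above could look roughly like the sketch below. The names HostHeader, iequals, trim and host_from_headers are hypothetical and exist only for this example. It assumes C++17 and that the whole header section is already in memory; header names are compared case-insensitively, and IPv6 literals in the Host value are not handled.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper type for this sketch; not part of any library.
struct HostHeader {
    std::string host;
    std::string port;  // "80" when the header carries no port
};

// Case-insensitive ASCII comparison, since header names are case-insensitive.
bool iequals(std::string_view a, std::string_view b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Strips leading and trailing spaces, tabs and carriage returns.
std::string_view trim(std::string_view s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string_view::npos) return {};
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Scans the header lines after the request line for a Host field.
std::optional<HostHeader> host_from_headers(std::string_view request) {
    auto pos = request.find('\n');  // end of the request line
    while (pos != std::string_view::npos) {
        const auto start = pos + 1;
        const auto end = request.find('\n', start);
        const std::string_view line = request.substr(
            start, end == std::string_view::npos ? end : end - start);
        if (trim(line).empty()) break;  // blank line ends the header section

        const auto colon = line.find(':');
        if (colon != std::string_view::npos &&
            iequals(trim(line.substr(0, colon)), "host")) {
            const std::string_view value = trim(line.substr(colon + 1));
            if (value.empty()) return std::nullopt;

            HostHeader result{std::string(value), "80"};
            const auto port_sep = value.rfind(':');
            if (port_sep != std::string_view::npos) {
                result.host = std::string(value.substr(0, port_sep));
                const std::string_view port = value.substr(port_sep + 1);
                if (!port.empty()) result.port = std::string(port);
            }
            return result;
        }
        pos = end;
    }
    return std::nullopt;
}
```

The port is kept as a string because getaddrinfo() takes the service as text, so no numeric conversion is needed before resolving.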

Yes, getaddrinfo() can be used to obtain one or more IP addresses for the host name.
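
To make the getaddrinfo() step concrete, here is a minimal sketch assuming a POSIX system (Linux, BSD, macOS); on Windows the Winsock equivalents would be needed. The function name connect_to_host is a hypothetical helper for this example. Note that the structure you fill in (the hints) only filters the lookup; the addresses to pass to connect() come back in a separate result list, which may hold several entries.

```cpp
#include <netdb.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: resolves host and port, then returns a connected TCP
// socket descriptor, or -1 if resolution or every connection attempt fails.
int connect_to_host(const std::string& host, const std::string& port) {
    addrinfo hints{};                 // zero-initialised filter for the lookup
    hints.ai_family = AF_UNSPEC;      // accept IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* results = nullptr;
    if (::getaddrinfo(host.c_str(), port.c_str(), &hints, &results) != 0) {
        return -1;
    }

    // A host name can resolve to several addresses; try each in turn.
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = ::socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1) continue;
        if (::connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;  // connected
        ::close(fd);
        fd = -1;
    }
    ::freeaddrinfo(results);  // the result list is owned by the caller
    return fd;
}
```

A call such as connect_to_host("bits.wikimedia.org", "80") would then give a descriptor to write the request to, and the caller is responsible for closing it.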
