HTTP 1.1 request line
I built a proxy server and it works great, but there are some sites it cannot handle.
I tried to reduce the problem to its core, and this is what I came up with.
My test case is http://bits.wikimedia.org/en.wikipedia.org/load.php,
which is one of the HTTP requests made when loading any Wikipedia page.
So I tried to build a request for it and send it via a socket like this:
String request1 =
"GET http://bits.wikimedia.org/en.wikipedia.org/load.php HTTP/1.1" +
"\r\n" +
"Host: bits.wikimedia.org" + "\r\n" +
"User-Agent: MyHttpProxy/example.java (http://stackoverflow.com/q/5924490/319266)" +
"\r\n" + "\r\n";
However, I got a 404 response code, which was strange because this page does exist!
After many attempts I came up with a new request that differs only in the request line:
String request2 =
"GET /en.wikipedia.org/load.php HTTP/1.1" +
"\r\n" +
"Host: bits.wikimedia.org" +
"\r\n" +
"User-Agent: MyHttpProxy/example.java (http://stackoverflow.com/q/5924490/319266)" +
"\r\n" + "\r\n";
and it worked! A 200 came back with
some unimportant content ("/* No modules requested. Max made me put this here */").
Can anyone tell me what the problem is here?
I looked at the RFC and couldn't make sense of this...
Here is the source code for running this test and printing the results:
Comments (1)
You would provide the full URL in the request line only if you're going through a proxy server. Direct requests to a web server need to follow the form of request2 in your example.
Looking at the source, you send requests to port 80, which almost certainly means they're not going through a proxy. My guess is that you need to send request1 to port 8080, or whatever port your proxy is listening on.
As for the RFC, take a look at section 5.1.2 of RFC 2616 (Request-URI). Note that the absolute URI form is used with proxies, and the absolute path form (abs_path) with origin servers.
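To make the distinction concrete, here is a minimal Java sketch of how a proxy might turn the request line it receives (the request1 form) into the form an origin server expects (the request2 form) before forwarding. The class and method names (RequestLineRewriter, toOriginForm) are hypothetical and not taken from the asker's code, and the sketch assumes a well-formed three-part request line with a plain http:// target.

```java
import java.net.URI;

public class RequestLineRewriter {

    /**
     * Rewrites "METHOD http://host/path?query VERSION" to "METHOD /path?query VERSION".
     * Request lines whose target is not an http:// URI are returned unchanged.
     * Hypothetical helper for illustration only.
     */
    public static String toOriginForm(String requestLine) {
        // A request line has three space-separated parts: method, target, version.
        String[] parts = requestLine.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + requestLine);
        }
        String target = parts[1];

        // Only absolute http:// targets are rewritten (scheme compared case-insensitively).
        if (!target.regionMatches(true, 0, "http://", 0, 7)) {
            return requestLine;
        }

        URI uri = URI.create(target);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // "http://host" with no path maps to "/"
        }
        if (uri.getRawQuery() != null) {
            path += "?" + uri.getRawQuery();
        }
        return parts[0] + " " + path + " " + parts[2];
    }

    public static void main(String[] args) {
        System.out.println(toOriginForm(
            "GET http://bits.wikimedia.org/en.wikipedia.org/load.php HTTP/1.1"));
    }
}
```

In a real proxy the host (and port, if any) taken from the same URI would also decide which server to connect to and what to put in the Host header; that part is left out here to keep the sketch short.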