java.io.IOException: Server returned HTTP response code: 403 for URL

Posted 2024-10-14 09:41:48

My code goes like this:

URL url;
URLConnection uc;
StringBuilder parsedContentFromUrl = new StringBuilder();
String urlString="http://www.example.com/content/w2e4dhy3kxya1v0d/";
System.out.println("Getting content for URL : " + urlString);
url = new URL(urlString);
uc = url.openConnection();
uc.connect();
uc.getInputStream();
BufferedInputStream in = new BufferedInputStream(uc.getInputStream());
int ch;
while ((ch = in.read()) != -1) {
    parsedContentFromUrl.append((char) ch);
}
System.out.println(parsedContentFromUrl);

However, when I access the URL through a browser there is no problem, but when I access it through a Java program, it throws an exception:

java.io.IOException: Server returned HTTP response code: 403 for URL

What is the solution?

Comments (3)

聆听风音 2024-10-21 09:41:48

Add the code below between uc.connect(); and uc.getInputStream();. It opens a fresh connection, because request properties have to be set before a connection is made:

uc = url.openConnection();
uc.addRequestProperty("User-Agent",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");

However, it is a good idea for a site to allow only certain types of user agents. This can help keep the website safe and its bandwidth usage low.

Depending on whether you want to stop people leeching your content and bandwidth, there are some bad 'User Agents' you might want to block from your server. But a user agent can be spoofed, as you can see in my example above.
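For reference, here is a minimal sketch of the whole request with the User-Agent set before anything connects. The class name UserAgentFetch, the helper names, the 10-second timeouts and the UTF-8 charset are my own assumptions, not something from the question; the User-Agent string is just the example value from above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class UserAgentFetch {
    // Example value only; any string the server accepts would do.
    static final String USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";

    /** Opens (but does not yet connect) a connection with the User-Agent header set. */
    static URLConnection open(String urlString, String userAgent) {
        try {
            URLConnection uc = new URL(urlString).openConnection();
            // Request properties must be set before connect() or getInputStream().
            uc.setRequestProperty("User-Agent", userAgent);
            uc.setConnectTimeout(10_000); // assumed timeout, in milliseconds
            uc.setReadTimeout(10_000);    // assumed timeout, in milliseconds
            return uc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole response body as text; UTF-8 is an assumption about the server. */
    static String fetch(String urlString) throws IOException {
        URLConnection uc = open(urlString, USER_AGENT);
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    public static void main(String[] args) {
        String urlString = "http://www.example.com/content/w2e4dhy3kxya1v0d/";
        try {
            System.out.println(fetch(urlString));
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

Reading through an InputStreamReader with an explicit charset also avoids the (char) cast on raw bytes in the question's loop, which garbles any multi-byte characters.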

葬花如无物 2024-10-21 09:41:48

403 means forbidden. From the HTTP/1.1 specification (RFC 2616):

10.4.4 403 Forbidden

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.

You need to contact the owner of the site to make sure the permissions are set properly.
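Since the spec says the server should describe the reason for the refusal in the entity, it can be worth reading the error body before contacting anyone. A rough sketch, assuming an HTTP URL and a UTF-8 body; the class and method names (ErrorBodyDump, readAll, describe) are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ErrorBodyDump {
    /** Reads a stream fully as UTF-8 text (the charset is an assumption); null yields "". */
    static String readAll(InputStream in) {
        if (in == null) {
            return "";
        }
        try (InputStream stream = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the status code, reason phrase and body, including for 4xx/5xx responses. */
    static String describe(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
        int status = conn.getResponseCode();
        // For error statuses the body is on the error stream; getInputStream() would throw.
        InputStream body = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        return "HTTP " + status + " " + conn.getResponseMessage() + "\n" + readAll(body);
    }

    public static void main(String[] args) {
        try {
            System.out.println(describe("http://www.example.com/content/w2e4dhy3kxya1v0d/"));
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

Calling getResponseCode() instead of going straight to getInputStream() is what lets you see the 403 and its body rather than only the IOException.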

EDIT: I see your problem. I ran the URL through Fiddler and noticed that I am getting a 407, which is described below. This should help you go in the right direction.

10.4.8 407 Proxy Authentication Required

This code is similar to 401 (Unauthorized), but indicates that the client must first authenticate itself with the proxy. The proxy MUST return a Proxy-Authenticate header field (section 14.33) containing a challenge applicable to the proxy for the requested resource. The client MAY repeat the request with a suitable Proxy-Authorization header field (section 14.34). HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication".

Also see this relevant question.
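If a proxy really is answering with 407, one way to handle it from Java is to send the request through that proxy and register an Authenticator that supplies the proxy credentials. This is only a sketch: the host proxy.example.com, port 8080 and the user name and password are placeholders I made up, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Authenticator;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

class ProxyAuthSetup {
    /** Describes an HTTP proxy at the given host and port. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    /** Registers credentials that are handed out only when a proxy asks for them. */
    static void installProxyCredentials(final String user, final char[] password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password);
                }
                return null; // do not answer challenges from origin servers
            }
        });
    }

    /** Opens (but does not yet connect) a connection routed through the proxy. */
    static URLConnection open(String urlString, Proxy proxy) {
        try {
            return new URL(urlString).openConnection(proxy);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder credentials and proxy address; replace with real values.
        installProxyCredentials("proxyUser", "proxyPassword".toCharArray());
        Proxy proxy = httpProxy("proxy.example.com", 8080);
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    open("http://www.example.com/content/w2e4dhy3kxya1v0d/", proxy);
            conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
            conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
            System.out.println("Response code: " + conn.getResponseCode());
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

Authenticator.setDefault is JVM-wide, so the check on getRequestorType() keeps the proxy password from being offered to ordinary web servers.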

念﹏祤嫣 2024-10-21 09:41:48

If the browser can access the page and your code cannot, then something differs between the browser's request and yours. You can inspect the browser request with a tool such as Firebug to see what the differences are. Some things I can think of:

  • The site sets a cookie (maybe during login). You may be able to handle this in code, but you will have to explicitly add support for passing the cookie. This is the most likely cause.

  • The site filters based on user agents. You can set the user agent. This is not as likely.
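For the cookie case, a minimal sketch of what "adding support for passing the cookie" could look like with the JDK's built-in CookieManager. The class and method names are hypothetical, and I am assuming the cookie is set by an ordinary Set-Cookie response header on the same host:

```java
import java.io.IOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URL;

class CookieAwareFetch {
    /** Installs a JVM-wide cookie store so HTTP connections keep and resend cookies. */
    static CookieManager installCookieManager() {
        // null store means the default in-memory cookie store is used.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Sends a GET request and returns the HTTP status code. */
    static int status(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        CookieManager manager = installCookieManager();
        String urlString = "http://www.example.com/content/w2e4dhy3kxya1v0d/";
        try {
            // First request: any Set-Cookie headers in the response are stored.
            System.out.println("First response: " + status(urlString));
            System.out.println("Stored cookies: " + manager.getCookieStore().getCookies());
            // Second request: stored cookies for this host are sent back automatically.
            System.out.println("Second response: " + status(urlString));
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

If the cookie only exists after a browser login, another option is to copy its value from the browser and send it yourself with uc.setRequestProperty("Cookie", ...) before connecting.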
