java.io.IOException: Server returned HTTP response code: 403 for URL
My code goes like this:
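import java.io.BufferedInputStream;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

public class FetchContent {
    public static void main(String[] args) throws IOException {
        StringBuilder parsedContentFromUrl = new StringBuilder();
        String urlString = "http://www.example.com/content/w2e4dhy3kxya1v0d/";
        System.out.println("Getting content for URL : " + urlString);
        URL url = new URL(urlString);
        URLConnection uc = url.openConnection();
        uc.connect();
        // Read the response one byte at a time; try-with-resources closes the stream.
        try (BufferedInputStream in = new BufferedInputStream(uc.getInputStream())) {
            int ch;
            while ((ch = in.read()) != -1) {
                parsedContentFromUrl.append((char) ch);
            }
        }
        System.out.println(parsedContentFromUrl);
    }
}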
However, when I access the URL through a browser there is no problem, but when I access it from a Java program, it throws an exception:
java.io.IOException: Server returned HTTP response code: 403 for URL
What is the solution?
Comments (3)
Set a browser-like User-Agent request header on the connection before you call uc.getInputStream();. Request properties can only be set while the connection is not yet open, so in practice the call goes before uc.connect();.
From the site owner's side, however, it is a good idea to allow only certain types of user agents. This keeps the website safe and bandwidth usage low.
Depending on whether you want to stop people leeching your content and bandwidth, there are some bad user agents you might want to block from your server. But a user agent can be spoofed, as setting the header this way shows.
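The snippet that this answer refers to did not survive on this page, so the following is only a sketch of the idea, not the answerer's original code. It assumes a made-up user agent string and hypothetical names (UserAgentExample, openWithUserAgent, fetch), and it assumes the response body is UTF-8 text. The header is set right after openConnection(), because URLConnection rejects request properties once the connection is open.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class UserAgentExample {
    // Assumed value for illustration; use whatever the target site accepts.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)";

    // Opens a connection object and sets the header. No network traffic yet.
    static URLConnection openWithUserAgent(URL url, String userAgent) throws IOException {
        URLConnection uc = url.openConnection();
        // Must happen before connect()/getInputStream(); afterwards
        // setRequestProperty throws IllegalStateException.
        uc.setRequestProperty("User-Agent", userAgent);
        return uc;
    }

    // Fetches the body as text, assuming UTF-8 (an assumption, not from the post).
    static String fetch(URL url) throws IOException {
        URLConnection uc = openWithUserAgent(url, USER_AGENT);
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java UserAgentExample <url>");
            return;
        }
        System.out.println(fetch(new URL(args[0])));
    }
}
```

Whether this gets past the 403 depends entirely on how the site filters requests; a server that checks cookies or other headers will still refuse.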
403 means Forbidden: the server understood the request but refuses to fulfil it.
You need to contact the owner of the site to make sure the permissions are set properly.
EDIT: I see your problem. I ran the URL through Fiddler and noticed that I am getting a 407, which means Proxy Authentication Required: the client must first authenticate itself with the proxy. This should help you go in the right direction.
Also see this relevant question.
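To see which status the server (or a proxy in between) really returns, it can help to read the code directly instead of waiting for getInputStream() to throw. The sketch below is one way to do that with HttpURLConnection; the class and method names (StatusCheckExample, describe, statusOf) are hypothetical, and the descriptions are short paraphrases rather than quotes from the HTTP specification.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class StatusCheckExample {
    // Maps the two status codes discussed in this answer to a short description.
    static String describe(int code) {
        switch (code) {
            case HttpURLConnection.HTTP_FORBIDDEN:   // 403
                return "403 Forbidden: the server refuses to fulfil the request";
            case HttpURLConnection.HTTP_PROXY_AUTH:  // 407
                return "407 Proxy Authentication Required: authenticate with the proxy first";
            default:
                return code + " (not covered by this sketch)";
        }
    }

    // Sends the request and returns the status code; getResponseCode()
    // reports 4xx/5xx codes as a return value instead of throwing.
    static int statusOf(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java StatusCheckExample <url>");
            return;
        }
        System.out.println(describe(statusOf(new URL(args[0]))));
    }
}
```

When the code is 4xx or 5xx, HttpURLConnection.getErrorStream() gives access to the response body, which sometimes explains why the request was refused.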
If the browser can access the page and your code cannot, then there is something different between the browser's request and yours. You can inspect the browser's request with a tool such as Firebug to see what the differences are. Some things I can think of are:
- The site sets a cookie (maybe during login). You may be able to handle this in code, but you will have to explicitly add support for passing the cookie. This is the most likely cause.
- The site filters based on user agents. You can set the user agent. This is not as likely.
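For the cookie case, a minimal sketch of passing cookies explicitly might look like the following. It assumes you have already copied the cookie names and values from the browser's request; JSESSIONID and its value are placeholders, and the names CookieExample, cookieHeader and openWithCookies are hypothetical.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CookieExample {
    // Joins name/value pairs into the value of a Cookie request header,
    // for example "a=1; b=2".
    static String cookieHeader(Map<String, String> cookies) {
        StringJoiner joined = new StringJoiner("; ");
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            joined.add(e.getKey() + "=" + e.getValue());
        }
        return joined.toString();
    }

    // Opens a connection object with the Cookie header set. The header
    // has to be set before the connection is actually opened.
    static URLConnection openWithCookies(URL url, Map<String, String> cookies)
            throws IOException {
        URLConnection uc = url.openConnection();
        uc.setRequestProperty("Cookie", cookieHeader(cookies));
        return uc;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("JSESSIONID", "abc123"); // placeholder value
        System.out.println("Cookie: " + cookieHeader(cookies));
        if (args.length > 0) {
            URLConnection uc = openWithCookies(new URL(args[0]), cookies);
            // getContentType() sends the request and reads the response headers.
            System.out.println("Content-Type: " + uc.getContentType());
        }
    }
}
```

If the cookie is issued during a login step performed by the same program, an alternative is to install a process-wide handler with CookieHandler.setDefault(new CookieManager()) from java.net, so that cookies set by earlier responses are sent back on later requests without building the header by hand.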