HtmlUnit in Java returns errors when the browser works fine
I'm trying to log into a site using HtmlUnit, but whenever I submit my login details I get a long chain of errors. I broke my code into small pieces, so I can see that the errors occur after the submit button is clicked but before anything else happens; it takes a while because the site is pretty slow. Unfortunately, because this happens after login, I can't show you what the page is meant to look like. I can say that a successful login involves a few redirects, and because I'm getting a "page not found" error I assume one of those redirects is causing the trouble. I have had redirect trouble with Chrome on this site before, although not on this particular page, and both Chrome and IE8 now load it fine.
Sparing you the full stack trace, here is what appears to be the most important part:
SEVERE: Error loading JavaScript from [http://servicedeskmt.det.nsw.edu.au:8090/kinetic/displayPage.jsp/../resources/js/jquery/jquery-1.3.2.js].
com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 /kinetic/resources/js/jquery/jquery-1.3.2.js for http://servicedeskmt.det.nsw.edu.au:8090/kinetic/resources/js/jquery/jquery-1.3.2.js
at com.gargoylesoftware.htmlunit.WebClient.throwFailingHttpStatusCodeExceptionIfNecessary(WebClient.java:535)
INFO: statusCode=[404] contentType=[text/html]
Oct 31, 2011 2:31:29 PM com.gargoylesoftware.htmlunit.WebClient printContentIfNecessary
INFO: <html>
<head>
<title>Page cannot be found</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div align="center">
<p> </p>
<p> </p>
<p><b><font face="Verdana, Arial, Helvetica, sans-serif" size="2">There was
an error on the page you were attempting to reach or the page could not be
found.</font></b> <br>
</p>
<p><br>
<br>
<a href="http://www.kineticdata.com"><img src="resources/poweredByKS.gif" width="131" height="45" border="0"></a>
</p>
</div>
</body>
</html>
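One detail in the log above is worth checking: the SEVERE line and the exception show two different-looking URLs. A minimal sketch, using only java.net.URI from the standard library, suggests they are the same address once the ".." segment is resolved, so the 404 appears to be the server's answer for that path rather than the result of a mangled URL. The class and method names here are made up for illustration.

```java
import java.net.URI;

public final class ScriptUrlCheck {

    /** Removes "." and ".." path segments, as a URL resolver would. */
    static String normalize(String url) {
        return URI.create(url).normalize().toString();
    }

    public static void main(String[] args) {
        // URL exactly as it appears in the SEVERE log line.
        String fromLog = "http://servicedeskmt.det.nsw.edu.au:8090/kinetic/"
                + "displayPage.jsp/../resources/js/jquery/jquery-1.3.2.js";
        System.out.println(normalize(fromLog));
    }
}
```

Running it prints the URL reported in the FailingHttpStatusCodeException, with the "displayPage.jsp/.." part collapsed.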
Any advice would be greatly appreciated.
Thanks.
EDIT: Adding some more detail
The error occurs on loginButton.click(), whether or not I assign the result to a new page. A line that is just loginButton.click() causes a long pause (as I said, the page takes a while to load) and then throws the error. If I catch the exception and then try to load the logged-in version of the page, the same thing happens, which tells me that my login attempt succeeds but loading the logged-in page causes the problem.
Storing the credentials in the DefaultCredentialsProvider and then going straight to the logged-in page gives the same error. I think I can safely say it's the page, not the login.
Loading the page and, in the same statement, running JavaScript or clicking a button has the same effect. I was hoping that even if some other part of the page wasn't loading properly I could still trigger the part I wanted, but no luck.
Comments (4)
I ran into a similar issue and got it working with the code below:
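The code from this comment is not shown here. As a hedged sketch of settings commonly used for this symptom (not necessarily what the commenter posted), the following asks HtmlUnit not to throw when a resource such as the jQuery file returns 404 or when a script fails. It assumes HtmlUnit 2.11 or later, where these setters live on WebClientOptions; in older releases, such as the one the question appears to use, the equivalent setters were on WebClient itself as far as I know. The class name LenientClientFactory is hypothetical.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebClientOptions;

public final class LenientClientFactory {

    /** Creates a WebClient that does not throw on failing status codes or script errors. */
    static WebClient create() {
        WebClient webClient = new WebClient();
        WebClientOptions options = webClient.getOptions();
        // Do not throw FailingHttpStatusCodeException on 4xx/5xx responses.
        options.setThrowExceptionOnFailingStatusCode(false);
        // Do not abort the page load when a script raises an error.
        options.setThrowExceptionOnScriptError(false);
        return webClient;
    }
}
```

With a client configured this way, loginButton.click() should return the page even if jquery-1.3.2.js cannot be fetched, although any page behavior that depends on that script would still be missing.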
Use
I don't care that it's an old thread :P
If your logs mention "printContentIfNecessary", you can turn that output off by setting this value:
Works well for me.
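The value itself is missing from this comment. One option that matches the description is setPrintContentOnFailingStatusCode on WebClientOptions, which controls whether the body of a failing response (the "Page cannot be found" HTML in the question) is written to the log. This is a sketch under that assumption, for HtmlUnit 2.11 or later; the class name QuietFailures is hypothetical.

```java
import com.gargoylesoftware.htmlunit.WebClient;

public final class QuietFailures {

    /** Stops HtmlUnit from logging the body of responses with a failing status code. */
    static void apply(WebClient webClient) {
        webClient.getOptions().setPrintContentOnFailingStatusCode(false);
    }
}
```

This only suppresses the logged response body; it does not change whether an exception is thrown for the failing request.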
Try one or all of them, depending on what you don't want printed.
See the docs: https://htmlunit.sourceforge.io/apidocs/com/gargoylesoftware/htmlunit/WebClientOptions.html
In my case, I got a lot of errors in the console because the WebClient could not load CSS; the setting below worked for me:
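The setting is missing from this comment as well. Assuming it refers to the CSS switch documented on the WebClientOptions page linked above, a minimal sketch looks like this; the class name NoCssClient is hypothetical.

```java
import com.gargoylesoftware.htmlunit.WebClient;

public final class NoCssClient {

    /** Disables CSS support so stylesheet failures no longer show up in the console. */
    static void apply(WebClient webClient) {
        webClient.getOptions().setCssEnabled(false);
    }
}
```

Disabling CSS is usually harmless for scraping or login automation, but it can matter if the page's scripts depend on computed styles, for example to decide whether an element is visible.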