HttpClient does not return the complete HTML source
I need to log in to a website and click through a few links to reach a final screen where I can download some data. Here are the steps:
- Step 1: log in to the site on the first page;
- Step 2: click a 'view' link on the first page to get to the second page;
- Step 3: on the second page, enter an 'account number' and click the submit button to display many lines of data. I call this the third page.
(I have the direct URL to the third page; when I paste it into the browser's address bar, the third page is displayed correctly.)
Here is my problem:
I am using HttpClient. It gets past the login page and can reach the third page, but it only returns the static part of the page; the part that is generated dynamically from the 'account number' input is not returned.
Here is the code:
HttpClient client = new HttpClient();
// setHost expects a host name, not a full URL
client.getHostConfiguration().setHost(loginUrl);
// Step 1: prepare the login parameters
PostMethod postMethod = new PostMethod(serverUrl);
NameValuePair[] data = {
    new NameValuePair("passUID", account),
    new NameValuePair("passUCD", password)
};
postMethod.setRequestBody(data);
// Send the login request; the same client keeps any session cookies for later requests.
// A 302 response to a POST is not followed automatically, so check the status code.
int loginStatus = client.executeMethod(postMethod);
// I can print out the html code of the login page here
postMethod.releaseConnection();
// Step 3: request the third page with URL: serverUrl4
postMethod = new PostMethod(serverUrl4);
NameValuePair[] data2 = {
    new NameValuePair("passUID", account),
    new NameValuePair("passUCD", ""),
    new NameValuePair("page", "view"),
    new NameValuePair("procacct", "0"),
    new NameValuePair("AcctNo", "xxxxxxxxx")
};
postMethod.setRequestBody(data2);
int status = client.executeMethod(postMethod);
byte[] responseBody = postMethod.getResponseBody();
postMethod.releaseConnection();
If I paste the URL, with the name/value pairs above as query parameters, into the browser, the account data is displayed correctly. But the response body does not contain the dynamically generated account data; everything is returned except the 'account data' section.
Does anybody know why? Any help is highly appreciated.
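One difference worth checking: pasting a URL into the address bar sends a GET request with the parameters in the query string, while the code above sends them as a POST body. The following is a minimal sketch of building the equivalent GET URL with plain JDK classes. The class name QueryUrlSketch and the base URL http://example.com/accounts are placeholders, not the real site, and whether the server treats GET and POST differently is an assumption to verify.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class QueryUrlSketch {
    // Appends URL-encoded name=value pairs to a base URL, as a browser does for a GET.
    static String withQuery(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        // Use '&' if the base URL already carries a query string, '?' otherwise.
        return baseUrl + (baseUrl.contains("?") ? "&" : "?") + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("page", "view");
        params.put("procacct", "0");
        params.put("AcctNo", "xxxxxxxxx");
        // Placeholder base URL; substitute the real third-page URL.
        System.out.println(withQuery("http://example.com/accounts", params));
    }
}
```

With Commons HttpClient 3.x, the same effect should be available through GetMethod and its setQueryString(NameValuePair[]) method, reusing the data2 array from the question.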
Comments (2)
Does the page in question use JavaScript to generate this data? If so, HTTPClient isn't going to be enough to get what you want.
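A quick way to test this hypothesis is to inspect the raw HTML that HttpClient returned: if the expected data is absent and the page references scripts, the data is probably loaded by JavaScript after the page arrives. Below is a rough sketch using only the JDK. The class name ScriptCheckSketch, the sample HTML, and the marker text are made up for illustration, and the regular expression is a heuristic, not a full HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScriptCheckSketch {
    // Heuristic: matches <script ... src="..."> or <script ... src='...'>.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the src attribute of every external script referenced by the page.
    static List<String> scriptSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = SCRIPT_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(1));
        }
        return sources;
    }

    // True if the text you see in the browser is present in the raw response.
    static boolean containsMarker(String html, String marker) {
        return html.contains(marker);
    }

    public static void main(String[] args) {
        // Made-up response: an empty container plus a script that would fill it.
        String html = "<html><body><div id=\"acct\"></div>"
                + "<script src='/js/acct.js'></script></body></html>";
        System.out.println("data present: " + containsMarker(html, "Account balance"));
        System.out.println("scripts: " + scriptSources(html));
    }
}
```

If the marker is missing and scripts are listed, the browser's network tools can show which follow-up request actually fetches the data; that request is the one to reproduce.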
Usually the server responds with a redirect (HTTP/1.1 302) after you POST, so check the status code of the server's responses. You should also supply the cookies that the server uses to identify logged-in users.
Edit:
I hope this code snippet helps:
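As an independent illustration of the redirect and cookie points above (not the answerer's own snippet), here is a minimal sketch that logs in, checks for the 302, and then requests the data page with the session cookie. It assumes Java 11+ and uses the JDK's java.net.http.HttpClient rather than the Commons HttpClient from the question. The local demo server, the paths /login, /home and /data, and the cookie name SESSION are all hypothetical stand-ins for the real site.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.CookieManager;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SessionSketch {
    // Throwaway local server standing in for the real site (hypothetical paths).
    static HttpServer startDemoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/login", ex -> {
            ex.getRequestBody().readAllBytes();
            ex.getResponseHeaders().add("Set-Cookie", "SESSION=abc123; Path=/");
            ex.getResponseHeaders().add("Location", "/home");
            ex.sendResponseHeaders(302, -1); // redirect after POST, no body
            ex.close();
        });
        server.createContext("/home", ex -> reply(ex, "welcome"));
        server.createContext("/data", ex -> {
            String cookie = ex.getRequestHeaders().getFirst("Cookie");
            boolean loggedIn = cookie != null && cookie.contains("SESSION=abc123");
            reply(ex, loggedIn ? "account rows" : "static page only");
        });
        server.start();
        return server;
    }

    static void reply(HttpExchange ex, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ex.sendResponseHeaders(200, bytes.length);
        ex.getResponseBody().write(bytes);
        ex.close();
    }

    // Logs in, reports the redirect, then fetches the data page with the stored cookie.
    static String fetchData(URI base) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .cookieHandler(new CookieManager())          // keeps cookies between requests
                .followRedirects(HttpClient.Redirect.NEVER)  // so the 302 stays visible
                .build();
        HttpRequest login = HttpRequest.newBuilder(base.resolve("/login"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString("passUID=user&passUCD=secret"))
                .build();
        HttpResponse<String> loginResponse =
                client.send(login, HttpResponse.BodyHandlers.ofString());
        System.out.println("login status: " + loginResponse.statusCode()
                + ", Location: " + loginResponse.headers().firstValue("Location").orElse("(none)"));

        HttpRequest data = HttpRequest.newBuilder(base.resolve("/data?AcctNo=xxxxxxxxx"))
                .GET()
                .build();
        return client.send(data, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startDemoServer();
        try {
            URI base = URI.create("http://127.0.0.1:" + server.getAddress().getPort());
            System.out.println(fetchData(base));
        } finally {
            server.stop(0);
        }
    }
}
```

The key design point is that one client instance, with one cookie store, is used for every request. A fresh client for the data request would lose the session cookie and get only the static page from this demo server.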