How to get the "next page" of Google results using HtmlUnit
I use the code below to fetch the first two pages of Google search results,
but I can only fetch the first page (when I request page 2, I get the same content as page 1).
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

/**
 * A simple Google search test using HtmlUnit.
 *
 * @author Rahul Poonekar
 * @since Apr 18, 2010
 */
public class Author_search {
    static final WebClient browser;

    static {
        browser = new WebClient();
        browser.setJavaScriptEnabled(false);
    }

    public static void main(String[] arguments) {
        searchTest();
    }

    private static void searchTest() {
        HtmlPage currentPage = null;
        try {
            currentPage = (HtmlPage) browser.getPage("http://www.google.com");
        } catch (Exception e) {
            System.out.println("Could not open browser window");
            e.printStackTrace();
        }
        System.out.println("Simulated browser opened.");
        try {
            // Fill in the search box and submit the query.
            ((HtmlTextInput) currentPage.getElementByName("q")).setValueAttribute("xxoo");
            currentPage = currentPage.getElementByName("btnG").click();
            System.out.println("contents: " + currentPage.asText());
            // Try to move to the second page by clicking the "Next" element.
            HtmlElement next = (HtmlElement) currentPage.getByXPath("//span[contains(text(), 'Next')]").get(0);
            currentPage = next.click();
            System.out.println("contents: " + currentPage.asText());
        } catch (Exception e) {
            System.out.println("Could not search");
            e.printStackTrace();
        }
    }
}
Can anybody tell me how to fix this?
By the way:
- How can I change the language settings in Google using HtmlUnit? Is there a convenient way?
- Does HtmlUnit treat the HTML the way Firebug in Firefox does, or just like the text saved by "File -> Save"? My belief is that it treats the page as if it were a browser. Am I right?
Comments (1)
I replaced:
with:
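The snippets this answer refers to are not reproduced here. Separately from that fix, here is a minimal sketch of another way to reach the second page: request the result URL directly instead of clicking the "Next" element. It assumes that Google's search URL accepts `q` (query), `start` (zero-based offset of the first result, 10 results per page) and `hl` (interface language) parameters; this describes Google's public URL scheme as commonly observed, not anything HtmlUnit provides, and it may change. The class `GoogleSearchUrl` and its `build` method are hypothetical names introduced only for illustration. The `hl` parameter may also be one convenient way to address the language question above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of HtmlUnit): builds Google result-page URLs. */
final class GoogleSearchUrl {
    // Assumption: Google returns 10 results per page by default.
    static final int RESULTS_PER_PAGE = 10;

    private GoogleSearchUrl() {
    }

    /**
     * @param query     search terms, e.g. "xxoo"
     * @param pageIndex zero-based page index (0 = first page, 1 = second page)
     * @param language  interface language code for the "hl" parameter, e.g. "en"
     * @return the URL of the requested result page
     */
    static String build(String query, int pageIndex, String language) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0");
        }
        return "https://www.google.com/search"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&start=" + (pageIndex * RESULTS_PER_PAGE)
                + "&hl=" + URLEncoder.encode(language, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Print the URLs of the first two result pages.
        for (int page = 0; page < 2; page++) {
            String url = build("xxoo", page, "en");
            System.out.println(url);
            // With HtmlUnit on the classpath, each URL could then be loaded
            // the same way the question loads the home page:
            // HtmlPage resultPage = (HtmlPage) browser.getPage(url);
        }
    }
}
```

Building the URL avoids depending on the markup of the "Next" link, which differs between languages and page versions. Whether Google serves such requests to an automated client is outside the scope of this sketch.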