How to get the "next page" on Google using HtmlUnit

Posted on 2025-01-07 01:11:33


I use the code below to fetch the first two pages of Google search results,
but I only ever get the first page: the content returned for page 2 is the same as page 1.

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;


/**
 * A simple Google search test using HtmlUnit.
 *
 * @author Rahul Poonekar
 * @since Apr 18, 2010
 */
public class Author_search {
    static final WebClient browser;

    static {
        browser = new WebClient();
        browser.setJavaScriptEnabled(false);
    }

    public static void main(String[] arguments) {
        searchTest();
    }

    private static void searchTest() {
        HtmlPage currentPage = null;

        try {
            currentPage = (HtmlPage) browser.getPage("http://www.google.com");
        } catch (Exception e) {
            System.out.println("Could not open browser window");
            e.printStackTrace();
            return; // nothing to search without a loaded page
        }
        System.out.println("Simulated browser opened.");

        try {
            // Type the query and submit the search form.
            ((HtmlTextInput) currentPage.getElementByName("q")).setValueAttribute("xxoo");
            currentPage = currentPage.getElementByName("btnG").click();
            System.out.println("contents: " + currentPage.asText());
            // Problem: after clicking this <span>, the content is still that of page 1.
            HtmlElement next = (HtmlElement) currentPage.getByXPath("//span[contains(text(), 'Next')]").get(0);
            currentPage = next.click();
            System.out.println("contents: " + currentPage.asText());
        } catch (Exception e) {
            System.out.println("Could not search");
            e.printStackTrace();
        }
    }
}

Can anybody tell me how to fix this?

By the way:

How can I change the language settings in Google using HtmlUnit? Is there a convenient way?
Does HtmlUnit treat the HTML the way Firebug in Firefox does, or just like the text from "File -> Save"? I believe it treats the page as a browser would; am I right?


べ繥欢鉨o。 2025-01-14 01:11:33

I replaced:

HtmlElement next = (HtmlElement)currentPage.getByXPath("//span[contains(text(),'Next')]").get(0);
currentPage = next.click();

with:

HtmlAnchor nextAnchor = currentPage.getAnchorByText("Next");
currentPage = nextAnchor.click();
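
A possible alternative to clicking the link at all, which might also cover the language side question, is to request each results page by URL. The sketch below is not from the original answer and rests on assumptions: that Google accepts a `start` query parameter as a result offset, an `hl` parameter for the interface language, and serves 10 results per page by default. `GoogleSearchUrl` and `build` are hypothetical names, not part of HtmlUnit.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper (not part of HtmlUnit): builds Google result-page URLs. */
public class GoogleSearchUrl {

    // Assumption: Google shows 10 results per page by default.
    private static final int RESULTS_PER_PAGE = 10;

    /**
     * Builds the URL of one results page.
     *
     * @param query     the search terms
     * @param pageIndex zero-based page number (0 is the first page)
     * @param language  language code for the assumed "hl" parameter, e.g. "en"
     */
    public static String build(String query, int pageIndex, String language) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must not be negative");
        }
        try {
            // URL-encode the user-supplied parts so spaces and symbols are safe.
            String encodedQuery = URLEncoder.encode(query, "UTF-8");
            String encodedLanguage = URLEncoder.encode(language, "UTF-8");
            int offset = pageIndex * RESULTS_PER_PAGE;
            return "https://www.google.com/search?q=" + encodedQuery
                    + "&start=" + offset
                    + "&hl=" + encodedLanguage;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("xxoo", 0, "en")); // https://www.google.com/search?q=xxoo&start=0&hl=en
        System.out.println(build("xxoo", 1, "en")); // https://www.google.com/search?q=xxoo&start=10&hl=en
    }
}
```

Each resulting string could then be passed to `browser.getPage(...)`, as the question's code already does for the home page. Whether Google serves the expected page to a scripted client is not guaranteed.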
