Problem with the HtmlUnit API for Java (headless browser)?

Posted on 2024-08-19 05:42:46

I am using the HtmlUnit headless browser to load this webpage (the Ticketmaster event URL in the code below; viewing the page gives a better understanding of the problem).

I set the value of the select to "1" with the following code:

final WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_7);
    try {
        // Configuring the webClient
        webClient.setJavaScriptEnabled(true);
        webClient.setThrowExceptionOnScriptError(false);
        webClient.setCssEnabled(true);
        webClient.setUseInsecureSSL(true);
        webClient.setRedirectEnabled(true);
        webClient.setActiveXNative(true);
        webClient.setAppletEnabled(true);
        webClient.setPrintContentOnFailingStatusCode(true);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());

        // Adding listeners
        webClient.addWebWindowListener(new com.gargoylesoftware.htmlunit.WebWindowListener() {

            public void webWindowOpened(WebWindowEvent event) {
                numberOfWebWindowOpened++;
                System.out.println("Number of opened WebWindow: " + numberOfWebWindowOpened);
            }

            public void webWindowContentChanged(WebWindowEvent event) {
            }

            public void webWindowClosed(WebWindowEvent event) {
                numberOfWebWindowClosed++;
                System.out.println("Number of closed WebWindow: " + numberOfWebWindowClosed);
            }
        });

        webClient.setWebConnection(new HttpWebConnection(webClient) {
            public WebResponse getResponse(WebRequestSettings settings) throws IOException {
                System.out.println(settings.getUrl());
                return super.getResponse(settings);
            }
        });

        CookieManager cm = new CookieManager();
        webClient.setCookieManager(cm);


        HtmlPage page = webClient.getPage("http://www.ticketmaster.com/event/0B004354D90759FD?artistid=1073053&majorcatid=10002&minorcatid=207");

        HtmlSelect select = (HtmlSelect) page.getElementById("quantity_select");
        select.setSelectedAttribute("1", true);

and then clicked the button with id find_tickets_button using the following code:

HtmlButtonInput button = (HtmlButtonInput) page.getElementById("find_tickets_button");
HtmlPage captchaPage = button.click();
Thread.sleep(60*1000);
System.out.println("======captcha page=======");
System.out.println(captchaPage.asXml());

But even after clicking the button and waiting 60 seconds with Thread.sleep(), I still get the same HtmlPage.

When I do the same thing in a real browser, I get the page that contains the CAPTCHA.

I think I am missing something in HtmlUnit.

Q1. Why am I not getting the same page (the one that contains the CAPTCHA) through HtmlUnit's browser?

Comments (1)

甜中书 2024-08-26 05:42:46

The web form on that page requires the quantity_select drop-down to be filled in. You're attempting to do that in your code by assuming the drop-down is a select element. However, it's no longer a select element. Try using Firebug to inspect the drop-down and you'll see that JavaScript has replaced the select with a complex set of nested div elements.

If you figure out how to emulate each user click on the divs for that unusual drop-down then you should be able to submit the form.
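
Once the clicks on the divs are emulated, the fixed 60-second Thread.sleep() from the question could be replaced by polling for the expected change. The following is only a minimal sketch using the Java standard library; the class PollUntil and its method waitFor are hypothetical names introduced for illustration and are not part of HtmlUnit.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of HtmlUnit): polls a condition instead of sleeping for a fixed time. */
final class PollUntil {
    private PollUntil() {}

    /**
     * Re-checks the condition every pollMillis until it holds or timeoutMillis has elapsed.
     * Returns true if the condition held before the deadline, false on timeout or interruption.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
                return false;
            }
        }
    }
}
```

With HtmlUnit, the condition might check whether webClient.getCurrentWindow().getEnclosedPage() is no longer the original page object, for example PollUntil.waitFor(() -> webClient.getCurrentWindow().getEnclosedPage() != page, 60_000, 500). That condition is an assumption about how the page change shows up; HtmlUnit's own webClient.waitForBackgroundJavaScript(timeoutMillis) may also be worth trying before writing a custom loop.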
