HtmlUnit & GWT errors

Posted on 2024-10-08 23:37:20

I have a GWT application that I am trying to index.

I am using HtmlUnit to get the content of the generated HTML:

WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
HtmlPage refDesing = webClient.getPage("http://localhost:8080/MyGWTApp/#page2");
FileOutputStream fos1 = new FileOutputStream("D:\\work\\out\\page2.html");
fos1.write(refDesing.asXml().getBytes());
fos1.close();

But I get the following warnings, and the returned page is approximately empty!

Dec 22, 2010 6:16:25 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Expected content type of 'application/javascript' or 'application/ecmascript' for remotely loaded JavaScript element at 'http://xxxxxxxxxxxx/xxxxxxxx/xxxxxxxx/xxxxxxxxxx.nocache.js', but got 'application/x-javascript'.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [485:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [485:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [485:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [518:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [518:29] Error in style rule. Invalid token "\n  ". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [518:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [541:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [541:29] Error in style rule. Invalid token "\n  ". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [541:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [951:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [951:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [951:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [977:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [977:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [977:29] Ignoring the following declarations in this rule.

EDIT:

By "approximately empty" I mean the following; here is a snapshot of the returned HTML:

Please note that not all of the data displayed in the original page (which is originally loaded from the DB) is returned by HtmlUnit. Also, what does "?" mean? I don't think it indicates an encoding error, because all the words are plain ASCII characters.

<td align="center" style="vertical-align: top;">
    <table class="refDesignGrid" cellspacing="5">
      <colgroup>
        <col/>
      </colgroup>
      <tbody align="left">
        <tr>
          <td align="left" style="vertical-align: top;">
            <table cellpadding="0" class="categoryItem" cellspacing="0">
              <tbody align="left">
                <tr>
                  <td align="left" style="vertical-align: top;">
                    <div class="header4">
                      C++
                    </div>
                  </td>
                </tr>
              </tbody>
            </table>
          </td>
          <td align="left" style="vertical-align: top;">
            <table cellpadding="0" class="categoryItem" cellspacing="0">
              <tbody align="left">
                <tr>
                  <td align="left" style="vertical-align: top;">
                    <div class="header4">
                      Java
                    </div>
                  </td>
                </tr>
              </tbody>
            </table>
          </td>
          <td align="left">
            <table cellpadding="0" class="categoryItem" cellspacing="0">
              <tbody align="left">
                <tr>
                  <td align="left" style="vertical-align: top;">
                    <div class="header4">
                      C#
                    </div>
                  </td>
                </tr>
              </tbody>
            </table>
          </td>
          <td>
            ?
          </td>
        </tr>
        <tr>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
        </tr>
        <tr>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
        </tr>
        <tr>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
        </tr>
        <tr>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
          <td>
            ?
          </td>
        </tr>
      </tbody>
    </table>
  </td>
</tr>
</tbody>
</table>
</div>


Comments (2)

趴在窗边数星星i 2024-10-15 23:37:20

HtmlUnit can be kinda chatty, and in particular can make things look worse than they are.

Create these two classes:

import org.w3c.css.sac.CSSException;
import org.w3c.css.sac.CSSParseException;
import com.gargoylesoftware.htmlunit.DefaultCssErrorHandler;

/*
 * get rid of warnings... and provide a place to hang a break point
 */
public class QuietCssErrorHandler
    extends DefaultCssErrorHandler
{

    @Override public void error( CSSParseException e ) throws CSSException 
    {
        super.error( e ) ;
    }

    @Override public void fatalError( CSSParseException e ) throws CSSException 
    { 
        super.fatalError( e ) ; 
    }

    @Override public void warning( CSSParseException e ) throws CSSException 
    {
    }
}

and

import com.gargoylesoftware.htmlunit.IncorrectnessListener;

public class SilentIncorrectnessListener
    implements IncorrectnessListener
{
    @Override public void notify( String message, Object origin ) 
    {
        // do nuttin' honey!
    }
}

Then, when you create your WebClient (called wc here), register both of them:

wc.setIncorrectnessListener( new SilentIncorrectnessListener() ) ;
wc.setCssErrorHandler( new QuietCssErrorHandler() ) ;

And you should then get fewer warnings.
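
If you would rather not add classes, a different technique is to turn the logging down instead of replacing the handlers. The sketch below assumes the warnings reach the console through java.util.logging, which the timestamp format in the question suggests but which I have not verified for your setup. The class name HtmlUnitLogSilencer and its methods are made up for illustration; only the java.util.logging calls are real API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of HtmlUnit: silences everything logged
// under the com.gargoylesoftware.htmlunit logger namespace.
final class HtmlUnitLogSilencer {

    // Keep a strong reference: java.util.logging may garbage-collect
    // loggers nobody references, which would discard the level set below.
    private static final Logger HTMLUNIT_LOGGER =
            Logger.getLogger("com.gargoylesoftware.htmlunit");

    private HtmlUnitLogSilencer() {
    }

    // Child loggers (e.g. ...htmlunit.DefaultCssErrorHandler) inherit this
    // level unless they have been given a level of their own.
    static void silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
    }

    // True if a message at the given level would still be logged.
    static boolean wouldLog(Level level) {
        return HTMLUNIT_LOGGER.isLoggable(level);
    }

    public static void main(String[] args) {
        silence();
        System.out.println("WARNING still logged: " + wouldLog(Level.WARNING));
    }
}
```

Call HtmlUnitLogSilencer.silence() once before creating the WebClient. Note that this hides real errors as well as the noise, so the two handler classes above are the gentler choice while you are still debugging.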

As for "approximately empty"... what does that mean?

失与倦" 2024-10-15 23:37:20

The answer is here:
http://htmlunit.sourceforge.net/faq.html#AJAXDoesNotWork

The main thread using HtmlUnit may be finishing execution before
allowing background threads to run. You have a few options:

1. webClient.setAjaxController(new NicelyResynchronizingAjaxController());
   will tell your WebClient instance to re-synchronize asynchronous XHR.
2. webClient.waitForBackgroundJavaScript(10000); or
   webClient.waitForBackgroundJavaScriptStartingBefore(10000); just after
   getting the page and before manipulating it.
3. Explicitly wait for a condition that is expected to be fulfilled when
   your JavaScript runs, e.g.

// Try up to 20 times, waiting 0.5 seconds each time, for the page to fill in.
for (int i = 0; i < 20; i++) {
    if (condition_to_happen_after_js_execution) {
        break;
    }
    synchronized (page) {
        page.wait(500);
    }
}
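
The loop above can be wrapped in a small reusable helper. This is only a sketch using the plain JDK: the class name PollingWait and the method waitFor are invented for illustration, and it uses Thread.sleep rather than page.wait, so it is a variation on the FAQ snippet and not the FAQ's own code. The condition you pass in is up to you, for example a check that some element you expect GWT to render is present in the page.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition until it
// holds or the attempts run out.
final class PollingWait {

    private PollingWait() {
    }

    // Checks the condition up to maxAttempts times, sleeping sleepMillis
    // after each failed check, then checks one final time.
    // Returns true as soon as the condition holds, false otherwise.
    static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) {
        // Stand-in condition: becomes true about 200 ms from now.
        long readyAt = System.currentTimeMillis() + 200;
        boolean ready = waitFor(() -> System.currentTimeMillis() >= readyAt, 20, 50);
        System.out.println("condition met: " + ready);
    }
}
```

With HtmlUnit you would call it right after getPage and only write the file once it returns true. The return value tells you whether the page actually filled in or the wait simply timed out.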