Htmlunit getByXPath 不返回图像标签
我正在尝试搜索特定页面上的所有图像标签。示例页面是 www.chapitre.com
我正在使用以下代码来搜索页面上的所有图像:
HtmlPage page = HTMLParser.parseHtml(webResponse, webClient.openWindow(null,"testwindow"));
List<?> imageList = page.getByXPath("//img");
ListIterator li = imageList.listIterator();
while (li.hasNext() ) {
HtmlImage image = (HtmlImage)li.next();
URL url = new URL(image.getSrcAttribute());
//For now, only load 1X1 pixels
if (image.getHeightAttribute().equals("1") && image.getWidthAttribute().equals("1")) {
System.out.println("This is an image: " + url + " from page " + webRequest.getUrl() );
}
}
这不会返回页面中的所有图像标签。例如,具有属性“src=”http://ace-lb.advertising.com/site=703223/mnum=1516/bins=1/rich=0/logs=0/betr=A2099=[+ ]LP2" width="1" height="1"" 应该被捕获,但事实并非如此。我在这里做错了什么吗?
非常感谢任何帮助。
干杯!
I am trying to search all image tags on a specific page. An example page would be www.chapitre.com
I am using the following code to search for all images on the page:
HtmlPage page = HTMLParser.parseHtml(webResponse, webClient.openWindow(null,"testwindow"));
List<?> imageList = page.getByXPath("//img");
ListIterator li = imageList.listIterator();
while (li.hasNext() ) {
HtmlImage image = (HtmlImage)li.next();
URL url = new URL(image.getSrcAttribute());
//For now, only load 1X1 pixels
if (image.getHeightAttribute().equals("1") && image.getWidthAttribute().equals("1")) {
System.out.println("This is an image: " + url + " from page " + webRequest.getUrl() );
}
}
This doesn't return me all the image tags in the page. For example, an image tag with attributes "src="http://ace-lb.advertising.com/site=703223/mnum=1516/bins=1/rich=0/logs=0/betr=A2099=[+]LP2" width="1" height="1"" should be captured, but its not. Am I doing something wrong here?
Any help is really appreciated.
Cheers!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
那是因为
Is 抛出了一个异常:)
尝试这个代码:
您甚至可以通过 xpath 获取那些 1x1 像素图像。
希望这有帮助。
That's because
Is throwing you an exception :)
Try this code:
You can even get those 1x1 pixel images by xpath.
Hope this helps.