Problem when parsing with BeautifulSoup

Posted on 2024-12-02 15:27:35 · 519 characters · 0 views · 0 comments

I'm trying to parse the web page at the URL used in the code below:

import urllib2
import sys
from BeautifulSoup import BeautifulSoup

url = 'http://www.etsy.com/teams/list'
source = urllib2.urlopen(url)

soup = BeautifulSoup(source)
print soup.prettify()

print len(soup('h3'))  # print the number of occurrences of h3
h3s = soup.findAll('h3')  # same lookup as the call above
print len(h3s)

The problem is that it prints 1, while the web page contains at least 10 h3 elements. I couldn't figure out where the problem lies.
I am using Python 2.7 and BeautifulSoup 3.0.7.
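
One way to narrow this down is to count the `<h3>` start tags in the raw markup without building a tree at all, so the result can be compared against what the tree-building parser reports. The following is only a minimal sketch: it assumes Python 3 and its standard-library `html.parser` (not the Python 2.7 setup from the question), and the names `TagCounter`, `count_start_tags`, and `sample` are hypothetical, with `sample` standing in for the downloaded page source.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags with a given name; no document tree is built."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()  # HTMLParser reports tag names in lowercase
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.count += 1


def count_start_tags(markup, tag):
    """Return how many start tags named `tag` appear in `markup`."""
    counter = TagCounter(tag)
    counter.feed(markup)
    counter.close()
    return counter.count


# Hypothetical sample with an unclosed <p> and an uppercase tag name.
sample = "<div><h3>One</h3><p>unclosed<h3>Two</h3></div><H3>Three</H3>"
print(count_start_tags(sample, "h3"))  # prints 3
```

If this raw count matches what the browser shows but differs from `len(soup.findAll('h3'))`, the discrepancy most likely arises while the tree is being built rather than in the download step.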

望她远 2024-12-09 15:27:35

I'd recommend using lxml instead:

>>> import lxml.html
>>> doc = lxml.html.parse('http://www.etsy.com/teams/list')
>>> len(doc.xpath('//h3'))
10
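
The same XPath count can also be tried offline on a markup string, which makes it easier to experiment without fetching the page each time. This is a rough sketch rather than a definitive recipe: it assumes Python 3 with lxml installed, and `count_h3` and `SAMPLE` are hypothetical names, with `SAMPLE` standing in for the real page source.

```python
import lxml.html


def count_h3(markup):
    """Parse an HTML string with lxml and count its h3 elements."""
    doc = lxml.html.document_fromstring(markup)
    return len(doc.xpath('//h3'))


# Hypothetical sample markup, including an unclosed <p> before a heading.
SAMPLE = (
    "<html><body>"
    "<h3>One</h3>"
    "<div><h3>Two</h3><p>unclosed<h3>Three</h3></div>"
    "</body></html>"
)

print(count_h3(SAMPLE))
```

To run it against the live page, the string could be replaced by the bytes returned from a download of the URL in the question.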

About the author

殊姿
