BeautifulSoup - Help me pick out divs and classes

Posted on 2024-11-01 15:45:16

Here's my HTML code:

<div class="BlockA">
    <h4>BlockA</h4>
    <div class="name">John Smith</div>
    <div class="number">2</div>
    <div class="name">Paul Peterson</div>
    <div class="number">14</div>
</div>

<div class="BlockB">
    <h4>BlockB</h4>
    <div class="name">Steve Jones</div>
    <div class="number">5</div>
</div>

Notice BlockA and BlockB. Both contain the same elements, i.e. name and number, but they sit inside separate classes. I'm new to Python and was thinking of trying something like:

parsedHTML = soup.findAll("div", attrs={"name" : "number"})

but that just gives me a blank screen. Is it possible to do a findAll from within BlockA, display the data, then start another loop from BlockB and do the same?

Thanks.

EDIT: For those asking, I want to simply loop through the values and output in JSON like this:

BlockA
    John Smith
    2
    Paul Peterson
    14

BlockB
    Steve Whoever
    123
    Mr Whathisface
    23

魂ガ小子 2024-11-08 15:45:16

Do you want to find divs whose class attribute is "name" or "number"?

>>> import re
>>> soup.findAll("div", {"class":re.compile("name|number")})

[<div class="name">John Smith</div>, <div class="number">2</div>, <div class="name">Paul Peterson</div>, <div class="number">14</div>, <div class="name">Steve Jones</div>, <div class="number">5</div>]
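
For readers on the current `bs4` package, the same lookup can be sketched with `find_all` and the `class_` keyword. This is only a minimal sketch, assuming Beautiful Soup 4 and Python's built-in `html.parser`; the `html` string and the variable names are illustrative and simply reproduce the markup from the question. One detail worth knowing: a compiled pattern is applied with `search()`, so the unanchored `name|number` would also match a class such as `nickname`. Anchoring the pattern avoids that.

```python
import re
from bs4 import BeautifulSoup

# Illustrative copy of the markup from the question.
html = """
<div class="BlockA">
    <h4>BlockA</h4>
    <div class="name">John Smith</div>
    <div class="number">2</div>
    <div class="name">Paul Peterson</div>
    <div class="number">14</div>
</div>
<div class="BlockB">
    <h4>BlockB</h4>
    <div class="name">Steve Jones</div>
    <div class="number">5</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Anchored pattern: matches only the classes "name" and "number" exactly.
pattern = re.compile(r"^(name|number)$")
matches = soup.find_all("div", class_=pattern)

# Text of each matching div, in document order.
texts = [div.get_text(strip=True) for div in matches]
print(texts)
```

This still returns one flat list across both blocks, so it does not by itself tell you which block a value came from.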

倾其所爱 2024-11-08 15:45:16

You need to use a list of possible class values.

soup.findAll('div', {'class': ['name', 'number']})

After seeing your edit:

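def grab_content(heading):
    # First child (the text) of every tag that follows the heading in its block
    siblings = [s.contents[0] for s in heading.findNextSiblings()]
    # Map the heading text (e.g. BlockA) to the values found under it
    return {heading.contents[0]: siblings}

# Each <h4> marks the start of one block
headings = soup.findAll('h4')
[grab_content(h) for h in headings]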

And the output for your original HTML snippet would be:

[{u'BlockA': [u'John Smith', u'2', u'Paul Peterson', u'14']},
 {u'BlockB': [u'Steve Jones', u'5']}]
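
The question also asks whether the search can run inside one block at a time and produce JSON. A possible sketch of that, assuming Beautiful Soup 4 with `html.parser` and the standard `json` module, is below. The `html` string, the `result` dictionary and the choice to key each block by its `<h4>` text are illustrative assumptions, not something the original answer prescribes. Calling `find_all` on a tag rather than on the whole soup restricts the search to that tag's descendants.

```python
import json
from bs4 import BeautifulSoup

# Illustrative copy of the markup from the question.
html = """
<div class="BlockA">
    <h4>BlockA</h4>
    <div class="name">John Smith</div>
    <div class="number">2</div>
    <div class="name">Paul Peterson</div>
    <div class="number">14</div>
</div>
<div class="BlockB">
    <h4>BlockB</h4>
    <div class="name">Steve Jones</div>
    <div class="number">5</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

result = {}
for heading in soup.find_all("h4"):
    # The enclosing <div class="BlockA"> or <div class="BlockB">.
    block = heading.parent
    # Search only inside this block; a list matches any of the given classes.
    divs = block.find_all("div", class_=["name", "number"])
    result[heading.get_text(strip=True)] = [d.get_text(strip=True) for d in divs]

# Serialize the grouped values as JSON.
print(json.dumps(result, indent=4))
```

Under Python 3 the strings come back without the `u` prefix shown in the output above.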
