Any way to suppress/ignore a specific type of error when using BeautifulSoup?

Posted 2024-10-14 22:26:43 · 422 characters · 7 views · 0 comments

There are many elements that I need on each page I scrape, but many pages don't have all of them, so I end up having to wrap every single lookup in:

try:
    itemNeeded = soup.find(text="yada yada yada").next
except AttributeError:
    pass

This balloons my code by 400%.
Is there any way to abstract this away, or at least reduce the eyesore?

Edit: I'm not only searching for strings, but doing things like this as well:

navLinks = carSoup.find("span", "nav").findAll("a")
carDict['manufacturer'] = navLinks[1].next
carDict['model'] = navLinks[2].next

想念有你 2024-10-21 22:26:43

Build a list and iterate over it, using some templating. You just need to figure out how to iterate over the whole page in a smaller, simpler fashion.

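text_list = ['items', 'to', 'search', 'for']
# Maps a key to (find() arguments, child tag name, dictionary keys to fill).
pre_find = {'items': (('span', 'nav'), 'a', ('manufacturer', 'model'))}
carDict = {}
for text in text_list:
    try:
        if text in pre_find:  # dict.has_key() no longer exists in Python 3
            find_args, child_tag, keys = pre_find[text]
            # Unpack the tuple so find() gets the tag name and class separately.
            navLinks = carSoup.find(*find_args).findAll(child_tag)
            # Start at index 1, matching navLinks[1] and navLinks[2] in the question.
            for x, item in enumerate(keys, start=1):
                carDict[item] = navLinks[x].next
        else:
            carDict[text] = soup.find(text=text).next
    except (AttributeError, IndexError):
        # The element is missing on this page (or the link list is too short); skip it.
        pass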
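
For Python 3 readers, the same "iterate over a list of lookups" idea can probably be written with `contextlib.suppress` from the standard library, which ignores only the exception types you name. The sketch below is an illustration, not the answerer's code: `nav_links`, `lookups`, and `car` are hypothetical names, and plain lists stand in for BeautifulSoup results so the snippet runs on its own.

```python
from contextlib import suppress

# Stand-in for the result of findAll("a") on a page that has only two links.
nav_links = ["Home", "Toyota"]

# Each entry maps an output key to a zero-argument lookup.
lookups = {
    "manufacturer": lambda: nav_links[1],
    "model": lambda: nav_links[2],   # missing on this page -> IndexError
    "trim": lambda: None.next,       # what happens when find() returns None
}

car = {}
for key, lookup in lookups.items():
    # suppress() swallows only the listed exception types; anything else propagates.
    with suppress(AttributeError, IndexError):
        car[key] = lookup()
```

Keys whose lookup fails are simply left out of `car`, so a later `car.get("model")` returns `None` instead of raising.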

戏剧牡丹亭 2024-10-21 22:26:43

Have you considered writing a broader try/except block, something like:

try:
    itemNeeded = soup.find(text="yada yada yada").next
    nextItem = soup.find(text = "blah blah blah").next
except AttributeError:
    pass
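
One caveat with a single broad block is that the first missing element skips every lookup after it. A possible alternative is a small helper that guards each lookup individually, which keeps one line per item. This is a minimal sketch under assumptions: `attempt` is a hypothetical helper, and `FakeSoup` and `FakeNode` are stand-ins that only mimic `find()` returning `None` for a missing element, so the example runs without BeautifulSoup installed.

```python
def attempt(lookup, default=None, exceptions=(AttributeError, IndexError)):
    """Call lookup() and return its result, or default if it raises one of exceptions."""
    try:
        return lookup()
    except exceptions:
        return default


class FakeNode:
    """Hypothetical stand-in for a parsed node with a .next attribute."""
    def __init__(self, next_value):
        self.next = next_value


class FakeSoup:
    """Hypothetical stand-in for a parsed page: find() returns None when nothing matches."""
    def __init__(self, found):
        self._found = found

    def find(self, text):
        return self._found.get(text)


soup = FakeSoup({"Price:": FakeNode("$4,000")})

# Each lookup is wrapped in a lambda so it runs inside attempt(), not before it.
price = attempt(lambda: soup.find(text="Price:").next)
color = attempt(lambda: soup.find(text="Color:").next, default="unknown")
```

With a real `soup` object the calls would look the same; only the stand-in classes would go away.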
