BeautifulSoup: using a string to get a value

Posted 2024-11-07 06:36:43

Is it possible to use a string to get the value of a tag?

XML structure:

book
   title
      titletext
book
   title
      titletext

Code:

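from BeautifulSoup import BeautifulStoneSoup  # BeautifulSoup 3

# xml_data is assumed to hold the XML document text
books = BeautifulStoneSoup(xml_data).findAll('book')
for book in books:
    title = book.title.titletext.string
    # Is something like book.get_by_string('title.titletext').string possible?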

If that's not possible, does getattr support multiple levels?

getattr(book, 'title.titletext').string

I did some testing and this doesn't seem to be possible, but maybe there is an alternative?

If there isn't, I guess I have to write my own recursive function to find the attribute?
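
For the dotted-string lookup asked about here, one possible sketch uses only the standard library. It assumes that chained attribute access (book.title.titletext) is what should be driven by the string; get_by_path is a hypothetical helper name, and the SimpleNamespace objects are stand-ins for parsed tags so the snippet runs without BeautifulSoup installed.

```python
from functools import reduce
from operator import attrgetter
from types import SimpleNamespace


def get_by_path(obj, path):
    """Follow a dotted attribute path such as 'title.titletext' (hypothetical helper)."""
    # reduce applies getattr once per path segment, starting from obj
    return reduce(getattr, path.split('.'), obj)


# Stand-in for a parsed <book> tag; a real tag would come from the parser
book = SimpleNamespace(
    title=SimpleNamespace(titletext=SimpleNamespace(string='Example title'))
)

via_helper = get_by_path(book, 'title.titletext').string

# operator.attrgetter also accepts dotted names, so no helper is strictly needed
via_attrgetter = attrgetter('title.titletext.string')(book)

print(via_helper, via_attrgetter)
```

With a real tag, get_by_path(book, 'title.titletext').string should behave like book.title.titletext.string. If a tag along the path is missing, the lookup returns None partway and the next step raises AttributeError, so callers may want to catch that.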

彼岸花似海 2024-11-14 06:36:43

I would suggest looking into ElementTree. It has what you need. As a quick example:

import xml.etree.ElementTree as ET

doc = ET.parse(filename)  # filename: path to the XML file
for e in doc.iter('title'):
    book_title = e.attrib['titletext']  # read the titletext attribute

Obviously I'm not handling error conditions, but using try/except or checking whether 'titletext' is in the attrib dict is sufficient.
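
As a rough illustration of those two error-handling options, here is a small sketch. The inline XML and the library root element are made up for the example; only the title element and titletext attribute names come from the snippet above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one title carries the attribute, one does not
root = ET.fromstring(
    '<library>'
    '<book><title titletext="First"/></book>'
    '<book><title/></book>'
    '</library>'
)

# Option 1: dict.get returns None instead of raising KeyError
titles = [e.attrib.get('titletext') for e in root.iter('title')]

# Option 2: try/except around the direct lookup
skipped = 0
for e in root.iter('title'):
    try:
        book_title = e.attrib['titletext']
    except KeyError:
        skipped += 1  # count titles that lack the attribute

print(titles, skipped)
```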

If you are looking for a specific tag rather than an attribute of a tag, the same approach still works:

import xml.etree.ElementTree as ET

doc = ET.parse(filename)  # filename: path to the XML file
for e in doc.iter('titletext'):
    book_title = e.text  # text content of the titletext element

In general, I've found ElementTree easier to work with than BeautifulSoup, at least for the kinds of things that I work with. I've found that it's slightly faster for our cases and it handles cases like yours more easily (in my opinion).
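
To make the "handles cases like yours" point concrete: ElementTree's find and findtext accept a path string, which is close to the string-based lookup in the question. The sketch below assumes titletext is a child element of title, as in the question's structure; the sample XML and the library root element are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document matching the book/title/titletext nesting
xml_data = (
    '<library>'
    '<book><title><titletext>First</titletext></title></book>'
    '<book><title><titletext>Second</titletext></title></book>'
    '</library>'
)

root = ET.fromstring(xml_data)

# findtext takes a slash-separated path relative to each book element
titles = [book.findtext('title/titletext') for book in root.iter('book')]

# A path that matches nothing yields None (or a supplied default)
missing = root.find('book').findtext('title/subtitle', default='n/a')

print(titles, missing)
```

Note that the separator here is a slash rather than a dot, so a dotted string would need path.replace('.', '/') first.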

HTH.
