Dictionary indexing and "if x in dict" in BeautifulSoup
I don't think I understand how to check if an array index exists...
    for tag in soup.findAll("input"):
        print tag['type']
        if 'type' in tag:
            print "b"
Outputs:
    2255
    text
    hidden
    text
    text
    text
    Traceback (most recent call last):
      File "/home//workspace//src/x.py", line 268, in <module>
        print tag['type']
      File "/home//workspace//src/BeautifulSoup.py", line 601, in __getitem__
        return self._getAttrMap()[key]
    KeyError: 'type'
Why does it never output 'b'?
Comments (2)
A BeautifulSoup Tag is not a dict. Sometimes it acts like one in certain ways ([] notation, as you discovered, gets the value of an attribute), but in other ways it doesn't. The in operator on a Tag checks whether a tag is a direct child of that tag; it does not check attributes. Instead, you could do something like this:
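One way to test for the attribute before indexing is sketched below. It assumes the newer bs4 package on Python 3 rather than the BeautifulSoup 3 module from the question, so it uses find_all and has_attr; the sample markup and the input_types helper are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second <input> has no type attribute.
HTML = """
<form>
  <input type="text" name="q">
  <input name="untyped">
  <input type="hidden" name="token" value="abc">
</form>
"""


def input_types(html):
    """Return the type of each <input>, or None where the attribute is missing."""
    soup = BeautifulSoup(html, "html.parser")
    types = []
    for tag in soup.find_all("input"):
        # has_attr looks at the tag's attributes, unlike `'type' in tag`,
        # which looks at the tag's children.
        if tag.has_attr("type"):
            types.append(tag["type"])
        else:
            types.append(None)
    return types


print(input_types(HTML))
```

If a default value is enough, tag.get("type") returns None for a missing attribute instead of raising KeyError, which avoids the explicit check.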
You're assuming that the tags returned from findAll are dicts, when in fact they're not. The BeautifulSoup library that you're using has its own custom classes, in this case BeautifulSoup.Tag, which may work a lot like a dict, but isn't.
Here, check this out:
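A check along these lines can be sketched as follows. It assumes the bs4 package on Python 3 (not the BeautifulSoup 3 module from the question), and the markup and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a <form> with a single <input> child.
soup = BeautifulSoup('<form><input type="text" name="q"></form>', "html.parser")
form = soup.find("form")
field = soup.find("input")

is_dict = isinstance(field, dict)        # a Tag is its own class, not a dict
attr_via_in = "type" in field            # compares "type" against the tag's children
child_via_in = field in form             # the <input> is a direct child of <form>
attr_via_attrs = "type" in field.attrs   # attribute names live in field.attrs

print(type(field).__name__, is_dict, attr_via_in, child_via_in, attr_via_attrs)
```

Under these assumptions, only the last two checks come out true: membership on the tag itself is about children, while membership on tag.attrs is about attributes.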
Since it's not actually a dict, you're getting some different (domain-specific) behavior: in this case, the in operator tests membership in the tag's list of immediate children (tags directly contained within the tag you're "indexing"), not in its attributes.
It looks like you want to know whether the input tag has a type attribute, so according to the BeautifulSoup documentation you can list a tag's attributes using tag.attrs and attrMap.
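In the BeautifulSoup 3 module used in the question, tag.attrs was, as far as I know, a list of (name, value) pairs rather than a dict, so a membership test needs the names pulled out first. The sketch below uses hand-written lists in place of a real tag.attrs; the sample pairs and the has_attribute helper are hypothetical.

```python
def has_attribute(attr_pairs, name):
    """Return True if name appears among (name, value) attribute pairs."""
    # dict() turns the pairs into a name -> value mapping, so `in` tests names.
    return name in dict(attr_pairs)


# Hypothetical stand-ins for tag.attrs on two <input> tags.
typed_input = [("type", "text"), ("name", "q")]
untyped_input = [("name", "untyped")]

print(has_attribute(typed_input, "type"))    # True
print(has_attribute(untyped_input, "type"))  # False
```

In the newer bs4 package, tag.attrs is already a dict, so "type" in tag.attrs works without the conversion.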
BeautifulSoup is a really helpful library, but it's one that you have to play with a bit to get the results you want. Make sure to spend time in the interactive console playing with the classes, and remember to use the help(someobject) syntax to see what you're playing with and what methods it has.