Beautiful Soup not finding a specific table by ID
I am trying to parse a Basketball Reference player page to extract one of the tables from the page and work with its data. For some reason, though, Beautiful Soup cannot find the table in the page. I have searched for other tables in the page and it found them successfully, but it will not find this specific one.
I have the following line which takes a link to the page of the specific player I am searching for and gets the BeautifulSoup version of it:
page_soup = BeautifulSoup(bball_ref_page.content, 'lxml')
I then search for the table with the following line:
table = page_soup.find('table', attrs={'id': 'per_poss'})
Whenever I try to print(table), it just comes out as None.
I have also tried searching for the contents by doing:
table = page_soup.find(attrs={'id': 'per_poss'})
with the same result of None.
I have also tried searching for all tables in page_soup, and it returns a list of a bunch of tables that does not include the one I am looking for.
I have tried changing the parser in the page_soup assignment to html.parser, and the result remains the same. I have also tried printing the contents of page_soup and can find the table in there:
<div class="table_container current" id="div_per_poss">
<table class="stats_table sortable row_summable" id="per_poss" data-cols-to-freeze="1,3"> <caption>Per 100 Poss Table</caption> <colgroup><col>....
Any ideas what might be causing this to happen?
Comments (1)
The page stores the <table> data inside an HTML comment (<!-- -->), so BeautifulSoup normally doesn't see it as an element, even though the markup shows up when you print page_soup. To load it as a pandas DataFrame, pull the markup out of the comment and parse it again.
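The example that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's code. It assumes the table sits inside a comment as described; SAMPLE_HTML is a made-up stand-in for bball_ref_page.content, and find_commented_table and table_to_dataframe are hypothetical helper names. The real page may have more header rows or extra columns than this sample.

```python
from bs4 import BeautifulSoup, Comment
import pandas as pd

# Made-up stand-in for bball_ref_page.content: the table lives inside <!-- -->.
SAMPLE_HTML = """
<html><body>
<div id="all_per_poss">
<!--
<div class="table_container current" id="div_per_poss">
<table class="stats_table" id="per_poss">
<thead><tr><th>Season</th><th>PTS</th></tr></thead>
<tbody>
<tr><th>2018-19</th><td>30.1</td></tr>
<tr><th>2019-20</th><td>28.4</td></tr>
</tbody>
</table>
</div>
-->
</div>
</body></html>
"""


def find_commented_table(html, table_id):
    """Return the <table> with the given id if it is hidden in a comment."""
    soup = BeautifulSoup(html, "html.parser")
    # Comments are parsed as Comment strings, not as tags, so find('table') skips them.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        if table_id not in comment:
            continue
        # Parse the comment text as HTML in its own right.
        inner = BeautifulSoup(comment, "html.parser")
        table = inner.find("table", attrs={"id": table_id})
        if table is not None:
            return table
    return None


def table_to_dataframe(table):
    """Build a DataFrame of strings from a simple thead/tbody table."""
    header = [th.get_text(strip=True) for th in table.find("thead").find_all("th")]
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find("tbody").find_all("tr")
    ]
    return pd.DataFrame(rows, columns=header)


table = find_commented_table(SAMPLE_HTML, "per_poss")
df = table_to_dataframe(table)
print(df)
```

All cell values come back as strings in this sketch; convert numeric columns (for example with pd.to_numeric) before doing arithmetic. pandas.read_html on the comment text is a possible alternative, but it needs lxml or html5lib installed.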
To iterate over the rows of the DataFrame, you can use df.iterrows(), which yields an (index, row) pair for each row.
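As a small illustration of df.iterrows(), here is a sketch on a hand-made DataFrame. The column names and values are invented for the example and are not taken from the actual page.

```python
import pandas as pd

# Invented data standing in for the parsed per_poss table.
df = pd.DataFrame({"Season": ["2018-19", "2019-20"], "PTS": [30.1, 28.4]})

lines = []
# iterrows() yields (index, row) pairs; each row is a pandas Series keyed by column name.
for index, row in df.iterrows():
    lines.append(f"{index}: {row['Season']} -> {row['PTS']}")

print("\n".join(lines))
```

iterrows() is convenient for inspection, but it builds a Series per row, so for large tables column-wise operations are usually faster.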