Beautiful Soup not finding a specific table by ID

Posted 2025-01-20 06:15:59, 970 words, 0 views, 0 comments

I am trying to parse a Basketball Reference player page to extract one of its tables and work with the data. For some reason, though, Beautiful Soup cannot find the table. I have searched for other tables on the page and it finds them, but it will not find this specific one.

I have the following line, which takes the fetched page for the specific player I am searching for and builds a BeautifulSoup object from it:

page_soup = BeautifulSoup(bball_ref_page.content, 'lxml')

I then search for the table with the following line:

table = page_soup.find('table', attrs={'id': 'per_poss'})

Whenever I try to print(table), it just comes out as None. I have also tried searching without the tag name:

table = page_soup.find(attrs={'id': 'per_poss'})

This gives the same result of None.

I have also tried searching for all tables in page_soup; it returns a list of tables that does not include the one I am looking for.

I have tried changing the parser in the page_soup assignment to html.parser and the result remains the same. I have also tried printing the contents of page_soup and can find the table in there:

<div class="table_container current" id="div_per_poss">
        
        <table class="stats_table sortable row_summable" id="per_poss" data-cols-to-freeze="1,3"> <caption>Per 100 Poss Table</caption> <colgroup><col>....

Any ideas what might be causing this to happen?

与往事干杯 2025-01-27 06:16:00

The page stores the <table> data inside an HTML comment, so BeautifulSoup normally doesn't see it as a tag. To load it as a pandas DataFrame, you can use the following example:

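from io import StringIO

import requests
import pandas as pd
from bs4 import BeautifulSoup, Comment


url = "https://www.basketball-reference.com/players/j/jordami01.html"

# Parse the page, then re-parse only the text of its HTML comments,
# which is where the per_poss table is stored.
soup = BeautifulSoup(requests.get(url).content, "lxml")
comments = soup.find_all(string=lambda s: isinstance(s, Comment))
soup = BeautifulSoup("\n".join(comments), "lxml")

# Newer pandas versions expect a file-like object rather than a raw HTML string.
df = pd.read_html(StringIO(str(soup.select_one("table#per_poss"))))[0]
print(df.to_markdown())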

Prints:

	Season	Age	Tm	Lg	Pos	G	GS	MP	FG	FGA	FG%	3P	3PA	3P%	2P	2PA	2P%	FT	FTA	FT%	ORB	DRB	TRB	AST	STL	BLK	TOV	PF	PTS	Unnamed: 29	ORtg	DRtg
0	1984-85	21	CHI	NBA	SG	82	82	3144	12.9	25	0.515	0.1	0.8	0.173	12.7	24.2	0.526	9.7	11.5	0.845	2.6	5.6	8.2	7.4	3	1.1	4.5	4.4	35.5	nan	118	107
1	1985-86	22	CHI	NBA	SG	18	7	451	16	35	0.457	0.3	1.9	0.167	15.7	33.1	0.474	11.2	13.3	0.84	2.5	4.4	6.8	5.7	3.9	2.2	4.8	4.9	43.5	nan	109	107
2	1986-87	23	CHI	NBA	SG	82	82	3281	16.8	34.8	0.482	0.2	1	0.182	16.6	33.8	0.491	12.7	14.8	0.857	2.5	4	6.6	5.8	3.6	1.9	4.2	3.6	46.4	nan	117	104
3	1987-88	24	CHI	NBA	SG	82	82	3311	16.2	30.3	0.535	0.1	0.8	0.132	16.1	29.5	0.546	11	13.1	0.841	2.1	4.7	6.8	7.4	3.9	2	3.8	4.1	43.6	nan	123	101
4	1988-89	25	CHI	NBA	SG	81	81	3255	14.7	27.3	0.538	0.4	1.5	0.276	14.3	25.8	0.553	10.2	12.1	0.85	2.3	7.6	9.9	9.9	3.6	1	4.4	3.8	40	nan	123	103
5	1989-90	26	CHI	NBA	SG	82	82	3197	16	30.5	0.526	1.4	3.8	0.376	14.6	26.7	0.548	9.2	10.8	0.848	2.2	6.6	8.8	8.1	3.5	0.8	3.8	3.7	42.7	nan	123	106
6	1990-91	27	CHI	NBA	SG	82	82	3034	16.4	30.4	0.539	0.5	1.5	0.312	15.9	28.9	0.551	9.4	11.1	0.851	2	6.2	8.1	7.5	3.7	1.4	3.3	3.8	42.7	nan	125	102
7	1991-92	28	CHI	NBA	SG	80	80	3102	15.5	29.8	0.519	0.4	1.6	0.27	15	28.2	0.533	8	9.7	0.832	1.5	6.9	8.4	8	3	1.2	3.3	3.3	39.4	nan	121	102
8	1992-93	29	CHI	NBA	SG	78	78	3067	16.8	33.9	0.495	1.4	3.9	0.352	15.4	30	0.514	8.1	9.6	0.837	2.3	6.5	8.8	7.2	3.7	1	3.5	3.2	43	nan	119	102
9	1994-95	31	CHI	NBA	SG	17	17	668	13	31.5	0.411	1.2	2.5	0.5	11.7	29	0.403	8.5	10.6	0.801	2	7.2	9.1	7	2.3	1	2.7	3.7	35.7	nan	109	103
10	1995-96	32	CHI	NBA	SG	82	82	3090	15.6	31.5	0.495	1.9	4.4	0.427	13.7	27.1	0.506	9.3	11.2	0.834	2.5	6.7	9.3	6	3.1	0.7	3.4	3.3	42.5	nan	124	100
11	1996-97	33	CHI	NBA	SG	82	82	3106	15.8	32.5	0.486	1.9	5.1	0.374	13.9	27.4	0.507	8.2	9.9	0.833	1.9	6.3	8.3	6	2.4	0.8	2.9	2.7	41.8	nan	121	102
12	1997-98	34	CHI	NBA	SG	82	82	3181	14.9	32.1	0.465	0.5	2.1	0.238	14.4	30	0.482	9.6	12.2	0.784	2.2	5.8	8.1	4.8	2.4	0.8	3.1	2.6	40	nan	114	100
13	2001-02	38	WAS	NBA	SF	60	53	2093	14.3	34.4	0.416	0.3	1.4	0.189	14	33	0.426	6.8	8.6	0.79	1.3	7.5	8.8	8	2.2	0.7	4.2	3.1	35.7	nan	99	105
14	2002-03	39	WAS	NBA	SF	82	67	3031	12.2	27.4	0.445	0.3	1	0.291	11.9	26.4	0.45	4.8	5.8	0.821	1.3	7.7	8.9	5.6	2.2	0.7	3.1	3.1	29.5	nan	101	103
15	Career	nan	nan	NBA	nan	1072	1039	41011	15.3	30.7	0.497	0.7	2.2	0.327	14.5	28.5	0.51	9.2	11	0.835	2.1	6.3	8.3	7	3.1	1.1	3.7	3.5	40.4	nan	118	103
16	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan
17	13 seasons	nan	CHI	NBA	nan	930	919	35887	15.5	30.8	0.505	0.8	2.4	0.332	14.8	28.4	0.52	9.6	11.5	0.838	2.2	6.1	8.3	7.1	3.3	1.2	3.7	3.5	41.5	nan	120	103
18	2 seasons	nan	WAS	NBA	nan	142	120	5124	13.1	30.3	0.431	0.3	1.1	0.241	12.8	29.1	0.439	5.6	7	0.805	1.3	7.6	8.9	6.6	2.2	0.7	3.6	3.1	32	nan	100	104
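
The behavior described in the question can be reproduced offline with a small sketch. The HTML below is a hypothetical miniature of the page, not its real markup (the wrapper div ids are invented for illustration), and it uses html.parser so that lxml is not required. It suggests why find() returns None even though the table shows up when the soup is printed: the commented-out markup is kept as a single Comment string, not as tags.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical miniature page: one table is live markup, the other is
# wrapped in an HTML comment, as the answer describes for per_poss.
html = """
<div id="all_per_game">
  <table id="per_game"><tr><td>visible</td></tr></table>
</div>
<div id="all_per_poss">
  <!--
  <table id="per_poss"><tr><td>hidden</td></tr></table>
  -->
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Tag searches only see real tags; the commented table is a Comment string.
visible = soup.find("table", id="per_game")
hidden = soup.find("table", id="per_poss")

# Collect only the Comment nodes and parse their text as HTML.
comments = soup.find_all(string=lambda s: isinstance(s, Comment))
comment_soup = BeautifulSoup("\n".join(comments), "html.parser")
recovered = comment_soup.find("table", id="per_poss")

print(visible is not None)
print(hidden)
print(recovered.td.get_text())
```

Filtering with isinstance(s, Comment) keeps ordinary page text out of the second parse, so only markup that was actually commented out is searched.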

To iterate over the rows of the DataFrame, you can use df.iterrows(), for example:

for index, row in df.iterrows():
    print(row["Season"], row["Age"])

Prints:

1984-85 21.0
1985-86 22.0
1986-87 23.0
1987-88 24.0
1988-89 25.0

...
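
The printed frame ends with a career total, an all-NaN spacer row, and per-team summaries, and those NaN values are also why Age prints as 21.0 instead of 21. If only the per-season rows are wanted, one possible approach is to filter on the Season format before iterating. The sketch below uses a hand-built miniature frame with a few values copied from the output above; the column subset and the variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical miniature of the frame printed above: season rows followed
# by a career total, an all-NaN spacer row, and per-team summaries.
df = pd.DataFrame(
    {
        "Season": ["1984-85", "1985-86", "Career", None, "13 seasons", "2 seasons"],
        "Age": [21, 22, None, None, None, None],
        "Tm": ["CHI", "CHI", None, None, "CHI", "WAS"],
        "PTS": [35.5, 43.5, 40.4, None, 41.5, 32.0],
    }
)

# Keep only rows whose Season looks like "1984-85";
# na=False treats the missing Season in the spacer row as a non-match.
is_season = df["Season"].str.fullmatch(r"\d{4}-\d{2}", na=False)
seasons = df[is_season].reset_index(drop=True)

# itertuples yields namedtuples, a lighter alternative to iterrows.
for row in seasons.itertuples(index=False):
    print(row.Season, row.Age, row.PTS)
```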
