I am trying to figure out how to embed links or hyperlink elements in a list

Posted on 2025-02-12 12:09:01

I am trying to figure out how to embed each player's mlb.com link in that player's name inside a list, so that clicking a name takes me to the player's mlb.com page. For example, clicking Yordan Alvarez would take me to his stats, because the link would be embedded in his name.

This is what I have tried so far, but I am currently stuck. How can I embed the links in the player names so that each name works as a hyperlink, like this: Yordan Alvarez?

from bs4 import BeautifulSoup
import requests 
import re 

# Request URL 

url_1 = 'https://www.mlb.com/stats/'
req = requests.get(url_1).text
document = BeautifulSoup(req, 'html.parser')

# Body 

tbody = document.tbody

# Headers

thead = document.thead 

# Player Names 

full_name = tbody.find_all('a') 

# List of Players 

players_list = []

for name in full_name: 
    if name.get('aria-label'):
        names = name.get('aria-label')
        players_list.append(names)

# List of Links

hrefs_list = []

hrefs = tbody.find_all('a',href = True) 

# Players & Their Links 

for link,player in zip(hrefs, players_list):
    href_link = link['href']
    if re.search('^/player', href_link):
        stats_link = f'https://www.mlb.com{href_link}'
        hrefs_list.append(stats_link)
        hyperlink_format = f'<a href= {stats_link}>{player}</a>'
print(dict(zip(players_list, hrefs_list)))


Comments (2)

相思碎 2025-02-19 12:09:01

You could use the fact that find_all accepts a compiled regular expression as an attribute filter.
Combining this with a dict comprehension simplifies the code to:

from bs4 import BeautifulSoup
import requests 
import re 

base_url = 'https://www.mlb.com'
stats_url = f'{base_url}/stats/'
req = requests.get(stats_url).text
soup = BeautifulSoup(req, 'html.parser')

pattern = re.compile(r"/player/\d+")
links = soup.find_all('a', attrs={'href': pattern})

# Each href already begins with "/", so it is appended to base_url directly
{a.text: f"{base_url}{a.attrs.get('href')}" for a in links}
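
To try the regular-expression filter without hitting the live site, here is a minimal sketch that runs against a small hand-written HTML snippet. The markup in SAMPLE_HTML (including the team links) is an assumption made up for illustration and may not match the real page; urljoin from the standard library is swapped in for string concatenation so the base URL and the root-relative href are joined without a doubled slash.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the stats table; the real page may differ.
SAMPLE_HTML = """
<table><tbody>
  <tr><td><a href="/player/670541" aria-label="Yordan Alvarez">Yordan Alvarez</a></td>
      <td><a href="/astros">HOU</a></td></tr>
  <tr><td><a href="/player/545361" aria-label="Mike Trout">Mike Trout</a></td>
      <td><a href="/angels">LAA</a></td></tr>
</tbody></table>
"""

BASE_URL = "https://www.mlb.com"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A compiled pattern used as an attribute filter is applied with search(),
# so the ^ anchor keeps only hrefs that start with /player/<digits>.
pattern = re.compile(r"^/player/\d+")
links = soup.find_all("a", attrs={"href": pattern})

# urljoin resolves the root-relative href against the base URL.
players = {a.get_text(strip=True): urljoin(BASE_URL, a["href"]) for a in links}
print(players)
```

The team links are skipped because their href does not match the pattern, so only player anchors end up in the dict.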

咿呀咿呀哟 2025-02-19 12:09:01

Try to avoid building multiple lists that later have to be zipped, because zipping only works if they have the same length and order. Instead, try to collect your data in one go.

You could use CSS selectors to select the links and player names and, depending on the expected result, create a dict:
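
A small sketch of why the two-list approach goes wrong, using plain Python data instead of a live request. The (href, aria-label) pairs below are made up for illustration, on the assumption that some anchors in the table (for example team links) have no aria-label:

```python
# Hypothetical (href, aria_label) pairs, one per anchor in the table body.
anchors = [
    ("/player/670541", "Yordan Alvarez"),
    ("/astros", None),
    ("/player/545361", "Mike Trout"),
    ("/angels", None),
]

# Two separate passes: names only from labelled anchors, hrefs from every anchor.
names = [label for _, label in anchors if label]
hrefs = [href for href, _ in anchors]

# zip pairs items purely by position and stops at the shorter list,
# so the second name ends up next to a team link.
zipped = list(zip(hrefs, names))
print(zipped)

# One pass: name and href come from the same anchor and cannot drift apart.
paired = [(label, href) for href, label in anchors if href.startswith("/player/")]
print(paired)
```

In the zipped result "Mike Trout" is matched with "/astros", while the single-pass version keeps every name with its own link.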

{a.get('aria-label'): f"{base_url}{a.attrs.get('href')}" for a in soup.select('a[href^="/player/"]')}

or a list of dicts:

data = []

for a in soup.select('a[href^="/player/"]'):
    data.append({
        'name':a.get('aria-label'),
        'url':f"{base_url}{a.attrs.get('href')}"
    })
data

Example

from bs4 import BeautifulSoup
import requests 

base_url = 'https://www.mlb.com'
req = requests.get(f'{base_url}/stats/').text
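# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(req, 'html.parser')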

data = []

for a in soup.select('a[href^="/player/"]'):
    data.append({
        'name':a.get('aria-label'),
        'url':f"{base_url}{a.attrs.get('href')}"
    })
data

Output

[{'name': 'Yordan Alvarez', 'url': 'https://www.mlb.com/player/670541'},
 {'name': 'Paul Goldschmidt', 'url': 'https://www.mlb.com/player/502671'},
 {'name': 'Mike Trout', 'url': 'https://www.mlb.com/player/545361'},
 {'name': 'Aaron Judge', 'url': 'https://www.mlb.com/player/592450'},
 {'name': 'Bryce Harper', 'url': 'https://www.mlb.com/player/547180'},
 {'name': 'Rafael Devers', 'url': 'https://www.mlb.com/player/646240'},
 {'name': 'Jose Ramirez', 'url': 'https://www.mlb.com/player/608070'},
 {'name': 'Manny Machado', 'url': 'https://www.mlb.com/player/592518'},
 {'name': 'Pete Alonso', 'url': 'https://www.mlb.com/player/624413'},...]
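
To get from this list of dicts to names that are actually clickable, one possible approach is to turn each entry into an HTML anchor string. This is only a sketch: to_anchor is a hypothetical helper name, the two entries are copied from the output above, and the resulting strings only become clickable once they are rendered as HTML (for example in a browser or a notebook), not when printed in a terminal.

```python
from html import escape


def to_anchor(name: str, url: str) -> str:
    """Build an <a> element, escaping both the URL and the visible text."""
    return f'<a href="{escape(url, quote=True)}">{escape(name)}</a>'


# Two entries copied from the scraped output above.
data = [
    {'name': 'Yordan Alvarez', 'url': 'https://www.mlb.com/player/670541'},
    {'name': 'Mike Trout', 'url': 'https://www.mlb.com/player/545361'},
]

# Each list item is now an HTML hyperlink with the player's name as its text.
linked_players = [to_anchor(d['name'], d['url']) for d in data]

for link in linked_players:
    print(link)
```

Quoting the href value and escaping the text keeps the markup valid even if a name or URL contains characters such as & or a double quote.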