Getting the wrong links in a BeautifulSoup Python web scraper

Posted 2025-01-22 16:48:04


I am writing a web scraper and am struggling to get the href links from a web page. The URL is https://www.seedinvest.com/auto, and I am trying to get the href links of the individual offering cards. Here is an example:

<a class="card-url full-size-content" href="https://www.seedinvest.com/soil.connect/seed"></a>

Here is my code:

import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re



URL = "https://www.seedinvest.com/offerings"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")


### Searching for all embeded company website links and displaying them

links = []
for link in soup.findAll(class_="card-url full-size-content"):
    links.append(link.get('href'))
print(links)

When I run my code, I get this:

[]

Can you help me find the right links?


Comments (2)

狼性发作 2025-01-29 16:48:04


It works with the correct URL:

import requests
from bs4 import BeautifulSoup


URL = "https://www.seedinvest.com/auto"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")


### Searching for all embedded company offering links and displaying them

links = []
for link in soup.find_all(class_="card-url full-size-content"):
    links.append(link.get('href'))
print(links)

Output:

['https://www.seedinvest.com/nowrx/series.c', 'https://www.seedinvest.com/appmail/seed', 'https://www.seedinvest.com/soil.connect/seed', 'https://www.seedinvest.com/cytonics/series.c.2']
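One detail worth knowing: when you pass a multi-word string to `class_`, BeautifulSoup matches the class attribute as an exact string, so `class_="card-url full-size-content"` only finds tags whose classes appear in exactly that order. A CSS selector via `select()` matches each class independently and is more robust. A minimal, self-contained check using the sample anchor from the question (the live page's markup may differ):

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; assumed representative of the live page.
html = '<a class="card-url full-size-content" href="https://www.seedinvest.com/soil.connect/seed"></a>'

soup = BeautifulSoup(html, "html.parser")

# Require both classes regardless of order, then collect the href attributes.
links = [a.get("href") for a in soup.select("a.card-url.full-size-content")]
print(links)
```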

等数载，海棠开 2025-01-29 16:48:04


Maybe you're using the wrong URL in your code: https://www.seedinvest.com/offerings instead of https://www.seedinvest.com/auto?
