Struggling to scrape the correct URL with Beautiful Soup

Posted on 2025-01-22 12:22:54

I am writing a web scraper and am struggling to get an href link from a web page. The page URL is https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php, and I am trying to extract this link: https://www.tesseratherapeutics.com from the following section of the page:

<a class="text-border-botton-color " target="_blank" href="https://www.tesseratherapeutics.com/">https://www.tesseratherapeutics.com/</a>

Here is my code:

from cgi import print_directory
import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re

URL = "https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
links = []

for link in soup.findAll(class_="text-border-botton-color "):
    links.append(link.get("href"))
print(links)

When I run my code, I get this:

[]

Can someone help me get the correct href link?

Thanks!

Comments (1)

酷到爆炸 2025-01-29 12:22:54

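As @ggorlen pointed out, this is a typo: the class should be "text-border-botton-color", not "text-border-botton-color " (with a trailing space). Beautiful Soup treats class as a multi-valued attribute and splits it on whitespace when parsing, so the parsed class name never contains the trailing space and the search string with the space matches nothing. Remove the extra space after color: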

import requests
from bs4 import BeautifulSoup

URL = "https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
links = []

# Search for the class name without the trailing space
for link in soup.find_all(class_="text-border-botton-color"):
    links.append(link.get("href"))
print(links)

Output:

['https://www.tesseratherapeutics.com/']
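
To see why the trailing space matters without hitting the network, here is a minimal sketch that parses only the anchor tag quoted in the question. The helper name hrefs_for_class and the inline HTML constant are my own additions for illustration, not part of the original code, and the CSS selector variant at the end is just one alternative way to write the same lookup.

```python
from bs4 import BeautifulSoup

# Inline copy of the anchor tag from the question (assumption: the live page
# contains this tag as quoted), so no network request is needed.
HTML = (
    '<a class="text-border-botton-color " target="_blank" '
    'href="https://www.tesseratherapeutics.com/">'
    "https://www.tesseratherapeutics.com/</a>"
)

soup = BeautifulSoup(HTML, "html.parser")

# class is a multi-valued attribute: Beautiful Soup splits it on whitespace,
# so the trailing space from the HTML source is gone after parsing.
parsed_classes = soup.find("a")["class"]


def hrefs_for_class(soup, class_name):
    """Hypothetical helper: collect href values of <a> tags with the given class."""
    return [a.get("href") for a in soup.find_all("a", class_=class_name)]


# Searching with the trailing space finds nothing; without it, the link is found.
with_space = hrefs_for_class(soup, "text-border-botton-color ")
without_space = hrefs_for_class(soup, "text-border-botton-color")

# Alternative: a CSS selector, where the class name cannot carry a stray space.
via_selector = [a["href"] for a in soup.select("a.text-border-botton-color")]

print(parsed_classes)
print(with_space)
print(without_space)
print(via_selector)
```

If the class attribute on the real page ever gains additional classes, both the find_all call without the space and the CSS selector should still match, since each checks for the presence of that one class name.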