HTML Python beautifulsoup

Struggling to scrape the correct URL with BeautifulSoup

Posted 2025-01-22 12:22:54

I am writing a web scraper and am struggling to get an href link from a web page. The page URL is https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php. I am trying to get this href link, https://www.tesseratherapeutics.com, from the following section of the page:

<a class="text-border-botton-color " target="_blank" href="https://www.tesseratherapeutics.com/">https://www.tesseratherapeutics.com/</a>

Here is my code:

from cgi import print_directory
import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re

URL = "https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
links = []

for link in soup.findAll(class_="text-border-botton-color "):
    links.append(link.get("href"))
print(links)

When I run my code, I get this:

[]

Can someone help me get the correct href link?

Thanks!

Answer by 酷到爆炸, 2025-01-29 12:22:54

As @ggorlen pointed out, this is a typo: the class name should be "text-border-botton-color", not "text-border-botton-color ". Remove the extra space after "color".

# The unused imports are dropped here; in particular, the cgi module was
# removed in Python 3.13, so importing from it fails on current interpreters.
import requests
from bs4 import BeautifulSoup

URL = "https://vcnewsdaily.com/Tessera%20Therapeutics/venture-funding.php"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
links = []

# Search for the class name without the trailing space.
# find_all is the current spelling of the older findAll alias.
for link in soup.find_all(class_="text-border-botton-color"):
    links.append(link.get("href"))
print(links)

Output:

['https://www.tesseratherapeutics.com/']
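
To see why the trailing space matters, the behavior can be reproduced offline. The sketch below is not part of the original answer: it parses a hard-coded copy of the anchor tag from the question (the `HTML` constant and the variable names are illustrative assumptions) instead of fetching the live page. BeautifulSoup treats `class` as a multi-valued attribute and splits it on whitespace, so the parsed class name carries no trailing space and a search string that includes one finds nothing.

```python
from bs4 import BeautifulSoup

# Hard-coded copy of the anchor tag from the question (assumption: the live
# page contains this markup), so no network request is needed.
HTML = (
    '<a class="text-border-botton-color " target="_blank" '
    'href="https://www.tesseratherapeutics.com/">'
    "https://www.tesseratherapeutics.com/</a>"
)

soup = BeautifulSoup(HTML, "html.parser")

# The class attribute is split on whitespace into a list of class names,
# so the trailing space in the markup does not survive parsing.
parsed_classes = soup.find("a").get("class")

# Searching with the trailing space, as in the question.
with_space = [
    a.get("href") for a in soup.find_all(class_="text-border-botton-color ")
]

# Searching with the bare class name, as in the answer.
without_space = [
    a.get("href") for a in soup.find_all(class_="text-border-botton-color")
]

# A CSS selector is an alternative way to match on the class name.
via_selector = [a.get("href") for a in soup.select("a.text-border-botton-color")]

print(parsed_classes)
print(with_space)
print(without_space)
print(via_selector)
```

Only the search with the trailing space should come back empty. The selector form may be convenient when the match also needs to be restricted by tag name.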

About the author: 遗失的美好
