How do I select all app links from the App Store and extract their href?

Posted 2025-02-01 23:03:54 · 451 characters · 4 views · 0 comments

from bs4 import BeautifulSoup
import requests

# Apple search results page for the keyword "youtube"
url = 'https://www.apple.com/kr/search/youtube?src=globalnav'
response = requests.get(url)
html = response.text
soup = BeautifulSoup(html, 'html.parser')
# Select by the class name seen in the browser's developer tools
links = soup.select(".rf-serp-productname-list")
print(links)  # prints an empty list

I want to collect the links of all the apps shown in the search results. When I searched for a keyword, I thought links = soup.select(".rf-serp-productname-list") would work, but the resulting list is empty.

What should I do?

Comments (2)

Saygoodbye 2025-02-08 23:03:55

Take a look at this code; I think it is what you want:

import re
import requests
from bs4 import BeautifulSoup

pages = set()  # hrefs that have already been visited

def get_links(page_url):
    # Match only site-relative links, i.e. hrefs that start with "/"
    pattern = re.compile("^(/)")
    # Replace your_URL with the base URL of the site; f-strings require Python 3.6+
    html = requests.get(f"your_URL{page_url}").text
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all("a", href=pattern):
        new_page = link.attrs["href"]
        if new_page not in pages:
            print(new_page)
            pages.add(new_page)
            # Recurse into each newly found page
            get_links(new_page)

get_links("")

Source:
https://gist.github.com/AO8/f721b6736c8a4805e99e377e72d3edbf
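
On a site with many pages, the recursive version above can run into Python's recursion limit. As a rough sketch only, the same idea can be written with a queue and a page cap. This swaps recursion for a breadth-first loop, which is not what the gist does. The crawl function, the max_pages parameter and the in-memory FAKE_SITE are all made up for illustration, so that the example runs without any network request.

```python
import re
from collections import deque
from bs4 import BeautifulSoup

# Hypothetical in-memory "site": path -> HTML. A real crawler would fetch
# each page with requests.get(base_url + path).text instead.
FAKE_SITE = {
    "": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/b">B</a> <a href="/c">C</a>',
    "/b": '<a href="/a">A</a>',
    "/c": '<a href="https://example.com/">external</a>',
}

def crawl(fetch, start="", max_pages=100):
    """Collect relative links breadth-first, visiting at most max_pages pages."""
    pattern = re.compile("^(/)")  # same "starts with /" pattern as above
    seen = set()
    queue = deque([start])
    visited = 0
    while queue and visited < max_pages:
        page = queue.popleft()
        visited += 1
        soup = BeautifulSoup(fetch(page), "html.parser")
        for link in soup.find_all("a", href=pattern):
            href = link["href"]
            if href not in seen:
                seen.add(href)
                queue.append(href)
    return seen

found = crawl(lambda path: FAKE_SITE.get(path, ""))
print(sorted(found))
```

Passing the fetch step in as a function keeps the link-collecting logic separate from the HTTP call, so it can be tried on canned HTML first.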

I think you can change this part so that it checks for a keyword:

for link in soup.find_all("a", href=pattern):
    # do something
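
One way the keyword check might look, as a minimal sketch: the helper name links_with_keyword and the sample HTML are assumptions made for illustration, and only the find_all call with the "^(/)" pattern comes from the code above.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
SAMPLE_HTML = """
<a href="/kr/app/youtube/id544007664">YouTube</a>
<a href="/kr/app/instagram/id389801252">Instagram</a>
<a href="https://example.com/youtube">External</a>
<a href="/kr/app/youtube-music/id1017492454">YouTube Music</a>
"""

def links_with_keyword(html, keyword):
    """Return relative hrefs (starting with "/") that contain the keyword."""
    pattern = re.compile("^(/)")  # same pattern as in the answer
    soup = BeautifulSoup(html, "html.parser")
    matches = []
    for link in soup.find_all("a", href=pattern):
        href = link["href"]
        # Case-insensitive substring check on the href itself
        if keyword.lower() in href.lower():
            matches.append(href)
    return matches

youtube_links = links_with_keyword(SAMPLE_HTML, "youtube")
print(youtube_links)
```

The external link is skipped because its href does not start with "/", even though it contains the keyword.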

爱你不解释 2025-02-08 23:03:55

You are cooking a soup, so first of all taste it and check whether everything you expect is actually in it.

The ResultSet of your selection is empty because the structure of the response differs slightly from the one you see in the developer tools.

To get the list of links, use a more specific selector:

links = [a.get('href') for a in soup.select('a.icon')]  

Output:

['https://apps.apple.com/kr/app/youtube/id544007664', 'https://apps.apple.com/kr/app/%EC%BF%A0%ED%8C%A1%ED%94%8C%EB%A0%88%EC%9D%B4/id1536885649', 'https://apps.apple.com/kr/app/youtube-music/id1017492454', 'https://apps.apple.com/kr/app/instagram/id389801252', 'https://apps.apple.com/kr/app/youtube-kids/id936971630', 'https://apps.apple.com/kr/app/youtube-studio/id888530356', 'https://apps.apple.com/kr/app/google-chrome/id535886823', 'https://apps.apple.com/kr/app/tiktok-%ED%8B%B1%ED%86%A1/id1235601864', 'https://apps.apple.com/kr/app/google/id284815942']
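
The two steps in this answer, checking what the response really contains and then selecting with a selector that matches it, can be tried on a small inline snippet. This is only a sketch: the HTML below is a made-up stand-in for response.text, and the helper name extract_app_links is an assumption. Only the a.icon selector and the two app URLs are taken from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text; the real page markup may differ.
SAMPLE_HTML = """
<div class="results">
  <a class="icon" href="https://apps.apple.com/kr/app/youtube/id544007664">YouTube</a>
  <a class="icon" href="https://apps.apple.com/kr/app/youtube-music/id1017492454">YouTube Music</a>
  <a class="other" href="/kr/support">Support</a>
  <a class="icon">No href here</a>
</div>
"""

def extract_app_links(html):
    """Return the href of every <a class="icon"> element that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # "Taste the soup": confirm the class from the developer tools is present
    # in the HTML that was actually returned before relying on it.
    if not soup.select(".rf-serp-productname-list"):
        print("'.rf-serp-productname-list' is not in this HTML")
    # a.icon[href] skips anchors that carry the class but no href attribute.
    return [a["href"] for a in soup.select("a.icon[href]")]

links = extract_app_links(SAMPLE_HTML)
print(links)
```

With the real page, response.text from the question would be passed in place of SAMPLE_HTML.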