Printing the entire URL in Python
I am trying to pull all the urls that contain "https://play.google.com/store/" and print the entire string. When I run my current code, it only prints "https://play.google.com/store/" but I am looking for the entire URL. Can someone point me in the right direction? Here is my code:
import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re
URL = "https://www.pocketgamer.com/android/best-tycoon-games-android/?page=3"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
links = []
for link in soup.findAll("a", target="_blank"):
    links.append(link.get('href'))
x = re.findall("https://play.google.com/store/", str(links))
print(x)
Comments (1)
re.findall just returns the part of the text that matches the regex, so all you are getting is the https://play.google.com/store/ that is in the regex. You could modify the regex, but given that what you are searching is a list of links, it's easier to just check whether each one starts with https://play.google.com/store/.
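The startswith check described above could be sketched roughly as follows. This is a minimal illustration and not necessarily the code from the original answer: the helper names filter_store_links and find_store_links_regex, the PREFIX constant, and the sample links are all hypothetical. It assumes links is a list built with link.get('href'), which can contain None for anchors that have no href. The second helper shows one possible way to extend the regex so that it matches the rest of the URL too.

```python
import re

# Assumed prefix, taken from the question.
PREFIX = "https://play.google.com/store/"


def filter_store_links(links, prefix=PREFIX):
    """Return the full URLs in `links` that start with `prefix`."""
    # link.get('href') yields None when an <a> tag has no href,
    # so skip falsy entries before calling startswith.
    return [link for link in links if link and link.startswith(prefix)]


def find_store_links_regex(text, prefix=PREFIX):
    """Regex alternative: match the prefix plus everything up to the
    next whitespace, quote, or angle bracket."""
    # re.escape makes the dots in the prefix match literally.
    pattern = re.escape(prefix) + r"[^\s'\"<>]*"
    return re.findall(pattern, text)


# Hypothetical sample data standing in for the scraped hrefs.
sample_links = [
    "https://play.google.com/store/apps/details?id=com.example.tycoon",
    "https://www.pocketgamer.com/android/",
    None,
    "https://play.google.com/store/apps/details?id=com.example.idle",
]

print(filter_store_links(sample_links))
print(find_store_links_regex(str(sample_links)))
```

With the question's code, the same filter would be applied to the links list after the loop, in place of the re.findall call. The list comprehension avoids converting the list to a string first, which is why it tends to be the simpler choice here.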