How to get a download link that requires ticking a checkbox in an additional dialog box

Posted 2025-02-03 02:56:29 · 2092 characters · 2 views · 0 comments

I want to download the latest publicly available file from https://sam.gov/data-services/Exclusions/Public%20V2?privacy=Public

When I download it manually, the real download link looks like:

https://falextracts.s3.amazonaws.com/Exclusions/Public%20V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220530T143743Z&X-Amz-SignedHeaders=host&X-Amz-Expires=2699&X-Amz-Credential=AKIAY3LPYEEXWOQWHCIY%2F20220530%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=3eca59f75a4e1f6aa59fc810da8f391f1ebfd8ca5a804d56b79c3eb9c4d82e32
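The `X-Amz-*` query parameters suggest this is a time-limited, signed S3 link, which would explain why the same object cannot be fetched without them. As a minimal sketch (the helper name `describe_signed_url` is hypothetical, and the interpretation of `X-Amz-Date` plus `X-Amz-Expires` as the validity window is an assumption based on how AWS Signature Version 4 query parameters are normally read), the link above can be decoded with the standard library:

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs

# The signed link quoted above, split across lines for readability.
signed_url = (
    "https://falextracts.s3.amazonaws.com/Exclusions/Public%20V2/"
    "SAM_Exclusions_Public_Extract_V2_22150.ZIP"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20220530T143743Z"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Expires=2699"
    "&X-Amz-Credential=AKIAY3LPYEEXWOQWHCIY%2F20220530%2Fus-east-1%2Fs3%2Faws4_request"
    "&X-Amz-Signature=3eca59f75a4e1f6aa59fc810da8f391f1ebfd8ca5a804d56b79c3eb9c4d82e32"
)


def describe_signed_url(url):
    """Return (signed_at, expires_at) as UTC datetimes for a signed S3 link.

    Hypothetical helper: assumes the link carries X-Amz-Date and
    X-Amz-Expires (lifetime in seconds) in its query string.
    """
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    signed_at = datetime.strptime(
        params["X-Amz-Date"], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    expires_at = signed_at + timedelta(seconds=int(params["X-Amz-Expires"]))
    return signed_at, expires_at


signed_at, expires_at = describe_signed_url(signed_url)
print("signed at: ", signed_at.isoformat())
print("expires at:", expires_at.isoformat())
```

Under that reading, the link was valid for about 45 minutes, so it has to be requested fresh each time rather than stored.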

My function only gets the initial link, which points to the real one:

import json
import requests
from operator import itemgetter


files_url = 'https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public'

def get_file():
    response = requests.get(files_url, stream=True)
    links_resp = json.loads(response.text)
    links_dicts = [d for d in links_resp['_embedded']['customS3ObjectSummaryList'] if d['displayKey'].count('SAM_Exclus')]
    sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
    return sorted_links[0]['_links']['self']['href']

get_file()

Result:

'https://s3.amazonaws.com/falextracts/Exclusions/Public V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP'

But when I follow the above link, I get "Access Denied".

So I would appreciate any hints on how to get the real download link.

Comments (1)

通知家属抬走 2025-02-10 02:56:29

I've edited your code and tried to keep it easy to follow. The requests library can parse the JSON response itself, so there is no need for json.loads.

Imports that are not at the top of the file also make the code harder to read...

import requests as req
from operator import itemgetter

files_url = "https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public"
down_url = "https://sam.gov/api/prod/fileextractservices/v1/api/download/Exclusions/Public%20V2/{}?privacy=Public"

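def get_file():
    # requests parses the JSON body itself, so json.loads is not needed.
    response = req.get(files_url).json()

    # Sort the listed files by modification date, newest first.
    links_dicts = response["_embedded"]["customS3ObjectSummaryList"]
    sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)

    # displayKey holds the file name, which the download endpoint takes in its path.
    key = sorted_links[0]['displayKey']

    down = req.get(down_url.format(key))

    if down.status_code != 200:
        return False

    print(key)
    # Use a context manager so the file is closed after writing.
    with open(key, 'wb') as f:
        f.write(down.content)

    return True

get_file()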
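
The function above reads the whole archive into memory through `down.content` before writing it. For a large ZIP it may be preferable to copy the body to disk in chunks. The following is only a sketch under assumptions: it swaps requests for the standard library's `urllib.request`, the helper names `build_download_url` and `stream_to_file` are made up for illustration, and it has not been checked against sam.gov itself. The endpoint pattern is the `down_url` from the code above.

```python
import shutil
import urllib.request
from urllib.parse import quote

# Endpoint pattern copied from down_url in the answer above.
DOWN_URL = (
    "https://sam.gov/api/prod/fileextractservices/v1/api/download/"
    "Exclusions/Public%20V2/{}?privacy=Public"
)


def build_download_url(display_key):
    """Insert the file name into the endpoint, percent-encoding it first."""
    return DOWN_URL.format(quote(display_key))


def stream_to_file(url, path, chunk_size=1024 * 1024, timeout=60):
    """Copy the response body to `path` chunk by chunk.

    urlopen raises urllib.error.HTTPError for error responses, so a
    failed download does not leave a silently truncated result.
    Returns the number of bytes written.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out, chunk_size)
        return out.tell()
```

With the `key` obtained in `get_file()`, the last step would then become `stream_to_file(build_download_url(key), key)`.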