How to get a download link that requires checking checkboxes in an additional dialog box

Posted on 2025-02-03 02:56:29


I want to download the most recent publicly available file from https://sam.gov/data-services/Exclusions/Public%20V2?privacy=Public

When I download the file manually, the real download link looks like this:

https://falextracts.s3.amazonaws.com/Exclusions/Public%20V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220530T143743Z&X-Amz-SignedHeaders=host&X-Amz-Expires=2699&X-Amz-Credential=AKIAY3LPYEEXWOQWHCIY%2F20220530%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=3eca59f75a4e1f6aa59fc810da8f391f1ebfd8ca5a804d56b79c3eb9c4d82e32

My function only gets the initial link, which points to the real one:

import json
import requests
from operator import itemgetter


files_url = 'https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public'

def get_file():
    response = requests.get(files_url, stream=True)
    links_resp = json.loads(response.text)
    links_dicts = [d for d in links_resp['_embedded']['customS3ObjectSummaryList'] if d['displayKey'].count('SAM_Exclus')]
    sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
    return sorted_links[0]['_links']['self']['href']

get_file()

Result:

'https://s3.amazonaws.com/falextracts/Exclusions/Public V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP'

But when I follow that link, I get Access Denied.

I would appreciate any hints on how to get the real download link.
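
For context, the two links in the question point at the same bucket and object, but only the manual one carries the X-Amz-* query parameters. Under standard S3 presigned-URL behaviour (an assumption here, since the bucket's policy is not shown), those parameters grant temporary access and the bare link is rejected. The sketch below only inspects the two URLs from the question; presign_info is a helper name made up for this illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presign_info(url):
    """Return (is_signed, expires_at); expires_at is None for an unsigned URL."""
    query = parse_qs(urlsplit(url).query)
    if "X-Amz-Signature" not in query:
        return False, None
    # X-Amz-Date is the signing time in UTC, X-Amz-Expires the lifetime in seconds.
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return True, signed_at + lifetime


# Both URLs are copied from the question.
manual_url = (
    "https://falextracts.s3.amazonaws.com/Exclusions/Public%20V2/"
    "SAM_Exclusions_Public_Extract_V2_22150.ZIP"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220530T143743Z"
    "&X-Amz-SignedHeaders=host&X-Amz-Expires=2699"
    "&X-Amz-Credential=AKIAY3LPYEEXWOQWHCIY%2F20220530%2Fus-east-1%2Fs3%2Faws4_request"
    "&X-Amz-Signature=3eca59f75a4e1f6aa59fc810da8f391f1ebfd8ca5a804d56b79c3eb9c4d82e32"
)
listing_url = (
    "https://s3.amazonaws.com/falextracts/Exclusions/Public V2/"
    "SAM_Exclusions_Public_Extract_V2_22150.ZIP"
)

signed, expires_at = presign_info(manual_url)
print(signed, expires_at.isoformat())  # True 2022-05-30T15:22:42+00:00
print(presign_info(listing_url))       # (False, None)
```

The signed link was valid for 2699 seconds (just under 45 minutes), so copying it into a script would not work for long either; a fresh link has to be requested each time.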


通知家属抬走 2025-02-10 02:56:29

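I've edited your code while keeping it as close to the original as I could, so it should be easy to follow. The main change is that the file is requested from the sam.gov download endpoint (down_url in the code below) instead of the S3 link returned by the listing. The requests library can parse the JSON response itself, so the json module is not needed.

Imports that are not at the top of the file also make the code harder to read...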

import requests as req
from operator import itemgetter

# Listing endpoint: metadata for the files in the Exclusions/Public V2 domain.
files_url = "https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public"
# Download endpoint: {} is filled with the file's displayKey.
down_url = "https://sam.gov/api/prod/fileextractservices/v1/api/download/Exclusions/Public%20V2/{}?privacy=Public"

def get_file():
    # .json() parses the response body, so json.loads is not needed.
    response = req.get(files_url).json()

    links_dicts = response["_embedded"]["customS3ObjectSummaryList"]
    # Newest file first.
    sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)

    key = sorted_links[0]['displayKey']

    # Ask the sam.gov download endpoint for the file instead of the S3 link.
    down = req.get(down_url.format(key))

    if down.status_code != 200:
        return False

    print(key)
    # Save the archive under its displayKey in the current directory.
    with open(key, 'wb') as f:
        f.write(down.content)

    return True

get_file()
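
The selection step can also be separated from the network calls so it is checkable offline. The following is a minimal sketch, not part of the original answer: latest_key and download_url are hypothetical helper names, the "SAM_Exclus" filter is taken from the question's code, and the sample listing (including its dateModified values and the README.txt entry) is invented purely for illustration, since the real response format is not shown here.

```python
from operator import itemgetter
from urllib.parse import quote

# Same download endpoint as in the answer above.
DOWNLOAD_URL = (
    "https://sam.gov/api/prod/fileextractservices/v1/api/download/"
    "Exclusions/Public%20V2/{}?privacy=Public"
)


def latest_key(listing, marker="SAM_Exclus"):
    """Return the displayKey of the newest entry containing marker, or None."""
    entries = [
        entry
        for entry in listing["_embedded"]["customS3ObjectSummaryList"]
        if marker in entry["displayKey"]
    ]
    if not entries:
        return None
    return max(entries, key=itemgetter("dateModified"))["displayKey"]


def download_url(key):
    """Build the download URL for a displayKey, percent-encoding it as a precaution."""
    return DOWNLOAD_URL.format(quote(key))


# Invented sample data: only the field names come from the code in this thread.
sample_listing = {
    "_embedded": {
        "customS3ObjectSummaryList": [
            {"displayKey": "SAM_Exclusions_Public_Extract_V2_22149.ZIP", "dateModified": "2022-05-29"},
            {"displayKey": "SAM_Exclusions_Public_Extract_V2_22150.ZIP", "dateModified": "2022-05-30"},
            {"displayKey": "README.txt", "dateModified": "2022-05-31"},
        ]
    }
}

key = latest_key(sample_listing)
print(key)
print(download_url(key))
```

With a real listing, the result of download_url(key) would be passed to req.get exactly as in get_file above. Keeping the filter avoids picking up an unrelated file that happens to be newer than the latest extract.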
