Using an AWS profile with fs.S3FileSystem

Posted 2025-02-09 20:55:17

I am trying to use a specific AWS profile with Apache PyArrow. The documentation for pyarrow.fs.S3FileSystem [https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html] shows no option to pass a profile name when instantiating the filesystem.

I tried to get around this by creating a session with boto3 and using that:

import boto3
from pyarrow import fs

# include mfa profile
session = boto3.session.Session(profile_name="custom_profile")

# create filesystem with session
bucket = fs.S3FileSystem(session_name=session)

bucket.get_file_info(fs.FileSelector('bucket_name', recursive=True))

but this too fails:

OSError: When listing objects under key '' in bucket 'bucket_name': AWS Error [code 15]: Access Denied

Is it possible to use fs with a custom AWS profile?

~/.aws/credentials:

[default]
aws_access_key_id = <access_key>
aws_secret_access_key = <secret_key>

[custom_profile]
aws_access_key_id = <access_key>
aws_secret_access_key = <secret_key>
aws_session_token = <token>

Additional context: all user actions require MFA. The custom AWS profile in the credentials file stores the token generated after MFA-based authentication on the CLI, and I need to use that profile in the script.

Comments (4)

热血少△年 2025-02-16 20:55:17

I think it is better this way:

import boto3
from pyarrow import fs

session = boto3.session.Session(profile_name="custom_profile")
credentials = session.get_credentials()

s3_files = fs.S3FileSystem(
    secret_key=credentials.secret_key,
    access_key=credentials.access_key,
    region=session.region_name,
    session_token=credentials.token)

孤君无依 2025-02-16 20:55:17

You should use environment variables. For example,

import os

from pyarrow import fs

os.environ["AWS_PROFILE"] = "custom_profile"

s3fs = fs.S3FileSystem()
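
A brief usage sketch, assuming the 'custom_profile' and 'bucket_name' placeholders from the question and that the profile resolves to credentials with the required permissions: the resulting filesystem can then perform the listing that failed in the original attempt.

import os

from pyarrow import fs

# The profile has to be set before the filesystem is constructed.
os.environ["AWS_PROFILE"] = "custom_profile"

s3 = fs.S3FileSystem()

# List everything under the bucket, as attempted in the question.
for info in s3.get_file_info(fs.FileSelector("bucket_name", recursive=True)):
    print(info.path, info.type)
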
朱染 2025-02-16 20:55:17

If you are using pyarrow.fs.S3FileSystem (which is different from the S3FileSystem in the s3fs package), then the only way to use a named SAML or SSO profile is the workaround suggested by Rafael.

Note that the path does not start with 's3://':

import boto3
import pyarrow.parquet as pq
from pyarrow.fs import S3FileSystem

session = boto3.session.Session(profile_name="custom_profile")
credentials = session.get_credentials()

fs = S3FileSystem(
    secret_key=credentials.secret_key,
    access_key=credentials.access_key,
    region='your-region-3',
    session_token=credentials.token)

df = pq.read_table("bucket_name/path1/path2/", filesystem=fs)
print(df)

七禾 2025-02-16 20:55:17

One can specify a token, but must also specify the access key and secret key:

from pyarrow import fs

s3 = fs.S3FileSystem(access_key="",
                     secret_key="",
                     session_token="")

One would also have to implement some method to parse the ~/.aws/credentials file to get these values, or do it manually each time; a sketch of that parsing step follows.
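
A minimal sketch of that parsing step, assuming the credentials file follows the standard INI layout shown in the question ('custom_profile' and 'bucket_name' are the question's placeholders):

import configparser
import os

from pyarrow import fs

# Read the named profile from ~/.aws/credentials (standard INI format).
config = configparser.ConfigParser()
config.read(os.path.expanduser("~/.aws/credentials"))
profile = config["custom_profile"]

s3 = fs.S3FileSystem(
    access_key=profile["aws_access_key_id"],
    secret_key=profile["aws_secret_access_key"],
    session_token=profile.get("aws_session_token"))

print(s3.get_file_info(fs.FileSelector("bucket_name", recursive=True)))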
