Connecting Airflow and MinIO S3

Posted 2025-02-06 23:48:31


I am using Docker Compose with Bitnami's Airflow image as well as MinIO.
I can get Airflow to talk to AWS S3, but when I try to substitute MinIO I get this error:

File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/botocore/client.py", line 719, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Here's the .env:

OBJECT_STORE=s3://xxxx:xxxxx@S3?host%3Dhttp%3A%2F%2Fminio1%3A9001

Here's the environment connection in compose:

AIRFLOW_CONN_AWS_S3=${OBJECT_STORE}

Here's the Airflow test DAG:

# Imports assumed for Airflow 2.x with the Amazon provider installed
from datetime import timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'airflow',
    'retries': 1,
    'retry_delay': timedelta(seconds=5),
    'provide_context': True
}

dag = DAG(
    dag_id='s3_test',
    tags=['ti'],
    default_args=default_args,
    start_date=days_ago(2),
    schedule_interval='0 * * * *',
    catchup=False
)

def func_test():
    s3 = S3Hook('aws_s3')
    obj = s3.get_key("file.csv", "mybucket")
    contents = obj.get()['Body'].read().decode('utf-8')
    print('contents', contents)

t1 = PythonOperator(
    task_id='test',
    python_callable=func_test,
    dag=dag
)

t1

I know the file exists in the bucket and the path is correct. I also gave the MinIO user account full admin rights. Not sure what is causing the 403.


Comments (1)

伊面 2025-02-13 23:48:31


Connection type S3 is now officially removed from Airflow; use aws instead. See: https://github.com/apache/airflow/pull/25980

A working example can be found here:
Airflow and MinIO connection with AWS

NOTE: if test_connection is failing, it doesn't necessarily mean that the connection won't work!

The solution (all credit to Taragolis & hanleybrand):
Create a new connection, call it for example minio_s3, of type Amazon AWS, with only the Extra field set to:

{
  "aws_access_key_id": "your MinIO username",
  "aws_secret_access_key": "your MinIO password",
  "endpoint_url": "http://localhost:9000",
  "region_name": "us-east-1"
}
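
Since the question wires its connection up through an AIRFLOW_CONN_* environment variable in docker-compose, here is a rough sketch (not part of the original answer) of how the same minio_s3 connection could be built with Airflow's Connection model and exported that way. The conn id, the placeholder credentials, and the localhost endpoint simply mirror the Extra JSON above; the credentials can equally well go in the connection's login/password fields, which the AWS hook also accepts.

# Sketch only: build the equivalent "aws"-type connection and print a URI that could
# be assigned to AIRFLOW_CONN_MINIO_S3 in docker-compose.
# Assumes Airflow 2.x with the Amazon provider installed; all values are placeholders.
import json

from airflow.models.connection import Connection

minio_conn = Connection(
    conn_id="minio_s3",
    conn_type="aws",
    login="your MinIO username",      # picked up as aws_access_key_id
    password="your MinIO password",   # picked up as aws_secret_access_key
    extra=json.dumps({
        "endpoint_url": "http://localhost:9000",
        "region_name": "us-east-1",
    }),
)
print(minio_conn.get_uri())  # use this value for AIRFLOW_CONN_MINIO_S3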

Please note: if you're running Airflow in a KinD cluster and MinIO on the same host in Docker, you need to use host.docker.internal instead of localhost.
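
To check the connection end to end (rather than relying only on test_connection, as noted above), a minimal sketch is to point the question's S3Hook at the new conn id; "minio_s3", "mybucket" and "file.csv" below are just the names used elsewhere in this thread.

# Sketch only: exercise the new connection directly. Assumes the minio_s3 connection
# from above exists and the bucket/key from the question are present in MinIO.
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

hook = S3Hook(aws_conn_id="minio_s3")
print(hook.list_keys(bucket_name="mybucket"))            # should include file.csv
obj = hook.get_key("file.csv", bucket_name="mybucket")   # same call as the test DAG
print(obj.get()["Body"].read().decode("utf-8"))

In the test DAG from the question, the only change needed is then S3Hook('minio_s3') instead of S3Hook('aws_s3'), or whatever name the connection was created under.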
