Connecting Airflow and MinIO S3
I am using Docker Compose with Bitnami's Airflow image as well as MinIO.
I can get Airflow to talk to AWS S3, but when I try to substitute MinIO I get this error:
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/botocore/client.py", line 719, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
Here's the .env:
OBJECT_STORE=s3://xxxx:xxxxx@S3?host%3Dhttp%3A%2F%2Fminio1%3A9001
Here's the environment connection in compose:
AIRFLOW_CONN_AWS_S3=${OBJECT_STORE}
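A quick way to see what Airflow actually parses out of that URI is to rebuild the connection from it and inspect the result. This is only a debugging sketch, not part of the original setup; it just needs airflow installed:

from airflow.models.connection import Connection

# Rebuild the connection from the same URI and inspect how Airflow splits it up
conn = Connection(conn_id="aws_s3", uri="s3://xxxx:xxxxx@S3?host%3Dhttp%3A%2F%2Fminio1%3A9001")
print(conn.conn_type, conn.host, conn.extra_dejson)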
Here's the Airflow test DAG:
# imports (not shown in the post); the usual Airflow 2.x paths are assumed,
# with S3Hook coming from the apache-airflow-providers-amazon package
from datetime import timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'airflow',
    'retries': 1,
    'retry_delay': timedelta(seconds=5),
    'provide_context': True
}

dag = DAG(
    dag_id='s3_test',
    tags=['ti'],
    default_args=default_args,
    start_date=days_ago(2),
    schedule_interval='0 * * * *',
    catchup=False
)

def func_test():
    # Fetch file.csv from mybucket via the aws_s3 connection and print it
    s3 = S3Hook('aws_s3')
    obj = s3.get_key("file.csv", "mybucket")
    contents = obj.get()['Body'].read().decode('utf-8')
    print('contents', contents)

t1 = PythonOperator(
    task_id='test',
    python_callable=func_test,
    dag=dag
)

t1
I know the file exists in the bucket and the path is correct. I also gave the MinIO user account full admin rights. Not sure what is causing the 403.
Answer:
Connection type S3 has now been officially removed from Airflow; use aws instead. See: https://github.com/apache/airflow/pull/25980
A working example can be found here:
Airflow and MinIO connection with AWS
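Translated into the question's .env/compose style, an aws-type connection for MinIO would look roughly like this. This is only a sketch: the access key, secret key, endpoint and port are placeholders, and the endpoint_url extra assumes a reasonably recent Amazon provider:

OBJECT_STORE=aws://ACCESS_KEY:SECRET_KEY@/?endpoint_url=http%3A%2F%2Fminio1%3A9000
AIRFLOW_CONN_MINIO_S3=${OBJECT_STORE}

Only the value of endpoint_url (and the secret key, if it contains / or +) is percent-encoded; the = and & separating the query parameters must stay literal, otherwise Airflow will not pick the extras up from the URI.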
NOTE: if test_connection is failing, it doesn't necessarily mean that the connection won't work!
The solution (all credits to Taragolis & hanleybrand)
Create a new connection, call it for example minio_s3, give it the type Amazon AWS, and set only the extra field.
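For MinIO the extra field is normally a small JSON blob that points the hook at the MinIO endpoint, along these lines (the host and port here are assumptions; 9000 is MinIO's default S3 API port):

{"endpoint_url": "http://host.docker.internal:9000"}

Older versions of the Amazon provider used the key host instead of endpoint_url.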
Please note: if you're running Airflow in a KinD cluster and MinIO in Docker on the same host, you need to use host.docker.internal instead of localhost.
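For completeness, a minimal sketch of how the test task from the question would pick up the new connection, assuming the connection id minio_s3 created above and the same bucket and key names:

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

def func_test():
    # Point the hook at the aws-type connection instead of the removed s3 type
    s3 = S3Hook(aws_conn_id='minio_s3')
    obj = s3.get_key("file.csv", "mybucket")
    contents = obj.get()['Body'].read().decode('utf-8')
    print('contents', contents)

The same connection can equally be supplied as an AIRFLOW_CONN_MINIO_S3 environment variable, as in the .env sketch above, rather than created through the UI.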