Airflow on Docker fails to start: TypeError: __init__() got an unexpected keyword argument 'encoding'

Posted on 2025-01-14 12:32:57


I want to extend Airflow on Docker with the HDFS provider:
https://airflow.apache.org/docs/docker-stack/build.html#examples-of-image-extending

The Dockerfile looks like:

FROM apache/airflow:2.2.4
ARG DEV_APT_DEPS="\
     curl \
     gnupg2 \
     apt-transport-https \
     apt-utils \
     build-essential \
     ca-certificates \
     gnupg \
     dirmngr \
     freetds-bin \
     freetds-dev \
     gosu \
     krb5-user \
     ldap-utils \
     libffi-dev \
     libkrb5-dev \
     libldap2-dev \
     libpq-dev \
     libsasl2-2 \
     libsasl2-dev \
     libsasl2-modules \
     libssl-dev \
     locales  \
     lsb-release \
     nodejs \
     openssh-client \
     postgresql-client \
     python-selinux \
     sasl2-bin \
     software-properties-common \
     sqlite3 \
     sudo \
     unixodbc \
     unixodbc-dev \
     yarn "
     
USER root
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak \
  && echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster main contrib non-free' >> /etc/apt/sources.list \
  && echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster-updates main contrib non-free' >> /etc/apt/sources.list \
  && echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster-backports main contrib non-free' >> /etc/apt/sources.list \
  && echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian-security buster/updates main contrib non-free' >> /etc/apt/sources.list \
  && apt-get update \
  && apt-get install -y --no-install-recommends \
    ${DEV_APT_DEPS} \
  && apt-get autoremove -yqq --purge \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

  
USER airflow
COPY --chown=airflow:root constraints-3.7.txt /opt/airflow/
COPY --chown=airflow:root ifxjdbc.jar /opt/airflow/jdbc-drivers/
RUN pip install --timeout=3600 --no-cache-dir --user \
  --constraint /opt/airflow/constraints-3.7.txt \
  --index-url https://pypi.tuna.tsinghua.edu.cn/simple \
  --trusted-host pypi.tuna.tsinghua.edu.cn \
  apache-airflow-providers-apache-hive \
  apache-airflow-providers-apache-hdfs  # this line will cause an error later

It builds successfully.

But when I run: docker-compose up airflow-init

I get this error:

airflow-init_1       | ....................
airflow-init_1       | ERROR! Maximum number of retries (20) reached.
airflow-init_1       | 
airflow-init_1       | Last check result:
airflow-init_1       | $ airflow db check
airflow-init_1       | Traceback (most recent call last):
airflow-init_1       |   File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1       |     from airflow.__main__ import main
airflow-init_1       |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 28, in <module>
airflow-init_1       |     from airflow.cli import cli_parser
airflow-init_1       |   File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 621, in <module>
airflow-init_1       |     type=argparse.FileType('w', encoding='UTF-8'),
airflow-init_1       | TypeError: __init__() got an unexpected keyword argument 'encoding'
airflow-init_1       | 
airflow-224_airflow-init_1 exited with code 1

If I remove 'apache-airflow-providers-apache-hdfs' from the Dockerfile and rebuild,

it initializes fine...

HELP~ I really need the HDFS provider


Comments (2)

燃情 2025-01-21 12:32:57


After a bit more looking, I think I know where your problem is: you are likely not using the "entrypoint" to run Airflow. You should always use the original entrypoint (https://airflow.apache.org/docs/docker-stack/entrypoint.html); you are likely overriding it with your own entrypoint, which does not initialize the Airflow environment properly.
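As a sketch of what "use the original entrypoint" means in docker-compose terms (the service name and image tag here are assumptions, not taken from the question):

```yaml
# docker-compose.yaml fragment (sketch). The key point: do NOT set a
# custom `entrypoint:` -- the official image's /entrypoint script sets
# up the airflow user's environment before exec'ing the command.
services:
  airflow-init:
    image: my-extended-airflow:2.2.4   # assumed tag for the image built above
    command: version                   # runs as: /entrypoint version
```

If you do need extra startup steps, put them in the image or in `command`, not in a replacement `entrypoint`.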

情场扛把子 2025-01-21 12:32:57


This gave me some trouble. The issue is this: argparse is now part of the Python core and must not be installed as a separate package.
The package apache-airflow-providers-apache-hdfs requires snakebite-py3, and snakebite-py3 installs argparse.
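You can reproduce the failure in isolation: on Python 3 the standard-library argparse.FileType accepts an encoding keyword, while the obsolete PyPI "argparse" backport (written for Python 2) does not. If that pip-installed package shadows the stdlib module, Airflow's cli_parser hits exactly the TypeError in the traceback. A quick check, independent of Airflow:

```python
import argparse

# The stdlib argparse (Python >= 3.4) accepts `encoding` here; the
# obsolete PyPI backport raises TypeError -- the crash in the traceback.
file_type = argparse.FileType("w", encoding="UTF-8")

# If this path points into site-packages, a pip-installed argparse is
# shadowing the standard library.
print(argparse.__file__)
```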

To solve the problem, just add RUN pip uninstall -y argparse to the end of your Dockerfile.
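Applied to the Dockerfile from the question, that is one extra layer after the provider install (a sketch; everything before it stays unchanged):

```dockerfile
# ...rest of the Dockerfile from the question above...

# snakebite-py3 (pulled in by the hdfs provider) installs the obsolete
# PyPI "argparse" backport, which shadows the stdlib module and breaks
# Airflow's CLI. Remove it so Python falls back to the standard library.
RUN pip uninstall -y argparse
```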
