Airflow on Docker fails to start: TypeError: __init__() got an unexpected keyword argument 'encoding'
I want to extend the Airflow Docker image with the HDFS provider:
https://airflow.apache.org/docs/docker-stack/build.html#examples-of-image-extending
The Dockerfile looks like this:
FROM apache/airflow:2.2.4
ARG DEV_APT_DEPS="\
curl \
gnupg2 \
apt-transport-https \
apt-utils \
build-essential \
ca-certificates \
gnupg \
dirmngr \
freetds-bin \
freetds-dev \
gosu \
krb5-user \
ldap-utils \
libffi-dev \
libkrb5-dev \
libldap2-dev \
libpq-dev \
libsasl2-2 \
libsasl2-dev \
libsasl2-modules \
libssl-dev \
locales \
lsb-release \
nodejs \
openssh-client \
postgresql-client \
python-selinux \
sasl2-bin \
software-properties-common \
sqlite3 \
sudo \
unixodbc \
unixodbc-dev \
yarn "
USER root
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak \
&& echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster main contrib non-free' >> /etc/apt/sources.list \
&& echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster-updates main contrib non-free' >> /etc/apt/sources.list \
&& echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian/ buster-backports main contrib non-free' >> /etc/apt/sources.list \
&& echo 'deb https://mirrors.tuna.tsinghua.edu.cn/debian-security buster/updates main contrib non-free' >> /etc/apt/sources.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
${DEV_APT_DEPS} \
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
USER airflow
COPY --chown=airflow:root constraints-3.7.txt /opt/airflow/
COPY --chown=airflow:root ifxjdbc.jar /opt/airflow/jdbc-drivers/
RUN pip install --timeout=3600 --no-cache-dir --user \
--constraint /opt/airflow/constraints-3.7.txt \
--index-url https://pypi.tuna.tsinghua.edu.cn/simple \
--trusted-host pypi.tuna.tsinghua.edu.cn \
apache-airflow-providers-apache-hive \
apache-airflow-providers-apache-hdfs # this line will cause an error later
The build succeeds, but when I initialize it with `docker-compose up airflow-init`, I get this error:
airflow-init_1 | ....................
airflow-init_1 | ERROR! Maximum number of retries (20) reached.
airflow-init_1 |
airflow-init_1 | Last check result:
airflow-init_1 | $ airflow db check
airflow-init_1 | Traceback (most recent call last):
airflow-init_1 | File "/home/airflow/.local/bin/airflow", line 5, in <module>
airflow-init_1 | from airflow.__main__ import main
airflow-init_1 | File "/home/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 28, in <module>
airflow-init_1 | from airflow.cli import cli_parser
airflow-init_1 | File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 621, in <module>
airflow-init_1 | type=argparse.FileType('w', encoding='UTF-8'),
airflow-init_1 | TypeError: __init__() got an unexpected keyword argument 'encoding'
airflow-init_1 |
airflow-224_airflow-init_1 exited with code 1
If I remove 'apache-airflow-providers-apache-hdfs' from the Dockerfile and rebuild, it initializes fine...
HELP~ I really need the HDFS provider.
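One way to narrow down a TypeError like this is to check whether something is shadowing the standard-library argparse module — on a stock Python 3 interpreter the same call succeeds. This is a hypothetical diagnostic, not from the original post, and the image tag is illustrative:

```shell
# On any stock Python 3, FileType accepts the "encoding" keyword, so a
# TypeError here means a third-party package is shadowing stdlib argparse:
python3 -c "import argparse; argparse.FileType('w', encoding='UTF-8'); print('stdlib argparse OK')"

# Inside the built image, list pip packages to spot a stray "argparse"
# distribution pulled in by a dependency (image tag is illustrative):
# docker run --rm my-airflow:2.2.4 pip list 2>/dev/null | grep -i argparse
```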
Comments (2)
After a bit more looking, I think I know where your problem is - you are likely not using the "entrypoint" to run Airflow. You should always use the original entrypoint (https://airflow.apache.org/docs/docker-stack/entrypoint.html), and you are likely overriding it with your own entrypoint, which does not initialize the Airflow environment properly.
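In docker-compose terms, that means letting the image's own entrypoint run, i.e. not setting `entrypoint:` on the service. A minimal sketch — the service and image names here are my own, not from the original post:

```yaml
# Minimal sketch: run the custom image without overriding the entrypoint.
# No "entrypoint:" line - the stock /entrypoint script from the base image
# runs first and sets up the Airflow environment before the command.
services:
  airflow-webserver:
    image: my-airflow:2.2.4   # the custom image built above (name illustrative)
    command: webserver
```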
This gave me some trouble. The issue is this: argparse is now part of the Python core and must not be installed separately.
The package apache-airflow-providers-apache-hdfs requires snakebite-py3, and snakebite-py3 installs argparse.
To solve the problem, just add `RUN pip uninstall -y argparse` to the end of your Dockerfile.
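In Dockerfile terms, the fix goes after the pip install step — a minimal sketch, assuming the Dockerfile from the question:

```dockerfile
# snakebite-py3 (a dependency of apache-airflow-providers-apache-hdfs)
# pulls in the PyPI "argparse" backport, which shadows the Python 3
# stdlib module; uninstalling it lets the stdlib version load again.
RUN pip uninstall -y argparse
```

After rebuilding, `airflow db check` should no longer hit the TypeError.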