Dockerised Hive cannot find org.apache.hadoop.fs.s3a.S3AFileSystem even though I added the hadoop-aws jar
I am building a proof of concept with docker-compose that runs HDFS, Hue, and Hive, connected to my AWS S3 buckets.
As of right now I have it running, and I can browse my S3 buckets using the Hue file browser.
When I try to create a Hive external table on my Parquet file, I get the following error:
org.apache.hadoop.fs.s3a.S3AFileSystem not found
I understand this is caused by a missing hadoop-aws jar.
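One way to rule out a damaged or mislabeled download is to confirm that the jar really contains the class named in the error. The following is a minimal sketch, not part of the original setup; the helper name `jar_contains_class` is hypothetical, and it assumes a local copy of the jar is available.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) holds the given fully qualified class."""
    # A class a.b.C is stored inside the jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling `jar_contains_class("hadoop-aws-2.7.4.jar", "org.apache.hadoop.fs.s3a.S3AFileSystem")` on a local copy would show whether the class is in the jar at all. It says nothing about whether the Hive services actually load that jar.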
As a quick win, I uploaded hadoop-aws-2.7.4.jar to HDFS in the container and ran my create table query after
add jar hdfs:/hadoop-aws-2.7.4.jar;
I don't get any errors, so it seems to find the jar properly (if I change the name, I get a "file not found" error). Yet the create external table statement still fails with the same error.
What am I doing wrong? I thought it could be related to an incompatible version, but the jar version seems to match the Hadoop version in my docker-compose images.
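The version comparison described above can be made explicit. This sketch assumes that the `hadoopX.Y.Z` fragment in the bde2020 image tags names the Hadoop release and that the jar follows the `hadoop-aws-X.Y.Z.jar` naming pattern. All three function names are hypothetical.

```python
import re
from typing import Optional


def hadoop_version_from_image(image: str) -> Optional[str]:
    """Extract X.Y.Z from an image tag such as '2.0.0-hadoop2.7.4-java8'."""
    tag = image.split(":", 1)[-1]  # keep only the part after the repository name
    match = re.search(r"hadoop(\d+\.\d+\.\d+)", tag)
    return match.group(1) if match else None


def hadoop_aws_version_from_jar(jar_name: str) -> Optional[str]:
    """Extract X.Y.Z from a file name such as 'hadoop-aws-2.7.4.jar'."""
    match = re.fullmatch(r"hadoop-aws-(\d+\.\d+\.\d+)\.jar", jar_name)
    return match.group(1) if match else None


def versions_match(image: str, jar_name: str) -> bool:
    """True only when both versions can be parsed and are identical."""
    image_version = hadoop_version_from_image(image)
    return image_version is not None and image_version == hadoop_aws_version_from_jar(jar_name)
```

For the namenode image `bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8` and `hadoop-aws-2.7.4.jar`, `versions_match` returns `True`. The `bde2020/hive:2.3.2-postgresql-metastore` tag carries no Hadoop version, so the sketch returns `None` for it and the Hadoop release bundled in that image would have to be checked inside the container.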
Here is the docker-compose file used:
version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
    volumes:
      - namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop-hive.env
    ports:
      - "50070:50070"
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
    volumes:
      - datanode:/hadoop/dfs/data
    env_file:
      - ./hadoop-hive.env
    environment:
      SERVICE_PRECONDITION: "namenode:50070"
    ports:
      - "50075:50075"
  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop2.7.4-java8
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075"
    env_file:
      - ./hadoop-hive.env
  hive-server:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - ./hadoop-hive.env
    environment:
      HIVE_CORE_CONF_javax_jdo_option_ConnectionURL: "jdbc:postgresql://hive-metastore/metastore"
      SERVICE_PRECONDITION: "hive-metastore:9083"
    ports:
      - "10000:10000"
  hive-metastore:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - ./hadoop-hive.env
    command: /opt/hive/bin/hive --service metastore
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432 resourcemanager:8088"
    ports:
      - "9083:9083"
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql:2.3.0
    ports:
      - "5432:5432"
  huedb:
    image: postgres:12.1-alpine
    volumes:
      - pg_data:/var/lib/postgresl/data/
    ports:
      - "5432"
    env_file:
      - ./hadoop-hive.env
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432 resourcemanager:8088 hive-metastore:9083"
  hue:
    image: gethue/hue:4.6.0
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432 resourcemanager:8088 hive-metastore:9083 huedb:5000"
    ports:
      - "8888:8888"
    volumes:
      - ./hue-overrides.ini:/usr/share/hue/desktop/conf/hue-overrides.ini
    links:
      - huedb

volumes:
  namenode:
  datanode:
  pg_data:
Would love any help!
Thanks,