JAVA_HOME error in Hadoop for spark-shell
I needed to install Hadoop in order to get Spark running on my WSL2 Ubuntu for school projects. I installed Hadoop 3.3.1 and Spark 3.2.1 following these two tutorials:
Hadoop Tutorial on Kontext.tech
Spark Tutorial on Kontext.tech
I correctly set up the environment variables in my .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
export PATH=$PATH:$JAVA_HOME
export HADOOP_HOME=~/hadoop/hadoop-3.3.1
export SPARK_HOME=~/hadoop/spark-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:/usr/local/hadoop/bin/
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$SPARK_HOME/bin:$PATH
# Configure Spark to use Hadoop classpath
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
As well as in ~/hadoop/spark-3.2.1/conf/spark-env.sh.template:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/
However, when I launch spark-shell, I get this error:
/home/adrien/hadoop/spark-3.2.1/bin/spark-class: line 71: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin//bin/java: No such file or directory
/home/adrien/hadoop/spark-3.2.1/bin/spark-class: line 96: CMD: bad array subscript
There seems to be a mix-up in how the $PATH variable gets redefined, but I can't figure out where. Can you help me solve it, please? I don't know Hadoop, and although I know Spark well, I have never had to install either of them.
1 Answer
First, certain Spark packages come with Hadoop, so you don't need to download them separately. More specifically, Spark is built against Hadoop 3.2 for now, so using the latest Hadoop version might cause its own problems.
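For example, a Spark build that already bundles the matching Hadoop 3.2 client libraries can be fetched instead of stitching separate downloads together; a rough sketch follows (the archive URL is assumed from the standard Apache layout, so verify it before relying on it):

# download a Spark 3.2.1 release prebuilt against Hadoop 3.2 and unpack it under ~/hadoop
wget https://archive.apache.org/dist/spark/spark-3.2.1/spark-3.2.1-bin-hadoop3.2.tgz
tar -xzf spark-3.2.1-bin-hadoop3.2.tgz -C ~/hadoop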
For your problem, JAVA_HOME should not end in /bin or /bin/java. Check the linked post again... If you used apt install for Java, you shouldn't really need to set JAVA_HOME or the PATH for Java either, as the package manager will do this for you. Or you can use https://sdkman.io. Note: Java 11 is preferred.
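For context, spark-class resolves the JVM as ${JAVA_HOME}/bin/java, which is why the error message shows the doubled .../jre/bin//bin/java path. A minimal sketch of the corrected Java lines in .bashrc, assuming the default Ubuntu OpenJDK 8 location that appears in the question (your directory may differ):

# JAVA_HOME points at the JDK root, not at .../bin or .../bin/java
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin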
You also need to remove the .template suffix from any config files for them to actually be used... However, JAVA_HOME is automatically detected by spark-submit, so it's completely optional in spark-env.sh. The same applies to hadoop-env.sh.
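A quick sketch of that step, assuming the Spark conf directory from the paths in the question:

cd ~/hadoop/spark-3.2.1/conf
# copy the template so Spark actually reads the file; JAVA_HOME can be left out of it entirely
cp spark-env.sh.template spark-env.sh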
Also remove /usr/local/hadoop/bin/ from your PATH, since it doesn't appear you've put anything in that location.
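Once the stray PATH entry and JAVA_HOME are cleaned up, a quick sanity check could look like this (standard commands only, nothing specific to this setup):

source ~/.bashrc     # reload the updated environment
echo $JAVA_HOME      # should be empty or point at the JDK root, never ending in /bin
hadoop version       # confirms the Hadoop binaries are on the PATH
spark-shell          # should now start without the spark-class error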