Intellij 中的 Spark Scala 错误:无法在 Hadoop 二进制文件中找到可执行文件 null\bin\winutils.exe

发布于 2025-01-10 10:41:26 字数 11055 浏览 0 评论 0原文

我想运行以下代码,将 CSV 加载到 IntelliJ 中的 Spark 数据框中。

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object ReadCSVFile {

  case class Employee(empno:String, ename:String, designation:String, manager:String, 
hire_date:String, sal:String , deptno:String)

  def main(args : Array[String]): Unit = {
    var conf = new SparkConf().setAppName("Read CSV File").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val textRDD = sc.textFile("src\\main\\resources\\emp_data.csv")
    //println(textRDD.foreach(println)

    val empRdd = textRDD.map {
      line =>
        val col = line.split(",")
        Employee(col(0), col(1), col(2), col(3), col(4), col(5), col(6))
    }
    val empDF = empRdd.toDF()
    empDF.show()

  }
}

这也是 Intellij 的屏幕,以便您了解我的项目结构: 输入图片这里的描述

最后一件事是我的 build.sbt 文件:

ThisBuild / version := "0.1.0-SNAPSHOT"

ThisBuild / scalaVersion := "2.12.15"

libraryDependencies++=Seq(
  "org.apache.spark"%"spark-core_2.12"%"2.4.5",
  "org.apache.spark"%"spark-sql_2.12"%"2.4.5"
)

我使用 Oracle OpenJDK 版本 17.0.2 作为 SDK。

当我执行代码时,出现以下错误:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/02/28 13:49:17 INFO SparkContext: Running Spark version 2.4.5
22/02/28 13:49:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your 
platform... using builtin-java classes where applicable
22/02/28 13:49:17 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116)
at org.apache.hadoop.security.Groups.<init>(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
at org.apache.spark.util.Utils$.$anonfun$getCurrentUserName$1(Utils.scala:2422)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2422)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:293)
at ReadCSVFile$.main(ReadCSVFile.scala:10)
at ReadCSVFile.main(ReadCSVFile.scala)
22/02/28 13:49:17 INFO SparkContext: Submitted application: Read CSV File
22/02/28 13:49:18 INFO SecurityManager: Changing view acls to: adamuser
22/02/28 13:49:18 INFO SecurityManager: Changing modify acls to: adamuser
22/02/28 13:49:18 INFO SecurityManager: Changing view acls groups to: 
22/02/28 13:49:18 INFO SecurityManager: Changing modify acls groups to: 
22/02/28 13:49:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(adamuser); groups with view permissions: Set(); users  with modify permissions: Set(adamuser); groups with modify permissions: Set()
22/02/28 13:49:18 INFO Utils: Successfully started service 'sparkDriver' on port 52614.
22/02/28 13:49:18 INFO SparkEnv: Registering MapOutputTracker
22/02/28 13:49:18 INFO SparkEnv: Registering BlockManagerMaster
22/02/28 13:49:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/02/28 13:49:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/02/28 13:49:18 INFO DiskBlockManager: Created local directory at 
C:\Users\adamuser\AppData\Local\Temp\blockmgr-c2b9fbdc-cd1a-4a49-8e60-10d3a0f21f3a
22/02/28 13:49:18 INFO MemoryStore: MemoryStore started with capacity 1032.0 MB
22/02/28 13:49:18 INFO SparkEnv: Registering OutputCommitCoordinator
22/02/28 13:49:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/02/28 13:49:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://SH-P101.s-h.local:4040
22/02/28 13:49:19 INFO Executor: Starting executor ID driver on host localhost
22/02/28 13:49:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52615.
22/02/28 13:49:19 INFO NettyBlockTransferService: Server created on SH-P101.s-h.local:52615
22/02/28 13:49:19 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/02/28 13:49:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManagerMasterEndpoint: Registering block manager SH-P101.s-h.local:52615 with 1032.0 MB RAM, BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:20 WARN BlockManager: Putting block broadcast_0 failed due to exception java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @15c43bd9.
22/02/28 13:49:20 WARN BlockManager: Block broadcast_0 could not be removed as it was not found on disk or in memory
Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @15c43bd9
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:178)
at java.base/java.lang.reflect.Field.setAccessible(Field.java:172)
at org.apache.spark.util.SizeEstimator$.$anonfun$getClassInfo$2(SizeEstimator.scala:336)
at org.apache.spark.util.SizeEstimator$.$anonfun$getClassInfo$2$adapted(SizeEstimator.scala:330)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at org.apache.spark.util.SizeEstimator$.getClassInfo(SizeEstimator.scala:330)
at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:222)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:201)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:69)
at org.apache.spark.util.collection.SizeTracker.takeSample(SizeTracker.scala:78)
at org.apache.spark.util.collection.SizeTracker.afterUpdate(SizeTracker.scala:70)
at org.apache.spark.util.collection.SizeTracker.afterUpdate$(SizeTracker.scala:67)
at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
at org.apache.spark.storage.memory.DeserializedValuesHolder.storeValue(MemoryStore.scala:665)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:222)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:914)
at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1481)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:123)
at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1489)
at org.apache.spark.SparkContext.$anonfun$hadoopFile$1(SparkContext.scala:1035)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:1027)
at org.apache.spark.SparkContext.$anonfun$textFile$1(SparkContext.scala:831)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
at org.apache.spark.SparkContext.textFile(SparkContext.scala:828)
at ReadCSVFile$.main(ReadCSVFile.scala:14)
at ReadCSVFile.main(ReadCSVFile.scala)
22/02/28 13:49:20 INFO SparkContext: Invoking stop() from shutdown hook
22/02/28 13:49:20 INFO SparkUI: Stopped Spark web UI at http://SH-P101.s-h.local:4040
22/02/28 13:49:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/02/28 13:49:20 INFO MemoryStore: MemoryStore cleared
22/02/28 13:49:20 INFO BlockManager: BlockManager stopped
22/02/28 13:49:20 INFO BlockManagerMaster: BlockManagerMaster stopped
22/02/28 13:49:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/02/28 13:49:20 INFO SparkContext: Successfully stopped SparkContext
22/02/28 13:49:20 INFO ShutdownHookManager: Shutdown hook called
22/02/28 13:49:20 INFO ShutdownHookManager: Deleting directory 
C:\Users\adamuser\AppData\Local\Temp\spark-67d11686-a6b8-4744-ab04-2ac1623db9dd

Process finished with exit code 1

我尝试解决第一个错误:“java.io.IOException:无法在 Hadoop 二进制文件中找到可执行文件 null\bin\winutils.exe。”

因此,我下载了 winutils.exe 文件,将其放入 C:\hadoop\bin 并将 HADOOP_HOME 环境设置为 C:\hadoop\bin >,但它没有解决任何问题。我还尝试在我的代码中添加:

System.setProperty("hadoop.home.dir", "c/hadoop/bin")

...,但它也不起作用。

我没有找到有关第二个错误的任何信息,但我想找不到我的资源文件(也许路径定义不正确?)。

有人可以建议什么是错的以及需要改变什么吗?

I would like to run the below code that loads a CSV into a Spark dataframe in IntelliJ.

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object ReadCSVFile {

  case class Employee(empno:String, ename:String, designation:String, manager:String, 
hire_date:String, sal:String , deptno:String)

  def main(args : Array[String]): Unit = {
    var conf = new SparkConf().setAppName("Read CSV File").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val textRDD = sc.textFile("src\\main\\resources\\emp_data.csv")
    //println(textRDD.foreach(println)

    val empRdd = textRDD.map {
      line =>
        val col = line.split(",")
        Employee(col(0), col(1), col(2), col(3), col(4), col(5), col(6))
    }
    val empDF = empRdd.toDF()
    empDF.show()

  }
}

Here is also a screen of Intellij so you get how my project structure is :
enter image description here

And the last thing is my build.sbt file :

ThisBuild / version := "0.1.0-SNAPSHOT"

ThisBuild / scalaVersion := "2.12.15"

libraryDependencies++=Seq(
  "org.apache.spark"%"spark-core_2.12"%"2.4.5",
  "org.apache.spark"%"spark-sql_2.12"%"2.4.5"
)

I am using Oracle OpenJDK version 17.0.2 as SDK.

When I execute my code, I get the following errors :

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/02/28 13:49:17 INFO SparkContext: Running Spark version 2.4.5
22/02/28 13:49:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your 
platform... using builtin-java classes where applicable
22/02/28 13:49:17 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116)
at org.apache.hadoop.security.Groups.<init>(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
at org.apache.spark.util.Utils$.$anonfun$getCurrentUserName$1(Utils.scala:2422)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2422)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:293)
at ReadCSVFile$.main(ReadCSVFile.scala:10)
at ReadCSVFile.main(ReadCSVFile.scala)
22/02/28 13:49:17 INFO SparkContext: Submitted application: Read CSV File
22/02/28 13:49:18 INFO SecurityManager: Changing view acls to: adamuser
22/02/28 13:49:18 INFO SecurityManager: Changing modify acls to: adamuser
22/02/28 13:49:18 INFO SecurityManager: Changing view acls groups to: 
22/02/28 13:49:18 INFO SecurityManager: Changing modify acls groups to: 
22/02/28 13:49:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(adamuser); groups with view permissions: Set(); users  with modify permissions: Set(adamuser); groups with modify permissions: Set()
22/02/28 13:49:18 INFO Utils: Successfully started service 'sparkDriver' on port 52614.
22/02/28 13:49:18 INFO SparkEnv: Registering MapOutputTracker
22/02/28 13:49:18 INFO SparkEnv: Registering BlockManagerMaster
22/02/28 13:49:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/02/28 13:49:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/02/28 13:49:18 INFO DiskBlockManager: Created local directory at 
C:\Users\adamuser\AppData\Local\Temp\blockmgr-c2b9fbdc-cd1a-4a49-8e60-10d3a0f21f3a
22/02/28 13:49:18 INFO MemoryStore: MemoryStore started with capacity 1032.0 MB
22/02/28 13:49:18 INFO SparkEnv: Registering OutputCommitCoordinator
22/02/28 13:49:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/02/28 13:49:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://SH-P101.s-h.local:4040
22/02/28 13:49:19 INFO Executor: Starting executor ID driver on host localhost
22/02/28 13:49:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52615.
22/02/28 13:49:19 INFO NettyBlockTransferService: Server created on SH-P101.s-h.local:52615
22/02/28 13:49:19 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/02/28 13:49:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManagerMasterEndpoint: Registering block manager SH-P101.s-h.local:52615 with 1032.0 MB RAM, BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:19 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, SH-P101.s-h.local, 52615, None)
22/02/28 13:49:20 WARN BlockManager: Putting block broadcast_0 failed due to exception java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @15c43bd9.
22/02/28 13:49:20 WARN BlockManager: Block broadcast_0 could not be removed as it was not found on disk or in memory
Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @15c43bd9
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:178)
at java.base/java.lang.reflect.Field.setAccessible(Field.java:172)
at org.apache.spark.util.SizeEstimator$.$anonfun$getClassInfo$2(SizeEstimator.scala:336)
at org.apache.spark.util.SizeEstimator$.$anonfun$getClassInfo$2$adapted(SizeEstimator.scala:330)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at org.apache.spark.util.SizeEstimator$.getClassInfo(SizeEstimator.scala:330)
at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:222)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:201)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:69)
at org.apache.spark.util.collection.SizeTracker.takeSample(SizeTracker.scala:78)
at org.apache.spark.util.collection.SizeTracker.afterUpdate(SizeTracker.scala:70)
at org.apache.spark.util.collection.SizeTracker.afterUpdate$(SizeTracker.scala:67)
at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
at org.apache.spark.storage.memory.DeserializedValuesHolder.storeValue(MemoryStore.scala:665)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:222)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:914)
at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1481)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:123)
at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1489)
at org.apache.spark.SparkContext.$anonfun$hadoopFile$1(SparkContext.scala:1035)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:1027)
at org.apache.spark.SparkContext.$anonfun$textFile$1(SparkContext.scala:831)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
at org.apache.spark.SparkContext.textFile(SparkContext.scala:828)
at ReadCSVFile$.main(ReadCSVFile.scala:14)
at ReadCSVFile.main(ReadCSVFile.scala)
22/02/28 13:49:20 INFO SparkContext: Invoking stop() from shutdown hook
22/02/28 13:49:20 INFO SparkUI: Stopped Spark web UI at http://SH-P101.s-h.local:4040
22/02/28 13:49:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/02/28 13:49:20 INFO MemoryStore: MemoryStore cleared
22/02/28 13:49:20 INFO BlockManager: BlockManager stopped
22/02/28 13:49:20 INFO BlockManagerMaster: BlockManagerMaster stopped
22/02/28 13:49:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/02/28 13:49:20 INFO SparkContext: Successfully stopped SparkContext
22/02/28 13:49:20 INFO ShutdownHookManager: Shutdown hook called
22/02/28 13:49:20 INFO ShutdownHookManager: Deleting directory 
C:\Users\adamuser\AppData\Local\Temp\spark-67d11686-a6b8-4744-ab04-2ac1623db9dd

Process finished with exit code 1

I tried to resolve the first one : "java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries."

So I downloaded the winutils.exe file, put it in C:\hadoop\bin and set the HADOOP_HOME environment to C:\hadoop\bin, but it didn't resolve anything. I also tried to add :

System.setProperty("hadoop.home.dir", "c/hadoop/bin")

...in my code, but it didn't work either.

I didn't find anything about the second error, but I suppose my resources file can't be found (maybe the Path isn't defined properly?).

Can someone suggest what is wrong and what needs to be changed?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

伤感在游骋 2025-01-17 10:41:26

评论中这两个问题的答案:

第一个问题:将 HADOOP_HOME 环境变量设置为 C:\hadoop(不带“bin”)并附加 C:\hadoop\bin code> 到 PATH 环境变量。

第二个:使用 JDK 8,因为 Spark 不支持 JDK 17。

Answers to both issues in comment:

1st one : Set HADOOP_HOME environment variable to C:\hadoop without "bin" and append C:\hadoop\bin to PATH environment variable.

2nd one : Use JDK 8 since Spark doesn't support JDK 17.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文