How do I prevent executors from disassociating prematurely when running Apache Spark in "local-cluster" mode?


I have a Spark application that should be tested in both local mode & local-cluster mode, using scalatest.

The local-cluster mode is submitted using this method:

How to scala-test a Spark program under "local-cluster" mode?
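For context, a minimal sketch of what such a scalatest suite might look like; the class name, master string, and test body below are illustrative placeholders, not the actual code from the question:

import org.apache.spark.sql.SparkSession
import org.scalatest.BeforeAndAfterAll
import org.scalatest.funsuite.AnyFunSuite

// Illustrative suite: a standalone-like cluster with 2 workers,
// 1 core and 1024 MB of memory per worker.
class LocalClusterSuite extends AnyFunSuite with BeforeAndAfterAll {

  @transient private var spark: SparkSession = _

  override def beforeAll(): Unit = {
    spark = SparkSession.builder()
      .appName("Test")
      .master("local-cluster[2,1,1024]")
      .getOrCreate()
  }

  override def afterAll(): Unit = {
    // Stop the SparkContext explicitly before the JVM begins shutting down;
    // otherwise the standalone master may still be replacing "lost" executors
    // while shutdown hooks are already running.
    if (spark != null) spark.stop()
  }

  test("a simple job runs on the local cluster") {
    assert(spark.sparkContext.parallelize(1 to 100).sum() == 5050)
  }
}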

The test runs successfully, but when the test terminates I get the following errors in the log:


22/05/16 17:45:25 ERROR TaskSchedulerImpl: Lost executor 0 on 172.16.224.18: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/2 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
    at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
    at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
    at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/3 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
    at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
    at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
    at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/4 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
    at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
    at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
    at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/5 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
    at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
    at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
    at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dis

...

It turns out executor 0 was dropped before the SparkContext was stopped. This triggered an aggressive self-healing reaction from the Spark master, which repeatedly tried to launch new executors to compensate for the loss. How do I prevent this from happening?

Comments (1)

那请放手 2025-02-05 15:25:24


Spark attempts to recover from failed tasks by running them again. To avoid this, you can set the following properties to 1:

  • spark.task.maxFailures (default is 4)
  • spark.stage.maxConsecutiveAttempts (default is 4)

These properties can be set in $SPARK_HOME/conf/spark-defaults.conf or given as options to spark-submit:

spark-submit --conf spark.task.maxFailures=1 --conf spark.stage.maxConsecutiveAttempts=1

or in the Spark context/session configuration before starting the session.
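For example, a minimal sketch of setting them on the session builder before the session is created (the app name and master string are placeholders):

import org.apache.spark.sql.SparkSession

// Sketch: disable task and stage retries so that a lost executor fails the
// test immediately instead of triggering repeated relaunch attempts.
val spark = SparkSession.builder()
  .appName("Test")
  .master("local-cluster[2,1,1024]")  // or "local[*]" for local mode
  .config("spark.task.maxFailures", "1")
  .config("spark.stage.maxConsecutiveAttempts", "1")
  .getOrCreate()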

EDIT:

It looks like your executors are lost due to insufficient memory. You could try to increase:

  • spark.executor.memory
  • spark.executor.memoryOverhead
  • spark.memory.offHeap.size (together with spark.memory.offHeap.enabled=true)

(see Spark configuration)

The maximum memory size of the container running an executor is determined by the sum of spark.executor.memoryOverhead, spark.executor.memory, spark.memory.offHeap.size and spark.executor.pyspark.memory.
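As a sketch, these could also be set on the session builder before the tests start; the sizes below are arbitrary examples, not recommendations, and in local-cluster mode the per-worker memory in the master string local-cluster[workers,cores,memoryMB] also has to be large enough to accommodate the executor:

import org.apache.spark.sql.SparkSession

// Sketch with arbitrary example sizes; tune them to the workload.
val spark = SparkSession.builder()
  .appName("Test")
  .config("spark.executor.memory", "2g")
  .config("spark.executor.memoryOverhead", "512m")  // honored mainly on YARN/Kubernetes
  .config("spark.memory.offHeap.enabled", "true")
  .config("spark.memory.offHeap.size", "512m")
  .getOrCreate()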
