Dataproc: Spark job fails on a Dataproc Spark cluster but runs fine locally

Posted on 2025-02-10 20:15:54

I have a JAR file built from a Maven project that works fine when I run it locally via java -jar JARFILENAME.jar. However, when I try to run the same JAR file on Dataproc, I get the following error:

22/06/27 13:13:45 INFO org.apache.spark.SparkEnv: Registering BlockManagerMaster
22/06/27 13:13:46 INFO org.apache.spark.SparkEnv: Registering BlockManagerMasterHeartbeat
22/06/27 13:13:46 INFO org.apache.spark.SparkEnv: Registering OutputCommitCoordinator
22/06/27 13:13:49 INFO org.sparkproject.jetty.util.log: Logging initialized @7373ms to org.sparkproject.jetty.util.log.Slf4jLog
22/06/27 13:13:51 INFO com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl: Ignoring exception of type GoogleJsonResponseException; verified object already exists with desired state.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile$PercentileDigest.getPercentiles([D)Lscala/collection/Seq;
    at com.amazon.deequ.analyzers.ApproxQuantile.fromAggregationResult(ApproxQuantile.scala:84)
    at com.amazon.deequ.analyzers.ScanShareableAnalyzer.metricFromAggregationResult(Analyzer.scala:192)
    at com.amazon.deequ.analyzers.ScanShareableAnalyzer.metricFromAggregationResult$(Analyzer.scala:185)
    at com.amazon.deequ.analyzers.ApproxQuantile.metricFromAggregationResult(ApproxQuantile.scala:50)
    at com.amazon.deequ.analyzers.runners.AnalysisRunner$.successOrFailureMetricFrom(AnalysisRunner.scala:362)
    at com.amazon.deequ.analyzers.runners.AnalysisRunner$.$anonfun$runScanningAnalyzers$5(AnalysisRunner.scala:330)
    at scala.collection.immutable.List.map(List.scala:297)
    at com.amazon.deequ.analyzers.runners.AnalysisRunner$.liftedTree1$1(AnalysisRunner.scala:328)
    at com.amazon.deequ.analyzers.runners.AnalysisRunner$.runScanningAnalyzers(AnalysisRunner.scala:318)
    at com.amazon.deequ.analyzers.runners.AnalysisRunner$.doAnalysisRun(AnalysisRunner.scala:167)
    at com.amazon.deequ.VerificationSuite.doVerificationRun(VerificationSuite.scala:121)
    at com.amazon.deequ.VerificationRunBuilder.run(VerificationRunBuilder.scala:173)
    at com.amazon.deequ.thesis.GCTestOne$.$anonfun$main$1(GCTestOne.scala:42)
    at com.amazon.deequ.thesis.GCTestOne$.$anonfun$main$1$adapted(GCTestOne.scala:11)
    at com.amazon.deequ.examples.ExampleUtils$.withSpark(ExampleUtils.scala:32)
    at com.amazon.deequ.thesis.GCTestOne$.main(GCTestOne.scala:11)
    at com.amazon.deequ.thesis.GCTestOne.main(GCTestOne.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I don't quite get why Dataproc throws a NoSuchMethodError when everything runs fine locally.

Does anyone know why this happens?

Comments (1)

忆离笙 2025-02-17 20:15:54

It was a version mismatch with GCP. I had Spark 3.2.1 locally, but the Dataproc clusters run on Spark 3.1.
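One way to confirm this kind of mismatch is to check what the runtime classpath actually provides. The descriptor in the error, `getPercentiles([D)Lscala/collection/Seq;`, means the caller was compiled against a method that takes a `double[]` and returns a `scala.collection.Seq`. The sketch below is only an illustration: the class name `MethodProbe` and its helpers are hypothetical and were not part of the original project. It uses plain Java reflection to report the return type of that method and the JAR the class was loaded from. It would need to be packaged into the job JAR and run both locally and on the cluster to compare the results.

```java
import java.security.CodeSource;

public class MethodProbe {

    /**
     * Returns the return type name of a public method, or "missing" if the
     * class or a method with these parameter types cannot be found.
     */
    static String returnTypeOf(String className, String methodName, Class<?>... paramTypes) {
        try {
            // initialize=false: only inspect the class, do not run its static initializers
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            return cls.getMethod(methodName, paramTypes).getReturnType().getName();
        } catch (ReflectiveOperationException | LinkageError e) {
            return "missing";
        }
    }

    /**
     * Returns the location (JAR or directory) a class was loaded from,
     * "no code source" for classes without one (typically JDK classes),
     * or "missing" if the class is not on the classpath.
     */
    static String loadedFrom(String className) {
        try {
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "no code source" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Class and method taken from the NoSuchMethodError in the question.
        String cls = "org.apache.spark.sql.catalyst.expressions.aggregate."
                + "ApproximatePercentile$PercentileDigest";
        System.out.println("getPercentiles(double[]) returns: "
                + returnTypeOf(cls, "getPercentiles", double[].class));
        System.out.println("class loaded from: " + loadedFrom(cls));
    }
}
```

If the reported return type on the cluster is anything other than `scala.collection.Seq`, the Spark JARs on the cluster differ from the ones the code was compiled against. In that case, the usual options are to build against the Spark version of the Dataproc image or to pick an image whose Spark version matches the build.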
