Spark spark-sql-kafka - java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArraySerializer
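I am trying to read from Kafka in Spark via the spark-sql-kafka integration.
Spark version: 3.2.1
Scala version: 2.12.15
Following its guide for spark-shell, including the dependency, I started my shell with: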
spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1
However, as soon as I run something like the following in the shell:
val df = spark.readStream.format("kafka").option("kafka.bootstrap.servers","http://HOST:PORT").option("subscribe", "my-topic").load()
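I get the following exception:
java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArraySerializer
Any ideas on how to overcome this problem?
My assumption was that using --packages would also load all transitive dependencies, but that does not seem to be the case. From the logs, I assume the packages were loaded successfully, including the kafka-clients dependency: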
org.apache.spark#spark-sql-kafka-0-10_2.12 added as a dependency
resolving dependencies :: org.apache.spark#spark-submit-parent-3b04f646-471c-4cc8-88fb-7e32bc3226ed;1.0
confs: [default]
found org.apache.spark#spark-sql-kafka-0-10_2.12;3.2.1 in central
found org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.2.1 in central
found org.apache.kafka#kafka-clients;2.8.0 in central
found org.lz4#lz4-java;1.7.1 in central
found org.xerial.snappy#snappy-java;1.1.8.4 in central
found org.slf4j#slf4j-api;1.7.30 in central
found org.apache.hadoop#hadoop-client-runtime;3.3.1 in central
found org.spark-project.spark#unused;1.0.0 in central
found org.apache.hadoop#hadoop-client-api;3.3.1 in central
found org.apache.htrace#htrace-core4;4.1.0-incubating in central
found commons-logging#commons-logging;1.1.3 in central
found com.google.code.findbugs#jsr305;3.0.0 in central
found org.apache.commons#commons-pool2;2.6.2 in central
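One way to double-check the assumption that kafka-clients really was downloaded intact is to look at the retrieved jar itself. The following is only a sketch: it assumes the default Ivy location used by --packages (~/.ivy2/jars, which changes if spark.jars.ivy is set) and that unzip is installed.

```shell
# Assumed default cache directory for --packages downloads;
# adjust this if spark.jars.ivy points somewhere else.
IVY_JARS="${HOME}/.ivy2/jars"

# List any kafka-clients jars that were retrieved.
ls -l "${IVY_JARS}"/*kafka-clients*.jar

# Check that the class named in the exception is present in each jar.
for jar in "${IVY_JARS}"/*kafka-clients*.jar; do
  echo "Checking ${jar}"
  unzip -l "${jar}" | grep 'org/apache/kafka/common/serialization/ByteArraySerializer.class'
done
```

If the grep prints nothing, or unzip reports a damaged archive, the cached jar may be incomplete, which would be consistent with the log reporting success while the class is still missing at runtime.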
Comments (1)
The logs seem fine, but you can try including the kafka-clients dependency in the --packages argument as well. Otherwise, I'd suggest creating an uber jar instead of downloading the libraries every time you submit the app.
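As a sketch of the first suggestion, the shell could be started with both coordinates listed explicitly, separated by a comma. The kafka-clients version 2.8.0 here is taken from the resolution log in the question; whether pinning it this way fixes the error is not confirmed.

```shell
# Coordinates from the question; kafka-clients 2.8.0 is the version
# that the resolution log reports as a transitive dependency.
spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1,org.apache.kafka:kafka-clients:2.8.0
```

The same comma-separated list should also work with spark-submit, since both commands accept --packages.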