Spark-Cassandra (java.lang.NoClassDefFoundError: org/apache/spark/sql/cassandra/package)
I am trying to read a DataFrame from Cassandra 4.0.3 with Spark 3.2.1, using Scala 2.12.15 and sbt 1.6.2, but I have a problem.
This is my sbt file:
name := "StreamHandler"
version := "1.6.2"
scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided",
  "org.apache.cassandra" % "cassandra-all" % "4.0.3" % "test",
  "org.apache.spark" %% "spark-streaming" % "3.2.1" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "3.2.0",
  "com.datastax.cassandra" % "cassandra-driver-core" % "4.0.0"
)

libraryDependencies += "com.datastax.dse" % "dse-java-driver-core" % "2.1.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "3.2.1" % "provided"
libraryDependencies += "org.apache.commons" % "commons-math3" % "3.6.1" % "provided"
And this is my Scala file:
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming._
import org.apache.spark.sql.types._
import org.apache.spark.sql.cassandra._
import com.datastax.oss.driver.api.core.uuid.Uuids
import com.datastax.spark.connector._

object StreamHandler {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("Stream Handler")
      .config("spark.cassandra.connection.host", "localhost")
      .getOrCreate()

    import spark.implicits._

    // Read the train_temperature table from the project keyspace
    val Temp_DF = spark
      .read
      .cassandraFormat("train_temperature", "project")
      .load()

    Temp_DF.show(10)
  }
}
And this is the result:

java.lang.NoClassDefFoundError: org/apache/spark/sql/cassandra/package
1 Comment
Usually the problem is that when you do sbt package, it builds a jar with only your code and without its dependencies. To mitigate this you have two approaches. The first is to specify the Cassandra connector when running spark-submit, with --packages com.datastax.spark:spark-cassandra-connector_2.12:3.2.0, as described in the connector documentation.
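For example, a minimal sketch of the first approach. The jar path below assumes the default layout that sbt package produces for this build (sbt derives the artifact name from the lowercased project name, so adjust it to whatever your build actually emits), and local[*] is just a placeholder master:

# hypothetical paths/master; --class and --packages are the relevant parts
spark-submit \
  --class StreamHandler \
  --master local[*] \
  --packages com.datastax.spark:spark-cassandra-connector_2.12:3.2.0 \
  target/scala-2.12/streamhandler_2.12-1.6.2.jar

With --packages, spark-submit resolves the connector and its transitive dependencies from Maven Central at launch and puts them on both the driver and executor classpaths, which is exactly what the plain sbt package jar is missing.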
The second is to create a fat jar (with all necessary dependencies) using the sbt-assembly plugin, though you need to take care not to include the Spark classes themselves in the fat jar.
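A minimal sketch of the second approach, assuming the sbt-assembly plugin (the version below is an assumption; any recent 1.x release should behave the same). First enable the plugin in project/plugins.sbt:

// project/plugins.sbt -- plugin version is an assumption, use any recent release
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")

Then add a merge strategy to build.sbt, because the connector and driver jars ship conflicting META-INF entries, and the Java driver's reference.conf files must be concatenated rather than discarded:

// build.sbt -- one common way to resolve duplicate files in the fat jar
assembly / assemblyMergeStrategy := {
  case "reference.conf"              => MergeStrategy.concat  // driver config, must be merged
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard // drop signatures/manifests
  case _                             => MergeStrategy.first
}

Because the Spark artifacts above are already marked "provided", sbt assembly leaves them out of the fat jar automatically. The resulting jar (by default something like target/scala-2.12/StreamHandler-assembly-1.6.2.jar, unless you override assembly / assemblyJarName) can then be passed to spark-submit without --packages.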