How do I write Parquet with Java? I've used df.write().parquet but get the following error
package com.evampsaanga.imran.testing;

import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadingCsv {
    public static void main(String[] args) {
        // Run Spark locally in a single JVM.
        SparkConf conf = new SparkConf()
                .setAppName("Text File Data Load")
                .setMaster("local")
                .set("spark.driver.host", "localhost")
                .set("spark.testing.memory", "2147480000");
        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();

        // Read the CSV file, treating the first row as a header and inferring column types.
        Dataset<Row> df = spark.read()
                .format("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat")
                .option("sep", ",")
                .option("inferSchema", true)
                .option("header", true)
                .load("E:/CarsData.csv");

        // Write the data as Parquet; this is the call that fails.
        df.write().parquet("test.parquet");
    }
}
I have used df.write().parquet(...), but I am getting this error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please specify the fully qualified class name.;
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:702)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:728)
at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:948)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:285)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:269)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:829)
at com.evampsaanga.imran.testing.ReadingCsv.main(ReadingCsv.java:23)
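The message says that the short name "parquet" matched two registered classes. Spark resolves short names through java.util.ServiceLoader registrations of org.apache.spark.sql.sources.DataSourceRegister, so one possible cause (an assumption, not confirmed by the post) is that jars from more than one Spark version are on the classpath. The sketch below uses only the JDK to list every jar that contributes such a registration; the class name DataSourceRegistrations and the method find are made up for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DataSourceRegistrations {
    // Service file that ServiceLoader consults for DataSourceRegister implementations.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Returns the location of every copy of the service file visible to the given class loader.
    static List<URL> find(ClassLoader loader) throws IOException {
        return new ArrayList<>(Collections.list(loader.getResources(SERVICE_FILE)));
    }

    public static void main(String[] args) throws IOException {
        // Each printed URL names a jar (or directory) that registers data sources.
        for (URL url : find(Thread.currentThread().getContextClassLoader())) {
            System.out.println(url);
        }
    }
}
```

Run it with the same classpath as ReadingCsv. If the output shows Spark SQL jars with different version numbers, aligning all Spark dependencies to a single version is worth trying before anything else.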