WARN CSVHeaderChecker: CSV header does not conform to the schema. - But the header is correct
I'm trying to stream CSV files with Spark.
I was inspired by https://dzone.com/articles/spark-structured-streaming-using-java.
However, I get the following warning:
22/03/07 13:51:52 WARN CSVHeaderChecker: CSV header does not conform to the schema.
Header:
Schema: department
Expected: department but found:
CSV file: file:///C:..../data/stream/employee/drop_data/02_employee.csv
Here is my code:
StructType schema = new StructType().add("empId", DataTypes.StringType).add("empName", DataTypes.StringType)
.add("department", DataTypes.StringType);
//build the streaming data reader from the file source, specifying csv file format
Dataset<Row> rawData = spark.readStream().option("header", true).format("csv").schema(schema)
.csv("C:/.../test/data/stream/employee/drop_data");
Here is my CSV:
empId;empName;department
1;Name;IT
Comments (2)
Have you tried changing the delimiter in the CSV file from ';' to ','?
The example from the source has comma-separated columns.
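To illustrate why the delimiter matters, here is a minimal plain-Java sketch (no Spark involved) that splits the header line from the question on ',' and on ';' and compares the result with the three schema column names. The class `HeaderSplitDemo` and its helpers `splitHeader` and `conforms` are hypothetical names made up for this illustration; they are not part of Spark, and the naive split ignores quoting rules that a real CSV parser handles.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: mimics a header-vs-schema comparison, not Spark's actual checker.
class HeaderSplitDemo {
    // Column names declared in the question's StructType, in order.
    static final List<String> SCHEMA = List.of("empId", "empName", "department");

    // Split a header line on a literal separator (no quote handling; for illustration).
    static List<String> splitHeader(String line, String sep) {
        return Arrays.asList(line.split(Pattern.quote(sep), -1));
    }

    // True when the split header matches the schema names exactly.
    static boolean conforms(String headerLine, String sep) {
        return splitHeader(headerLine, sep).equals(SCHEMA);
    }

    public static void main(String[] args) {
        String header = "empId;empName;department";

        // With a comma separator the whole line is a single column name.
        System.out.println(splitHeader(header, ",")); // [empId;empName;department]
        System.out.println(conforms(header, ","));    // false

        // With a semicolon separator the three schema columns are found.
        System.out.println(splitHeader(header, ";")); // [empId, empName, department]
        System.out.println(conforms(header, ";"));    // true
    }
}
```

Instead of rewriting the file with commas, it should also be possible to keep the semicolons and tell Spark's CSV reader about them by adding `.option("sep", ";")` to the reader in the question before the `.csv(...)` call. This has not been verified against the asker's setup.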
Haha, I ran into the same problem, and I finally found that I had typed "XX.csv" to "XX.xlsx". So, as Benoit said, it is a delimiter problem.