Can we read Parquet files in Hazelcast Jet?
I am trying to read a Parquet file via Hazelcast. I have written the code below, which works fine, but does Hazelcast provide any built-in source for reading Parquet files?
// headers and rowcount are defined elsewhere in the surrounding class (not shown here)
BatchSource<Map<String, String>> parquetSource = SourceBuilder
        // createFn: open the reader; a failure propagates instead of returning null
        .batch("parquet-source", ctx ->
                AvroParquetReader.<GenericRecord>builder(new Path("D:/test/1651070287920.parquet")).build())
        .<Map<String, String>>fillBufferFn((reader, buf) -> {
            GenericRecord record = reader.read();
            if (record == null) {
                // end of file: signal that the batch source is exhausted
                buf.close();
                return;
            }
            Map<String, String> map = new HashMap<>();
            for (int i = 0; i < headers[0].length; i++) {
                Object value = record.get(i);
                // null fields become empty strings
                map.put(headers[0][i], value == null ? "" : value.toString());
            }
            rowcount = rowcount + 1;
            buf.add(map);
        })
        // destroyFn: close the reader when the job completes
        .destroyFn(ParquetReader::close)
        .build();
Please let me know if Hazelcast Jet already has such a source.
Comments (2)
Parquet files using Avro for serialization can be read using the Unified File Connector. See also the code sample.
Parquet is supported using the unified file connector.
If you don't have a class corresponding to your schema, or your schema is more dynamic, and you want to return org.apache.avro.generic.GenericRecord from the source, which you can then map to Map<String, String>, you can use the following:
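A minimal sketch of what this could look like, not necessarily the exact snippet from the answer. It assumes the Hazelcast Jet Hadoop module plus the Avro and Parquet libraries are on the classpath, reuses the D:/test directory from the question, and uses a hypothetical class name and glob pattern. Field names are taken from the Avro schema rather than from a separate headers array, which is a design choice of this sketch.

```java
import com.hazelcast.jet.pipeline.BatchSource;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.file.FileFormat;
import com.hazelcast.jet.pipeline.file.FileSources;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;

import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for illustration.
public class ParquetPipelineSketch {

    public static Pipeline build() {
        // Unified file connector: read every Parquet file in the directory
        // and emit each row as an Avro GenericRecord.
        BatchSource<GenericRecord> source = FileSources.files("D:/test") // directory from the question
                .glob("*.parquet")                          // assumed file name pattern
                .format(FileFormat.<GenericRecord>parquet())
                .useHadoopForLocalFiles(true)               // Parquet is read through the Hadoop module
                .build();

        Pipeline pipeline = Pipeline.create();
        pipeline.readFrom(source)
                // Convert each record to Map<String, String>, replacing nulls with ""
                .map(record -> {
                    Map<String, String> row = new HashMap<>();
                    for (Schema.Field field : record.getSchema().getFields()) {
                        Object value = record.get(field.pos());
                        row.put(field.name(), value == null ? "" : value.toString());
                    }
                    return row;
                })
                .writeTo(Sinks.logger());
        return pipeline;
    }
}
```

The returned pipeline still has to be submitted as a job to a Jet instance; how that is done depends on the Hazelcast version in use.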