Writing to HDFS: file gets overwritten
I am writing to the Hadoop file system, but every time I append something, it overwrites the data instead of adding it to the existing data/file. The code doing this is provided below, and it is called again and again for different data. Is opening a new SequenceFile.Writer every time a problem?
Each time, I obtain the path as new Path("someDir");
public void writeToHDFS(Path path, long uniqueId, String data) throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    SequenceFile.Writer inputWriter = new SequenceFile.Writer(fs, conf,
            path, LongWritable.class, MyWritable.class);
    inputWriter.append(new LongWritable(uniqueId++), new MyWritable(data));
    inputWriter.close();
}
Answer:
There is currently no way to append to an existing SequenceFile through the API. When you create a new SequenceFile.Writer object, it will not append to an existing file at that Path, but instead overwrite it. See my earlier question. As Thomas points out, if you keep the same SequenceFile.Writer object, you will be able to append to the file until you call close().
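A minimal sketch of the fix the answer describes: open one SequenceFile.Writer, reuse it across calls, and close it only when all records have been written. The class name, field names, and the MyWritable value type follow the original post; everything else here is illustrative, not a definitive implementation.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

public class SequenceFileAppender {
    private final SequenceFile.Writer writer;
    private long uniqueId;

    public SequenceFileAppender(Configuration conf, Path path, long startId)
            throws IOException {
        FileSystem fs = path.getFileSystem(conf);
        // Open the writer once. Reusing this single instance is what lets
        // successive append() calls accumulate records; constructing a new
        // Writer per call would overwrite the file each time.
        this.writer = new SequenceFile.Writer(fs, conf, path,
                LongWritable.class, MyWritable.class);
        this.uniqueId = startId;
    }

    public void writeToHDFS(String data) throws IOException {
        // Appends to the in-progress file instead of replacing it.
        writer.append(new LongWritable(uniqueId++), new MyWritable(data));
    }

    public void close() throws IOException {
        // Close only after the last record; appending is no longer
        // possible once the file is closed.
        writer.close();
    }
}
```

Note that newer Hadoop releases deprecate this Writer constructor in favor of the SequenceFile.createWriter factory methods, but the principle is the same: one writer instance per output file, closed once at the end.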