Java: copying a collection using an Iterator
I have a method that takes an iterator over a collection as an argument. Inside the method I want to copy the collection the iterator is "pointing to".
However, only the last entry of the collection appears in the copy, and it appears N times, where N is the size of the original collection.
public void someMethod(Iterator<Node> values) {
    Vector<Node> centralNodeNeighbourhood = new Vector<Node>();
    while (values.hasNext()) {
        Node tmp = values.next();
        centralNodeNeighbourhood.add(tmp);
    }
    ...
    //store the centralNodeNeighbourhood on disk
}
Example "original collection":
1
2
3
Example "centralNodeNeighbourhood collection":
3
3
3
Can someone point out my mistake? I cannot change the method arguments; I only get the Iterator to the collection and can't do anything about that.
UPDATE (answers to some questions)
while (values.hasNext()) {
    Node tmp = values.next();
    System.out.print("Adding = "+tmp.toString());
    centralNodeNeighbourhood.add(tmp);
}
This prints the original collection elements correctly.
I don't know the type of the original collection, but the Iterator is from standard Java. The method is the reduce method below, from the old Hadoop API (Hadoop version 0.20.203.0):
public class GatherNodeNeighboursInfoReducer extends MapReduceBase
        implements Reducer<IntWritable, Node, NullWritable, NodeNeighbourhood> {
    public void reduce(IntWritable key, Iterator<Node> values,
            OutputCollector<NullWritable, NodeNeighbourhood> output, Reporter reporter) throws IOException {...}
}
SOLVED
I made a copy of the tmp object on each iteration and added that copy to the centralNodeNeighbourhood collection. This solved my problem. Thanks for all your (fast) help.
Comments (2)
It appears that the Iterator is returning the same Node object each time. If so, you need to take a copy of the Node before adding it to the collection. (Otherwise you will be adding the same object each time, and it will hold the last values it was set to.)
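To make the effect concrete, here is a minimal, self-contained sketch that does not use Hadoop at all. The Node class (with a single id field and a copy constructor) and ReusingIterator are hypothetical stand-ins, since the real Node class and the real iterator are not shown in the question. The iterator deliberately hands out one shared object and overwrites it on every call to next(), which is the behaviour suspected above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the asker's Node class; the real one is not shown.
class Node {
    int id;

    Node(int id) { this.id = id; }

    // Copy constructor: creates an independent object with the same state.
    Node(Node other) { this.id = other.id; }
}

// Hypothetical iterator that mimics the suspected behaviour: it returns one
// shared Node instance and overwrites its field on every call to next().
class ReusingIterator implements Iterator<Node> {
    private final int[] ids;
    private int pos = 0;
    private final Node shared = new Node(0);

    ReusingIterator(int... ids) { this.ids = ids; }

    @Override
    public boolean hasNext() { return pos < ids.length; }

    @Override
    public Node next() {
        if (!hasNext()) throw new NoSuchElementException();
        shared.id = ids[pos++];
        return shared;
    }
}

class ReuseDemo {
    // Stores the reference returned by next(), so every list slot is the same object.
    static List<Integer> collectByReference(Iterator<Node> values) {
        List<Node> nodes = new ArrayList<>();
        while (values.hasNext()) {
            nodes.add(values.next());
        }
        return idsOf(nodes);
    }

    // Stores a fresh copy of each value, so later overwrites do not affect it.
    static List<Integer> collectByCopy(Iterator<Node> values) {
        List<Node> nodes = new ArrayList<>();
        while (values.hasNext()) {
            nodes.add(new Node(values.next()));
        }
        return idsOf(nodes);
    }

    // Reads the ids only after iteration has finished.
    private static List<Integer> idsOf(List<Node> nodes) {
        List<Integer> ids = new ArrayList<>();
        for (Node n : nodes) {
            ids.add(n.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(collectByReference(new ReusingIterator(1, 2, 3))); // [3, 3, 3]
        System.out.println(collectByCopy(new ReusingIterator(1, 2, 3)));      // [1, 2, 3]
    }
}
```

This also explains why printing tmp inside the loop looked correct: at that moment the shared object still holds the current value, and it is only overwritten on the following call to next().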
Hadoop's reduce method specifies that it reuses the value objects in its iterator. That's a terrible thing to do, but that's what it does.
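If the value class has no copy constructor, one possible way to get an independent copy is a serialization round trip through the same two methods that Hadoop's Writable interface declares, write(DataOutput) and readFields(DataInput). The sketch below is only an illustration under assumptions: SketchNode, its fields, and RoundTripCopy are hypothetical names, and the code uses plain java.io so it runs without Hadoop. In a real job the asker's Node would implement Writable, and Hadoop's own WritableUtils.clone helper is worth checking for the version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type exposing the same two methods as Hadoop's Writable
// (write / readFields). It does not depend on Hadoop itself.
class SketchNode {
    int id;
    String label = "";

    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(label);
    }

    void readFields(DataInput in) throws IOException {
        id = in.readInt();
        label = in.readUTF();
    }
}

class RoundTripCopy {
    // Serializes the source into a byte buffer, then reads it back into a new object.
    static SketchNode copyOf(SketchNode source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                source.write(out);
            }
            SketchNode copy = new SketchNode();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            // In-memory streams should not fail, but the signatures declare IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchNode original = new SketchNode();
        original.id = 7;
        original.label = "seven";

        SketchNode copy = copyOf(original);
        original.id = 99;                 // simulate the framework overwriting the shared object

        System.out.println(copy.id);      // 7
        System.out.println(copy.label);   // seven
    }
}
```

Calling something like copyOf(values.next()) inside the reduce loop, instead of storing values.next() directly, would keep each stored element independent of the object the framework reuses.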