Comparing arrays and getting the difference
How would I compare two arrays that might have different lengths and get the difference between them?
For example:
Cat cat = new Cat();
Dog dog = new Dog();
Alligator alligator = new Alligator();
Animal animals[] = { cat, dog };
Animal animals2[] = { cat, dog, alligator };
How would I compare the two arrays and make it return the instance of Alligator?
Comments (5)
I would suggest that your question needs to be clarified. Currently, everyone is guessing about what you are actually asking.
Does new Cat() "equal" new Cat()? Your example suggests that it does!
Assuming that these arrays are intended to be true sets, you should probably be using HashSet instead of arrays, and using collection operations like addAll and retainAll to calculate the set difference.
On the other hand, if the arrays are meant to represent lists, it is not at all clear what "difference" means.
If it is critical that the code runs fast, then you most certainly need to rethink your data structures. If you always start with arrays, you are not going to be able to calculate the "differences" fast ... at least in the general case.
Finally, if you are going to use anything that depends on the equals(Object) method (and that includes any of the Java collection types), you really need to have a clear understanding of what "equals" is supposed to mean in your application. Are all Cat instances equal? Are they all different? Are some Cat instances equal and others not? If you don't figure this out, and implement the equals and hashCode methods accordingly, you will get confusing results.
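To make the point about equals and hashCode concrete, here is a minimal sketch. The Cat and Dog classes, their name field, and the "same name means equal" rule are assumptions made for illustration only; the question does not say how its classes define equality.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsDemo {
    // Hypothetical class: two cats count as "equal" when their names match.
    static final class Cat {
        private final String name;

        Cat(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Cat)) return false;
            return name.equals(((Cat) other).name);
        }

        @Override
        public int hashCode() {
            // Must agree with equals: equal cats produce equal hash codes.
            return Objects.hash(name);
        }
    }

    // Hypothetical class without overrides: it inherits identity-based
    // equals and hashCode from Object.
    static final class Dog {
        private final String name;

        Dog(String name) {
            this.name = name;
        }
    }

    static int distinctCats() {
        Set<Cat> cats = new HashSet<>();
        cats.add(new Cat("Tom"));
        cats.add(new Cat("Tom")); // equal to the first one, so not added again
        return cats.size();
    }

    static int distinctDogs() {
        Set<Dog> dogs = new HashSet<>();
        dogs.add(new Dog("Rex"));
        dogs.add(new Dog("Rex")); // a different object, so both are kept
        return dogs.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCats()); // prints 1
        System.out.println(distinctDogs()); // prints 2
    }
}
```

Any set-based "difference" inherits whichever of these two behaviours the element class has, which is why the definition of equality has to be settled first.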
I suggest that you put your objects in sets and then take the intersection of the sets.
After that, you can use removeAll with that intersection to get the difference for either of the two sets.
Inspired by: http://hype-free.blogspot.com/2008/11/calculating-intersection-of-two-java.html
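As a rough sketch of the two steps described in this answer (intersection first, then removeAll), the following uses String elements in place of Animal instances. The class and method names are made up for the example and are not taken from the answer or the linked post.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetDifferenceSketch {
    // Elements present in both arrays (retainAll keeps only the shared ones).
    static <T> Set<T> intersection(T[] first, T[] second) {
        Set<T> result = new HashSet<>(Arrays.asList(first));
        result.retainAll(Arrays.asList(second));
        return result;
    }

    // Elements of 'source' that are not in 'common'. removeAll mutates the
    // set it is called on, so it is applied to a fresh copy of the array.
    static <T> Set<T> difference(T[] source, Set<T> common) {
        Set<T> result = new HashSet<>(Arrays.asList(source));
        result.removeAll(common);
        return result;
    }

    public static void main(String[] args) {
        String[] animals = { "cat", "dog" };
        String[] animals2 = { "cat", "dog", "alligator" };

        Set<String> common = intersection(animals, animals2);
        System.out.println(difference(animals2, common)); // prints [alligator]
        System.out.println(difference(animals, common));  // prints []
    }
}
```

With real Animal objects the result depends on how equals and hashCode are implemented, since HashSet relies on both.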
Well, you could maybe use a Set instead and use the removeAll() method.
Or you could use a simple and slow algorithm, such as checking each element of animals against every element of animals2 and collecting the ones that are not found in a list named differences.
Then differences will have all the objects that are in the animals array but not in the animals2 array. In a similar way you can do the opposite (get all the objects that are in animals2 but not in animals).
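The simple and slow approach can be sketched as a pair of nested loops. This is an illustration rather than the answer's original listing: the class name is hypothetical, String elements stand in for Animal instances, and elements are compared with equals (use == instead if object identity is what you mean by "the same").

```java
import java.util.ArrayList;
import java.util.List;

class NaiveDifference {
    // Returns the elements of 'first' that have no equal element in 'second'.
    // Every element of 'first' may be compared with every element of
    // 'second', so this takes O(n * m) time.
    static <T> List<T> difference(T[] first, T[] second) {
        List<T> differences = new ArrayList<>();
        for (T candidate : first) {
            boolean found = false;
            for (T other : second) {
                if (candidate.equals(other)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                differences.add(candidate);
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        String[] animals = { "cat", "dog" };
        String[] animals2 = { "cat", "dog", "alligator" };

        System.out.println(difference(animals, animals2)); // prints []
        System.out.println(difference(animals2, animals)); // prints [alligator]
    }
}
```

Swapping the two arguments gives the opposite direction, as the answer describes.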
You may want to look at this article for more information:
http://download-llnw.oracle.com/javase/tutorial/collections/interfaces/set.html
As was mentioned, removeAll() is made for this, but you will want to do it twice, so that you get a list of what is missing from each side, and then you can combine these two results into a list of all the differences.
But this is a destructive operation, so if you don't want to lose the information, copy the Set and operate on the copy.
UPDATE:
It appears that my assumption about what is in the array was wrong, so removeAll() won't work, and with a 5ms requirement, depending on the number of items to search, speed could be a problem.
So it would appear that a HashMap<String, Animal> would be the best option, as it is fast in searching.
Animal is an interface with at least one property, String name. For each class that implements Animal, write code for equals and hashCode. You can find some discussion here: http://www.ibm.com/developerworks/java/library/j-jtp05273.html. This way, if you want the hash value to be a combination of the type of animal and the name, that will be fine.
So the basic algorithm is to keep everything in the hashmaps, and then, to search for differences, get the keys of one map and check whether each key is contained in the other map; if it isn't, put the corresponding value into a List<Object>.
You will want to do this twice, so if you have at least a dual-core processor, you may get some benefit from running the two searches in separate threads, but then you will want to use one of the concurrent data types added in JDK5 so that you don't have to worry about synchronization on the combined list of differences.
So I would write it first as a single thread and test it, to get an idea of how much faster it is, also comparing it to the original implementation.
Then, if you need it faster, try using threads and compare again to see whether there is a speed increase.
Before making any optimization, ensure you have some metrics on what you already have, so that you can compare and see whether a given change leads to an increase in speed.
If you make too many changes at a time, one may give a large improvement in speed while others lead to a performance decrease that goes unnoticed, which is why changes should be made one at a time.
Don't lose the other implementations though: by using unit tests and running each perhaps 100 times, you can get an idea of what improvement each change gives you.
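A single-threaded sketch of the map-based search might look as follows. The Animal interface with a getName() accessor, the NamedAnimal class, and all method names are assumptions for the example. It keys the maps by name, so lookups use String's own equals and hashCode, and it collects into a List<Animal> rather than the List<Object> mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapDifference {
    // Hypothetical interface: the answer only says Animal has a name.
    interface Animal {
        String getName();
    }

    // Hypothetical implementation used only for this example.
    static final class NamedAnimal implements Animal {
        private final String name;

        NamedAnimal(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }
    }

    // Builds a name -> animal lookup table from an array.
    static Map<String, Animal> indexByName(Animal[] animals) {
        Map<String, Animal> byName = new HashMap<>();
        for (Animal animal : animals) {
            byName.put(animal.getName(), animal);
        }
        return byName;
    }

    // Values whose key appears in 'source' but not in 'other'.
    static List<Animal> missingFrom(Map<String, Animal> source, Map<String, Animal> other) {
        List<Animal> differences = new ArrayList<>();
        for (Map.Entry<String, Animal> entry : source.entrySet()) {
            if (!other.containsKey(entry.getKey())) {
                differences.add(entry.getValue());
            }
        }
        return differences;
    }

    // Runs the search in both directions and combines the results.
    static List<Animal> allDifferences(Map<String, Animal> a, Map<String, Animal> b) {
        List<Animal> all = new ArrayList<>(missingFrom(a, b));
        all.addAll(missingFrom(b, a));
        return all;
    }

    public static void main(String[] args) {
        Animal cat = new NamedAnimal("cat");
        Animal dog = new NamedAnimal("dog");
        Animal alligator = new NamedAnimal("alligator");
        Animal[] animals = { cat, dog };
        Animal[] animals2 = { cat, dog, alligator };

        for (Animal a : allDifferences(indexByName(animals), indexByName(animals2))) {
            System.out.println(a.getName()); // prints alligator
        }
    }
}
```

Each containsKey call is a constant-time lookup on average, so the whole comparison is roughly linear in the total number of elements. Whether that meets a given time budget is something to measure, as the answer recommends.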
I don't care about performance for my usages (and you shouldn't either, unless you have a good reason to and your profiler shows that this code is the bottleneck).
What I do is similar to functional's answer. I use LINQ set operators to get the set difference (Except) for each list:
http://msdn.microsoft.com/en-us/library/bb397894.aspx
Edit:
Sorry, I didn't notice this is Java. I'm off in C# la-la land, and the two look very similar :)