Java中ArrayList的交集和并集
有什么方法可以做到这一点吗?我正在寻找,但没有找到。
另一个问题:我需要这些方法,以便我可以过滤文件。 有些是 AND 过滤器,有些是 OR 过滤器(就像在集合论中一样),所以我需要根据所有文件和保存这些文件的联合/相交 ArrayList 进行过滤。
我应该使用不同的数据结构来保存文件吗?还有其他什么可以提供更好的运行时间吗?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(24)
这是一个不使用任何第三方库的简单实现。相对于
retainAll
、removeAll
和addAll
的主要优点是这些方法不会修改方法的原始列表输入。Here's a plain implementation without using any third-party library. Main advantage over
retainAll
,removeAll
andaddAll
is that these methods don't modify the original lists input to the methods.Collection (所以 ArrayList 也)有:
使用如果您接受重复,则为 List 实现;如果不接受重复,则为 Set 实现:
Collection (so ArrayList also) have:
Use a List implementation if you accept repetitions, a Set implementation if you don't:
这篇文章相当老了,但它是在谷歌搜索该主题时第一个出现的文章。
我想使用 Java 8 流在一行中执行(基本上)相同的操作:
如果有人有更好/更快的解决方案,请告诉我,但这个解决方案是一个很好的单行代码,可以轻松包含在方法中无需添加不必要的帮助类/方法,并且仍然保持可读性。
This post is fairly old, but nevertheless it was the first one popping up on google when looking for that topic.
I want to give an update using Java 8 streams doing (basically) the same thing in a single line:
If anyone has a better/faster solution let me know, but this solution is a nice one liner that can be easily included in a method without adding a unnecessary helper class/method and still keep the readability.
union 将先是
removeAll
,然后是addAll
。更多内容请参见collection的文档(ArrayList是一个集合)
http://download.oracle.com/javase /1.5.0/docs/api/java/util/Collection.html
union will be
removeAll
and thenaddAll
.Find more in the documentation of collection(ArrayList is a collection)
http://download.oracle.com/javase/1.5.0/docs/api/java/util/Collection.html
并集和交集仅为集合而不是列表定义。正如你提到的。
检查 guava 库中的过滤器。 guava 还提供了真正的交集和并集
Unions and intersections defined only for sets, not lists. As you mentioned.
Check guava library for filters. Also guava provides real intersections and unions
您可以使用 apache 公共资源。
这两种方法还具有数组和映射的重载,并将 null 输入视为空集。由于它们返回集合,因此它们不保留顺序。
You can use
CollectionUtils
from apache commons.Both of these methods also have overloads for arrays and maps, and treat null inputs as an empty set. Since they return sets, they do not preserve order.
标记的解决方案效率不高。它的时间复杂度为 O(n^2)。我们能做的就是对两个列表进行排序,然后执行如下的交集算法。
这个复杂度为 O(n log n + n),即 O(n log n)。
联盟以类似的方式完成。只需确保对 if-elseif-else 语句进行适当的修改即可。
如果需要,您也可以使用迭代器(我知道它们在 C++ 中效率更高,我不知道在 Java 中是否也是如此)。
The solution marked is not efficient. It has a O(n^2) time complexity. What we can do is to sort both lists, and the execute an intersection algorithm as the one below.
This one has a complexity of O(n log n + n) which is in O(n log n).
The union is done in a similar manner. Just make sure you make the suitable modifications on the if-elseif-else statements.
You can also use iterators if you want (I know they are more efficient in C++, I dont know if this is true in Java as well).
自JAVA 8以来的单行代码
联合
如果没有重复项则
:联合和不同:
如果集合/集返回类型:
如果没有重复则相交
:
性能:如果集合
b
很大并且不是 O(1),然后通过在return
之前添加 1 行来预先优化过滤器性能:复制到HasSet
(导入 java.util.Set;
):相交和不同:
- 导入
One-liners since JAVA 8
Union
if there are no duplicates:
union and distinct:
union and distinct if Collection/Set return type:
Intersect
if no duplicates:
PERFORMANCE: If collection
b
is huge and not O(1), then pre-optimize the filter performance by adding 1 line beforereturn
: Copy toHasSet
(import java.util.Set;
):intersect and distinct:
- imports
我认为如果你想对文件进行交集和并集,你应该使用
Set
来保存文件。然后您可以使用 Guava 的 设置类来执行并集
、交集 并通过
Predicate
进行过滤。这些方法与其他建议之间的区别在于,所有这些方法都会创建两个集合的并集、交集等的惰性视图。 Apache Commons 创建一个新集合并将数据复制到其中。retainAll
通过删除其中的元素来更改您的集合之一。I think you should use a
Set
to hold the files if you want to do intersection and union on them. Then you can use Guava's Sets class to dounion
,intersection
and filtering by aPredicate
as well. The difference between these methods and the other suggestions is that all of these methods create lazy views of the union, intersection, etc. of the two sets. Apache Commons creates a new collection and copies data to it.retainAll
changes one of your collections by removing elements from it.这是一种与流进行交集的方法(请记住,您必须对流使用 java 8):
具有不同类型的列表的示例。如果 foo 和 bar 之间存在关系,并且可以从 foo 获取 bar 对象,那么您可以修改流:
Here is a way how you can do an intersection with streams (remember that you have to use java 8 for streams):
An example for lists with different types. If you have a realtion between foo and bar and you can get a bar-object from foo than you can modify your stream:
您可以使用 commons-collections4 CollectionUtils
You can use commons-collections4 CollectionUtils
我发现 ListUtils 对于这个用例非常有用。
如果您不想修改现有列表,请使用 org.apache.commons.collections 中的 ListUtils。
ListUtils.intersection(list1, list2)
I found ListUtils very useful for this use case.
Use ListUtils from org.apache.commons.collections if you do not want to modify existing list.
ListUtils.intersection(list1, list2)
在 Java 8 中,我使用如下简单的辅助方法:
In Java 8, I use simple helper methods like this:
如果列表中的对象是可散列的(即具有合适的 hashCode 和 equals 函数),则表之间最快的方法大约是。尺寸> 20 是为两个列表中较大的一个构建一个 HashSet。
If the objects in the list are hashable (i.e. have a decent hashCode and equals function), the fastest approach between tables approx. size > 20 is to construct a HashSet for the larger of the two lists.
我也在处理类似的情况并到达这里寻求帮助。最终找到了我自己的数组解决方案。
ArrayList AbsentDates = new ArrayList(); // 将存储 Array1-Array2
注意: 如果可以帮助某人访问此页面寻求帮助,则发布此内容。
I was also working on the similar situation and reached here searching for help. Ended up finding my own solution for Arrays.
ArrayList AbsentDates = new ArrayList(); // Will Store Array1-Array2
Note : Posting this if it can help someone reaching this page for help.
基于公共键的不同对象的两个列表的交集 - Java 8
Intersection of two list of different object based on common key - Java 8
JDK8+(可能是最佳性能)
如果您不关心性能并且更喜欢较小的代码,只需使用:
JDK8+ (Probably Best Performance)
If you don't care about performance and prefer smaller code just use:
keepAll() 方法用于查找公共元素..即;交集
列表1.retainAll(列表2)
retainAll() method use for finding common element..i.e;intersection
list1.retainAll(list2)
首先,我将数组的所有值复制到一个数组中,然后将重复值删除到数组中。第 12 行,解释如果相同的数字出现超过时间,则将一些额外的垃圾值放入“j”位置。最后从头到尾遍历,检查是否出现相同的垃圾值,然后丢弃。
First, I am copying all values of arrays into a single array then I am removing duplicates values into the array. Line 12, explaining if same number occur more than time then put some extra garbage value into "j" position. At the end, traverse from start-end and check if same garbage value occur then discard.
经过测试,这是我最好的交叉方法。
与纯 HashSet 方法相比速度更快。下面的 HashSet 和 HashMap 对于超过 100 万条记录的数组具有相似的性能。
对于 Java 8 Stream 方法,对于大于 10k 的数组大小,速度相当慢。
希望这能有所帮助。
After testing, here is my best intersection approach.
Faster speed compared to pure HashSet Approach. HashSet and HashMap below has similar performance for arrays with more than 1 million records.
As for Java 8 Stream approach, speed is quite slow for array size larger then 10k.
Hope this can help.
您可以使用以下方法:
CollectionUtils.containsAny 和
CollectionUtils.containsAll
Apache Commons。
You can use the methods:
CollectionUtils.containsAny
andCollectionUtils.containsAll
from Apache Commons.
最终解决方案:
Final solution:
如果您的数据集中在 Sets 中,则可以使用 Guava 的
设置
类。If you had your data in Sets you could use Guava's
Sets
class.如果数字与我正在检查的数字匹配,则在“indexOf()”的帮助下,它是第一次发生,如果数字第一次匹配,则打印并保存到字符串中,这样,当下次相同的数字匹配时,它就赢了。 t 打印,因为由于“indexOf()”条件将为假。
}
If the number matches than I am checking it's occur first time or not with help of "indexOf()" if the number matches first time then print and save into in a string so, that when the next time same number matches then it's won't print because due to "indexOf()" condition will be false.
}