Java: finding the nearest (or equal) value in a collection
I have a class along the lines of:
public class Observation {
    private String time;
    private double x;
    private double y;
    // Constructors + Setters + Getters
}
I can choose to store these objects in any type of collection (Standard class or 3rd party like Guava). I have stored some example data in an ArrayList below, but like I said I am open to any other type of collection that will do the trick. So, some example data:
ArrayList<Observation> ol = new ArrayList<Observation>();
ol.add(new Observation("08:01:23",2.87,3.23));
ol.add(new Observation("08:01:27",2.96,3.17));
ol.add(new Observation("08:01:27",2.93,3.20));
ol.add(new Observation("08:01:28",2.93,3.21));
ol.add(new Observation("08:01:30",2.91,3.23));
The example assumes a matching constructor in Observation. The timestamps are stored as String objects, as I receive them as such from an external source, but I am happy to convert them into something else. I receive the observations in chronological order, so I can create and rely on a sorted collection of observations. The timestamps are NOT unique (as can be seen in the example data), so I cannot create a unique key based on time.
Now to the problem. I frequently need to find one (1) observation with a time equal or nearest to a certain time, e.g. if my time was 08:01:29 I would like to fetch the 4th observation in the example data, and if the time is 08:01:27 I want the 3rd observation.
I can obviously iterate through the collection until I find the time that I am looking for, but I need to do this frequently, and by the end of the day I may have millions of observations, so I need a solution that locates the relevant observation efficiently.
I have looked at various collection types, including ones where I can filter the collections with Predicates, but I have failed to find a solution that would return one value, as opposed to a subset of the collection that fulfills the "<="-condition. I am essentially looking for the SQL equivalent of SELECT * FROM ol WHERE time <= t LIMIT 1.
I am sure there is a smart and easy way to solve my problem so I am hoping to be enlightened. Thank you in advance.
Comments (4)
Try a TreeSet with a comparator that compares the time. It maintains an ordered set, and you can call TreeSet.floor(E) to find the greatest element less than or equal to the given one (you should provide a dummy Observation with the time you are looking for). You also have headSet and tailSet for ordered subsets.
It has O(log n) time for adding and retrieving. I think it is very suitable for your needs.
If you prefer a Map, you can use a TreeMap with similar methods.
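A minimal sketch of the floor lookup described above, under some assumptions. A TreeSet drops elements that its comparator considers equal, and the timestamps here are not unique, so the sketch adds a hypothetical seq field (arrival order) as a tie-breaker. Obs, ObsIndex and atOrBefore are illustrative names, not part of the original question, and the time stays a fixed-width "HH:mm:ss" String so that String ordering matches chronological ordering.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Simplified stand-in for the question's Observation class.
// The seq field is a hypothetical addition that keeps equal timestamps distinct.
class Obs {
    final String time; // fixed-width "HH:mm:ss", so String order is chronological
    final double x;
    final double y;
    final int seq;     // arrival order, used only as a tie-breaker

    Obs(String time, double x, double y, int seq) {
        this.time = time;
        this.x = x;
        this.y = y;
        this.seq = seq;
    }
}

class ObsIndex {
    // Order by time first, then by arrival order.
    private static final Comparator<Obs> BY_TIME_THEN_SEQ =
            Comparator.comparing((Obs o) -> o.time).thenComparingInt(o -> o.seq);

    private final TreeSet<Obs> set = new TreeSet<>(BY_TIME_THEN_SEQ);
    private int nextSeq = 0;

    void add(String time, double x, double y) {
        set.add(new Obs(time, x, y, nextSeq++));
    }

    // Latest observation whose time is <= the given time, or null if there is none.
    Obs atOrBefore(String time) {
        // Dummy probe: the maximal seq makes it sort after every real
        // observation that carries the same timestamp.
        Obs probe = new Obs(time, 0, 0, Integer.MAX_VALUE);
        return set.floor(probe);
    }
}
```

With the example data from the question, atOrBefore("08:01:29") would return the 4th observation and atOrBefore("08:01:27") the 3rd, each in O(log n).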
Sort your collection (ArrayList will probably work best here) and use BinarySearch, which returns an integer index of either a match or the "closest" possible match, i.e. it returns an...
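One way this could look with java.util.Collections.binarySearch, as a sketch rather than a definitive implementation. It assumes the list is already sorted by time (the question says observations arrive in chronological order). TimedObs, SortedListLookup and indexAtOrBefore are hypothetical names. When the key is absent, binarySearch returns -(insertion point) - 1, and when several elements compare equal there is no guarantee which one is found, so the sketch walks forward to the last equal timestamp.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Simplified stand-in for the question's Observation class.
class TimedObs {
    final String time; // fixed-width "HH:mm:ss", so String order is chronological
    final double x;
    final double y;

    TimedObs(String time, double x, double y) {
        this.time = time;
        this.x = x;
        this.y = y;
    }
}

class SortedListLookup {
    static final Comparator<TimedObs> BY_TIME =
            Comparator.comparing((TimedObs o) -> o.time);

    // Index of the last element whose time is <= the given time, or -1 if none.
    // The list must already be sorted by time.
    static int indexAtOrBefore(List<TimedObs> sorted, String time) {
        TimedObs probe = new TimedObs(time, 0, 0); // dummy key, only time matters
        int i = Collections.binarySearch(sorted, probe, BY_TIME);
        if (i < 0) {
            // Not found: i == -(insertionPoint) - 1, so the element just
            // before the insertion point is at index -i - 2.
            return -i - 2;
        }
        // Found: with duplicate timestamps any of them may have been hit,
        // so advance to the last one that shares this timestamp.
        while (i + 1 < sorted.size() && sorted.get(i + 1).time.equals(time)) {
            i++;
        }
        return i;
    }
}
```

On an ArrayList the search itself is O(log n); the forward walk only costs extra when many observations share one timestamp.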
Have the Observation class implement Comparable and use a TreeSet to store the objects, which will keep the elements sorted. TreeSet implements SortedSet, so you can use headSet or tailSet to get a view of the set before or after the element you're searching for. Use the first or last method on the returned set to get the element you're seeking.
If you are stuck with ArrayList, but can keep the elements sorted yourself, use Collections.binarySearch to search for the element. It returns a non-negative index if the exact element is found, or a negative number that can be used to determine the closest element. http://download.oracle.com/javase/1.4.2/docs/api/java/util/Collections.html#binarySearch(java.util.List,%20java.lang.Object)
If you are lucky enough to be using Java 6, and the performance overhead of keeping a SortedSet is not a big deal for you, take a look at the TreeSet ceiling, floor, higher and lower methods.
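As a rough illustration of how floor and higher could be combined into a true "nearest" lookup, here is a sketch under stated assumptions: the timestamps are converted to java.time.LocalTime (Java 8+), a hypothetical seq field breaks ties between equal timestamps so the TreeSet keeps them all, and a tie in distance goes to the earlier observation, which matches the 08:01:29 example in the question. Sample, NearestLookup and nearest are illustrative names only.

```java
import java.time.LocalTime;
import java.util.Comparator;
import java.util.TreeSet;

// Simplified stand-in for the question's Observation class.
// The seq field is a hypothetical tie-breaker for equal timestamps.
class Sample {
    final LocalTime time;
    final double x;
    final double y;
    final int seq;

    Sample(LocalTime time, double x, double y, int seq) {
        this.time = time;
        this.x = x;
        this.y = y;
        this.seq = seq;
    }
}

class NearestLookup {
    private final TreeSet<Sample> set = new TreeSet<>(
            Comparator.comparing((Sample s) -> s.time).thenComparingInt(s -> s.seq));
    private int nextSeq = 0;

    void add(String time, double x, double y) {
        set.add(new Sample(LocalTime.parse(time), x, y, nextSeq++));
    }

    // Observation closest in time to the given "HH:mm:ss" value, or null if empty.
    Sample nearest(String time) {
        LocalTime t = LocalTime.parse(time);
        // The maximal seq places the probe after all real samples at time t.
        Sample probe = new Sample(t, 0, 0, Integer.MAX_VALUE);
        Sample before = set.floor(probe);  // last sample with time <= t
        Sample after = set.higher(probe);  // first sample with time > t
        if (before == null) {
            return after;
        }
        if (after == null) {
            return before;
        }
        int gapBefore = t.toSecondOfDay() - before.time.toSecondOfDay();
        int gapAfter = after.time.toSecondOfDay() - t.toSecondOfDay();
        // On a tie, prefer the earlier observation.
        return gapBefore <= gapAfter ? before : after;
    }
}
```

Both navigation calls are O(log n), so the lookup stays logarithmic even with millions of observations.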