我想做的是有效地处理间隔。例如,在我的示例中,间隔如下所示:
[10, 20], [15, 25], [40, 100], [5, 14]
间隔是封闭的整数,并且某些间隔可能会重叠。我想有效地找到给定查询的重叠间隔。例如,如果给出 [16, 22]
:
[10, 20], [15, 25]
上述间隔应计算为重叠间隔。
我目前正在编写一个基于红黑树的区间树(参考:CLRS,算法简介)。虽然找到所有重叠间隔的时间复杂度可能是 O(n),但运行时间应该更快。请注意,可以删除和插入间隔。
然而,我刚刚发现Boost有interval_map
和interval_set
:
http://www.boost.org/doc/libs/1_46_1/ libs/icl/doc/html/index.html
我尝试过,但行为对我来说很奇怪。例如,如果先插入 [2, 7]
,然后插入 [3, 8]
,则生成的地图将具有 [2, 3)
、[3, 7]
、(7, 8]
。即当插入新的区间时,会自动进行分割。
我可以把或者,Boost 的功能? interval_map
适合我的目的吗?
What I want to do is handling interval efficiently. For example, in my example, intervals are like the following:
[10, 20], [15, 25], [40, 100], [5, 14]
Intervals are closed and integers, and some of intervals may be ovelapped. I want to find overlapped intervals for a given query efficiently. For example, if [16, 22]
is given:
[10, 20], [15, 25]
The above intervals should be computed as overalpped intervals.
I'm currently writing an interval tree based on Red-Black Tree (reference: CLRS, Introduction to Algorithms). Although finding all overlapped intervals can be O(n), the running time should be faster. Note that intervals can be deleted and inserted.
However, I just found that Boost has interval_map
and interval_set
:
http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html
I tried it, but the behavior is very strange for me. For example, if [2, 7]
is inserted first and then [3, 8]
is inserted, then the resulting map will have [2, 3)
, [3, 7]
, and (7, 8]
. That is, when a new interval is inserted, splitting is automatically done.
Can I turn off this feature? Or, Boost's interval_map
is right for my purpose?
发布评论
评论(3)
我认为您可以使用
interval_map; > >
。每当您想要添加间隔I
时,只需将make_pair(I, II)
添加到地图,其中II
是一个集合
仅包含I
。因此,对于上面的示例,您可以这样做:请注意,boost 文档表明
std::set
的interval_map
并不是特别有效,位于 此页面。因此,这表明您可能想要编写自己的集合概念的实现,或者使用与std::set
不同的实现。I think you could use an
interval_map<int, set<discrete_interval<int> > >
. Whenever you want to add an intervalI
, just addmake_pair(I, II)
to the map, whereII
is aset
containing onlyI
. So for the example above, you would do:Note that the boost documentation suggests that an
interval_map
ofstd::set
s is not particularly efficient, at the bottom of this page. So this suggests you might want to write your own implementation of the set concept, or use a different one thanstd::set
.您需要一种可以有效查找重叠的数据结构。这是通过在数据结构中存储重叠来实现的。现在你似乎在抱怨它这样做了。
此示例解释了逻辑:
当您添加两个重叠间隔时,您实际上创建了三个具有不同属性的间隔。重叠存在于两个原始间隔中,使其成为与任一原始间隔在逻辑上不同的间隔。现在,两个原始间隔跨越具有不同属性的时间(有些与原始间隔重叠,有些不重叠)。这种分割可以有效地查找重叠,因为它们在地图中是自己的间隔。
无论如何,Boost 确实允许您选择 间隔组合样式。因此,如果您想强制采用一种更难找到重叠的结构,您可以这样做。
You asked for a data structure that could find overlaps efficiently. This does so, by storing overlaps in the data structure. Now you seem to be complaining that it has done so.
This example explains the logic:
When you add two overlapping intervals, you actually create three intervals with distinct properties. The overlap is in both original intervals, make it a logically distinct interval from either of the original intervals. And the two original intervals now span times with different properties (some that overlap the original, some that don't). This splitting makes it efficient to find overlaps, since they are their own intervals in the map.
In any event, Boost does allow you to select the interval combining style. So if you want to force a structure that makes it harder to find overlaps, you can do so.
我尝试了升压interval_map和interval_set。他们效率很低。设置成本非常高,因为实现基本上将每个子区间(交叉点)映射到包含它的所有区间。
我认为CLRS《算法导论》中基于红黑树的实现要好得多。奇怪的是,尽管 std::set 和 std::map 基于 RB 树,但没有允许增强的红黑树实现。
I tried boost interval_map and interval_set. They are very inefficient. The setup cost is very high because the implementation basically maps each subinterval (intersection) to all the intervals that contain it.
I think the implementation in CLRS "introduction to algorithms" based on red-black tree is far better. It is strange there is no red-black tree implementation out there that allows augmentation, even though std::set and std::map are based on RB tree.