如何扩展二分搜索迭代器以使用多个目标
我有一个函数,binary_range_search
,其调用方式如下:
my $brs_iterator = binary_range_search(
target => $range, # eg. [1, 200]
search => $ranges # eg. [ {start => 1, end => 1000},
); # {start => 500, end => 1500} ]
brs_iterator->()
将迭代 $range 重叠的所有 @$range。
我想扩展 binary_range_search
以便能够以多个范围作为目标来调用它,例如:
target => $target_ranges # eg. [ [1, 200], [50, 300], ... ]
search => $search_ranges # as above
因此,当 $range->[0] 上的搜索耗尽时,它应该继续到$range->[1],等等。这是所讨论的函数的原始形式:
sub binary_range_search {
my %options = @_;
my $range = $options{target} || return;
my $ranges = $options{search} || return;
my ( $low, $high ) = ( 0, @{$ranges} - 1 );
while ( $low <= $high ) {
my $try = int( ( $low + $high ) / 2 );
$low = $try + 1, next if $ranges->[$try]{end} < $range->[0];
$high = $try - 1, next if $ranges->[$try]{start} > $range->[1];
my ( $down, $up ) = ($try) x 2;
my %seen = ();
my $brs_iterator = sub {
if ( $ranges->[ $up + 1 ]{end} >= $range->[0]
and $ranges->[ $up + 1 ]{start} <= $range->[1]
and !exists $seen{ $up + 1 } )
{
$seen{ $up + 1 } = undef;
return $ranges->[ ++$up ];
}
elsif ( $ranges->[ $down - 1 ]{end} >= $range->[0]
and $ranges->[ $down + 1 ]{start} <= $range->[1]
and !exists $seen{ $down - 1 }
and $down > 0 )
{
$seen{ $down - 1 } = undef;
return $ranges->[ --$down ];
}
elsif ( !exists $seen{$try} ) {
$seen{$try} = undef;
return $ranges->[$try];
}
else {
return;
}
};
return $brs_iterator;
}
return sub { };
}
它是标准的二分搜索策略,直到找到重叠范围。然后向右移动,耗尽它,向左移动,耗尽它,最后放弃。理想情况下,我想它应该转移
下一个目标范围,并重做搜索(也许通过递归?)。我的问题是,我不确定如何使其与迭代器构造一起工作。
I have a function, binary_range_search
, that is called like so:
my $brs_iterator = binary_range_search(
target => $range, # eg. [1, 200]
search => $ranges # eg. [ {start => 1, end => 1000},
); # {start => 500, end => 1500} ]
brs_iterator->()
will iterate over all @$ranges on which $range overlaps.
I would like to extend binary_range_search
to be able to call it with multiple ranges as its target, eg:
target => $target_ranges # eg. [ [1, 200], [50, 300], ... ]
search => $search_ranges # as above
So, when the search on $range->[0] is exhausted, it should move on to $range->[1], and so on. Here is the function in question, in its original form:
sub binary_range_search {
my %options = @_;
my $range = $options{target} || return;
my $ranges = $options{search} || return;
my ( $low, $high ) = ( 0, @{$ranges} - 1 );
while ( $low <= $high ) {
my $try = int( ( $low + $high ) / 2 );
$low = $try + 1, next if $ranges->[$try]{end} < $range->[0];
$high = $try - 1, next if $ranges->[$try]{start} > $range->[1];
my ( $down, $up ) = ($try) x 2;
my %seen = ();
my $brs_iterator = sub {
if ( $ranges->[ $up + 1 ]{end} >= $range->[0]
and $ranges->[ $up + 1 ]{start} <= $range->[1]
and !exists $seen{ $up + 1 } )
{
$seen{ $up + 1 } = undef;
return $ranges->[ ++$up ];
}
elsif ( $ranges->[ $down - 1 ]{end} >= $range->[0]
and $ranges->[ $down + 1 ]{start} <= $range->[1]
and !exists $seen{ $down - 1 }
and $down > 0 )
{
$seen{ $down - 1 } = undef;
return $ranges->[ --$down ];
}
elsif ( !exists $seen{$try} ) {
$seen{$try} = undef;
return $ranges->[$try];
}
else {
return;
}
};
return $brs_iterator;
}
return sub { };
}
It's a standard binary search strategy, until it finds an overlapping range. It then moves on the right, exhausts it, moves on the left, exhausts it, and finally gives up. Ideally, it should then maybe shift
the next target range, and redo the search, I suppose (perhaps via recursion?). My problem is, I am not sure how to make that work with the iterator construction.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我只是将迭代器生成包装在 for 循环中,并构建了一个迭代器函数数组。
根据上下文,我要么返回一个主迭代器,要么返回一个迭代器函数列表。我不确定你想要什么。
I just wrapped your iterator generation in a for loop, and built up an array of iterator functions.
Depending on context, I either return a master iterator or a list of iterator functions. I wasn't sure what you wanted.
如果您想要迭代与搜索范围重叠的所有值,则不需要二分搜索。
首先是惯例的首要问题:
首先,检查我们是否有
target
和search
参数,并且对于每个范围,起点不大于其终点。否则,我们拒绝继续。迭代器本身是一个维护以下状态的闭包:
其中
$i
如果已定义,则为当前重叠范围$ta
和$tb
中的下一个值code> 是当前目标范围的起点和终点$sa
和$sb
与上面类似,但针对当前搜索范围$si
> 是@$search
的索引,定义当前搜索范围。我们将分配并返回迭代器
$it
。声明和初始化是分开的,因此迭代器可以在必要时调用自身。如果没有更多的目标范围剩余或者没有搜索范围开始,我们就完成了。
当定义
$i
时,意味着我们找到了重叠,并通过递增$i
进行迭代,直到它大于当前目标范围或当前范围的终点搜索范围。否则,我们需要确定下一个目标范围是否与任何搜索范围重叠。但是,如果
$i
未定义并且我们已经考虑了所有搜索范围,我们将放弃当前目标范围并重新开始。在这里,我们提取当前目标范围(始终位于
@$target
的前面)和当前搜索范围(由$si
索引)的起点和终点。现在测试重叠很简单。对于不相交的搜索范围,我们忽略并继续。否则,我们找到重叠中的最左边的点并从那里迭代。
最后,我们返回迭代器:
对于与您问题中类似的示例,
其删除内部点的输出是
If you're wanting to iterate over all values that overlap the search ranges, you don't need binary search.
First the customary front matter:
First off, check that we have
target
andsearch
parameters and that for each range, the starting point is no greater than its ending point. Otherwise, we refuse to proceed.The iterator itself is a closure that maintains the following state:
where
$i
if defined is the next value from the current overlapping range$ta
and$tb
are the starting and ending points for the current target range$sa
and$sb
are like the above but for the current search range$si
is an index into@$search
and defines the current search rangeWe will be assigning and returning the iterator
$it
. The declaration and initialization are separate so the iterator may call itself when necessary.We are done if no more target ranges remain or if there were no search ranges to begin with.
When
$i
is defined, it means we have found an overlap and iterate by incrementing$i
until it is greater than the ending point of either the current target range or the current search range.Otherwise, we need to determine whether the next target range overlaps any search range. However, if
$i
is undefined and we've already considered all the search ranges, we discard the current target range and start again.Here we pull out the starting and ending points of both the current target range (always at the front of
@$target
) and the current search range (indexed by$si
).Now testing for overlap is straightforward. For disjoint search ranges, we ignore and move on. Otherwise, we find the leftmost point in the overlap and iterate from there.
Finally, we return the iterator:
For an example similar to the one in your question
Its output with internal points elided is
将其分成两个函数,一个外部函数循环遍历范围并调用一个实现传统二进制斩波的内部函数。
Split it into two functions, an outer function that loops over the ranges and calls an inner function that implements a conventional binary chop.
警告:一个非常偏向 C++ 的答案:
你要做的就是定义一个新类型的迭代器,它是一对普通迭代器和一个 segmemt 迭代器(如果你没有段迭代器,它是一对指向段的 const 指针/引用,以及指向正确段的索引)。您必须定义随机访问迭代器的所有概念(差异、整数加法等)。请记住,至少在 C++ 术语中,这不是真正的随机迭代器,因为整数的加法并不是真正的恒定时间;这就是生活。
Warning: a very c++ biased answer:
what you have to do is define a new type of iterator, which is a pair of a usual iterator, and a segmemt iterrator (if you don't have a segment iterator, it's a pair of a const pointer / ref to the segments, and an index pointing to the correct segment). You have to define all the concepts of a random access iterator (difference, addition of integer, etc). bear in mind, that at least in c++ lingo this is not a true random iterator, since addition of an integer is not really constant time; such is life.