循环遍历两个数组删除perl中的重叠

发布于 2024-11-18 01:41:05 字数 742 浏览 4 评论 0原文

我有两组范围,由 [ start, stop ] 值表示。某些范围重叠,这意味着一个范围的起点位于另一范围的 [ start, stop ] 之间。我想创建一组没有此类重叠的新范围,并且范围中也不包含任何新值。

范围如下所示:

@starts  @ends
      5    108 
      5    187
     44    187
     44    229 
     44    236 
     64    236 
    104    236
    580    644
    632    770

我期望的输出将是这样的:

@starts  @ends
      5    236
    580    770

这是因为前七个范围与 5 => 的间隔重叠。 236,最后两个与632 => 236的间隔重叠。 770.

这是我尝试过的代码:

$fix = 0;
foreach (@ends) {  
    if ($starts[$fix + 1] < $ends[$fix]) {
        splice(@ends, $fix, $fix);
        splice(@starts, $fix + 1, $fix + 1);
    } else {
        $fix += 1;
    }
}

我可以自己打印出值,我只需要合并算法方面的帮助。

I have two sets of ranges, represented by [ start, stop ] values. Some of the ranges overlap, meaning that the start of one range is in between the [ start, stop ] of the other range. I'd like to make a new set of ranges that has no such overlap, and also doesn't include any new values in a range.

The ranges look like this:

@starts  @ends
      5    108 
      5    187
     44    187
     44    229 
     44    236 
     64    236 
    104    236
    580    644
    632    770

The output that I expect would be this:

@starts  @ends
      5    236
    580    770

This is because the first seven ranges overlap with the interval from 5 => 236, and the last two overlap with the interval from 632 => 770.

Here's the code that I tried:

$fix = 0;
foreach (@ends) {  
    if ($starts[$fix + 1] < $ends[$fix]) {
        splice(@ends, $fix, $fix);
        splice(@starts, $fix + 1, $fix + 1);
    } else {
        $fix += 1;
    }
}

I can print out the values myself, I just need help with the algorithm for merging.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

弥枳 2024-11-25 01:41:05

这会就地编辑数组,当它们重叠时简单地折叠边界。

# Since they're sorted by @starts, accept the 0th interval, start at 1
for (1..$#starts) {
    # extra check on array bounds, since we edit in-place
    last unless $_ < @starts;
    # don't need to collapse if no overlap with previous end
    next unless $starts[$_] <= $ends[$_-1];
    # delete this start and the previous end
    splice(@starts,$_,1);
    splice(@ends,$_-1,1);
    # rerun this loop for the same value of $_ since it was deleted
    redo;
}

This edits your arrays in-place, simply collapsing boundaries when they overlap.

# Since they're sorted by @starts, accept the 0th interval, start at 1
for (1..$#starts) {
    # extra check on array bounds, since we edit in-place
    last unless $_ < @starts;
    # don't need to collapse if no overlap with previous end
    next unless $starts[$_] <= $ends[$_-1];
    # delete this start and the previous end
    splice(@starts,$_,1);
    splice(@ends,$_-1,1);
    # rerun this loop for the same value of $_ since it was deleted
    redo;
}
浮光之海 2024-11-25 01:41:05

我认为这就是你想要的。您有一系列 [start,stop] 形式的范围,并且您想要合并重叠的范围。下面的方法相当简单。

  1. 有两组范围,
    原始集和合并集。
  2. 您将第一个范围添加到集合中
    合并(非重叠)范围。
    对于每个候选范围从
    原始设置,你做出选择:

    • 如果该候选项与合并集中已有的候选项重叠,则可以适当扩展合并集中的范围边界。
    • 如果候选范围与合并集中的任何范围之间没有重叠,则将候选范围添加到合并集中。

希望这是有道理的。从你的问题来看,这并不是你想要的,所以如果这不正确,请告诉我。

#!/usr/bin/perl

use strict;
use warnings;

my @starts = qw/ 5 5 44 44 44 64 104 580 632 /;
my @ends   = qw/ 108 187 187 229 236 236 236 644 770 /;

my @ranges;
while ( @starts && @ends ) {
    my $s = shift @starts;
    my $e = shift @ends;
    push @ranges, [ $s, $e ];
}

my @merged_ranges;
push @merged_ranges, shift @ranges;

foreach my $range (@ranges) {
    my $overlap = 0;
    foreach my $m_range (@merged_ranges) {
        if ( ranges_overlap($range,$m_range) ) {
            $overlap = 1;
            $m_range = merge_range($range,$m_range);
        }
    }
    if ( !$overlap ) {
        push @merged_ranges, $range;
    }
}

print join ' ', qw/ start end /;
print "\n";
foreach my $range (@merged_ranges) {
    print join ' ', ( $range->[0], $range->[1] );
    print "\n";
}

sub ranges_overlap {
    my $r1 = shift;
    my $r2 = shift;

    return ( $r1->[0] <= $r2->[1] && $r2->[0] <= $r1->[1] );
}

sub merge_range {
    my $r1 = shift;
    my $r2 = shift;
    use List::Util qw/ min max/;

    my $merged = [ min($r1->[0],$r2->[0]), max($r1->[1],$r2->[1]) ];
    return $merged;
}

I think that this is what you want. You have a series of ranges of the form [start,stop], and you'd like to merge the overlapping ranges. The approach below is fairly simple.

  1. There are two sets of ranges, the
    original set and the merged set.
  2. You add the first range to the set
    of merged (non-overlapping) ranges.
    For each candidate range left from
    the original set, you make a choice:

    • If that candidate overlaps with one already in the merged set, you extend the boundaries of the range in the merged set appropriately.
    • If there is no overlap between the candidate range and any range in the merged set, you add the candidate to the merged set.

Hopefully this makes sense. It's not too obvious from your question that this is what you wanted, so let me know if this isn't right.

#!/usr/bin/perl

use strict;
use warnings;

my @starts = qw/ 5 5 44 44 44 64 104 580 632 /;
my @ends   = qw/ 108 187 187 229 236 236 236 644 770 /;

my @ranges;
while ( @starts && @ends ) {
    my $s = shift @starts;
    my $e = shift @ends;
    push @ranges, [ $s, $e ];
}

my @merged_ranges;
push @merged_ranges, shift @ranges;

foreach my $range (@ranges) {
    my $overlap = 0;
    foreach my $m_range (@merged_ranges) {
        if ( ranges_overlap($range,$m_range) ) {
            $overlap = 1;
            $m_range = merge_range($range,$m_range);
        }
    }
    if ( !$overlap ) {
        push @merged_ranges, $range;
    }
}

print join ' ', qw/ start end /;
print "\n";
foreach my $range (@merged_ranges) {
    print join ' ', ( $range->[0], $range->[1] );
    print "\n";
}

sub ranges_overlap {
    my $r1 = shift;
    my $r2 = shift;

    return ( $r1->[0] <= $r2->[1] && $r2->[0] <= $r1->[1] );
}

sub merge_range {
    my $r1 = shift;
    my $r2 = shift;
    use List::Util qw/ min max/;

    my $merged = [ min($r1->[0],$r2->[0]), max($r1->[1],$r2->[1]) ];
    return $merged;
}
披肩女神 2024-11-25 01:41:05

由于数组是按开始排序的,因此最简单的方法是从末尾开始:

# this assumes at least one element in @starts, @ends
my $n = $#starts;
for (my $i = $#starts - 1; $i >= 0; $i--) {
    if ($ends[$i] < $starts[$n]) {
        # new interval
        $n--;
        ($starts[$n], $ends[$n]) = ($starts[$i], $ends[$i]);
    } else {
        # merge intervals - first scan for how far back to go
        while ($n < $#starts && $ends[$i] < $starts[$n+1]) {
            $n++;
        }
        $starts[$n] = $starts[$i];
    }
}
@starts = @starts[$n..$#starts];
@ends   = @ends[$n..$#ends];

Since the arrays are ordered by start, then the easiest is to work from the end:

# this assumes at least one element in @starts, @ends
my $n = $#starts;
for (my $i = $#starts - 1; $i >= 0; $i--) {
    if ($ends[$i] < $starts[$n]) {
        # new interval
        $n--;
        ($starts[$n], $ends[$n]) = ($starts[$i], $ends[$i]);
    } else {
        # merge intervals - first scan for how far back to go
        while ($n < $#starts && $ends[$i] < $starts[$n+1]) {
            $n++;
        }
        $starts[$n] = $starts[$i];
    }
}
@starts = @starts[$n..$#starts];
@ends   = @ends[$n..$#ends];
婴鹅 2024-11-25 01:41:05

我不熟悉 PERL,但以下伪代码解决方案可能可以轻松修改:

for(i=0; i<N;){
    //we know that the next merged interval starts here:
    start = starts[i]
    end   = ends[i]

    for(i=i+1; i < N && starts[i] < end; i++){  //perhaps you want <= ?
        end = maximum(end, ends[i]);
    }

    add (start, end) to merged array
}

I am not fluent in PERL, but the following pseudocode solution can probably be easily adapted:

for(i=0; i<N;){
    //we know that the next merged interval starts here:
    start = starts[i]
    end   = ends[i]

    for(i=i+1; i < N && starts[i] < end; i++){  //perhaps you want <= ?
        end = maximum(end, ends[i]);
    }

    add (start, end) to merged array
}
亢潮 2024-11-25 01:41:05

这怎么样?

#!perl

use strict;
use warnings;

my @starts = qw(5   5   44  44  44  64  104 580 632);
my @ends =   qw(108 187 187 229 236 236 236 644 770);

my @starts_new;
my @ends_new;

if ((scalar @starts) ne (scalar @ends)) {
    die "Arrays are not of equal length!\n";
}

my %ranges;
my $next_i = 0;
for (my $i=0; $i <= $#starts; $i=$next_i) {
    # If nothing changes below, the next array item we'll visit is the next sequential one.
    $next_i = $i + 1;

    # Init some temp stuff.
    my $start = $starts[$i]; # this one shouldn't change during this "for $i" loop
    my $end = $ends[$i];
    for (my $j=$i+1; $j <= $#ends; $j++) {
        if ($starts[$j] <= $end) {
            # This item further down the @starts array is actually less than
            # (or equal to) the current $end.
            # So, we need to "skip" this item in @starts and update
            # $end to reflect the corresponding entry in @ends.
            $next_i = $j +1;
            $end = $ends[$j] if ($ends[$j] > $end);
        }
    }
    # We have a valid start/end pair.
    push (@starts_new, $start);
    push (@ends_new, $end);
}

for (my $i=0; $i <= $#starts_new; $i++) {
    print "$starts_new[$i], $ends_new[$i]\n";
}

How's this?

#!perl

use strict;
use warnings;

my @starts = qw(5   5   44  44  44  64  104 580 632);
my @ends =   qw(108 187 187 229 236 236 236 644 770);

my @starts_new;
my @ends_new;

if ((scalar @starts) ne (scalar @ends)) {
    die "Arrays are not of equal length!\n";
}

my %ranges;
my $next_i = 0;
for (my $i=0; $i <= $#starts; $i=$next_i) {
    # If nothing changes below, the next array item we'll visit is the next sequential one.
    $next_i = $i + 1;

    # Init some temp stuff.
    my $start = $starts[$i]; # this one shouldn't change during this "for $i" loop
    my $end = $ends[$i];
    for (my $j=$i+1; $j <= $#ends; $j++) {
        if ($starts[$j] <= $end) {
            # This item further down the @starts array is actually less than
            # (or equal to) the current $end.
            # So, we need to "skip" this item in @starts and update
            # $end to reflect the corresponding entry in @ends.
            $next_i = $j +1;
            $end = $ends[$j] if ($ends[$j] > $end);
        }
    }
    # We have a valid start/end pair.
    push (@starts_new, $start);
    push (@ends_new, $end);
}

for (my $i=0; $i <= $#starts_new; $i++) {
    print "$starts_new[$i], $ends_new[$i]\n";
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文