How do I determine which range a given value falls in?

Posted on 2024-12-01 08:24:05

I have two data sets, 'Data 1' and 'Data 2'. For each value of posi in 'Data 1', could you please help me find the ranges in 'Data 2' where posi lies between Star_posi and End_posi?

Data 1

  Num     posi 
   1        2 
   2        14
   3        18
   4        19
  ...      ...

Data 2

 Num      Star_posi    End_posi
  1          1            10
  2          3            15
  3          17           21
  4          23           34
 ...       ...           ...

Output

  1. Data 1 at posi 2 contained in Data 2 between star_posi 1 and end_posi 10.
  2. Data 1 at posi 14 contained in Data 2 between star_posi 3 and end_posi 15.

I want to identify the rows in Data 2 whose range contains the value from Data 1. I wrote the script below, but I did not get far.

   #!/usr/bin/perl -w
   use strict;
   use warnings;
   use Data::Dump qw(dump);

   #Sort the position**************

   my (@posi1, $Num2, @Num2, @Num1);
   open(POS1,"<posi.txt");
   @posi1=<POS1>;
   @Num1=@posi1;
   open(LIST,">list.txt"); {
   @Num2= sort {$a <=> $b} @Num1;
   $Num2 = join( '', @Num2);
   print $Num2;
   print LIST $Num2."\n";
   }
   close(LIST); 

I would appreciate it if you could give some pointers.

Comments (2)

美人如玉 2024-12-08 08:24:05

Your code is a mess. Also, it does not address your problem in any way.

What you want to do is split the lines from the file in a while loop, storing them in a hash. Once you have the values, you can easily compare them with the < and > operators to see in what ranges they fall.

use strict;
use warnings;
use autodie;

my (%data1,%data2);


open my $in, '<', 'data1.txt';
while (<$in>) {
    next unless /^\s*\d/;
    my ($num, $posi) = split;
    $data1{$num} = $posi;
}

open $in, '<', 'data2.txt';
while (<$in>) {
    next unless /^\s*\d/;
    my ($num, $star, $end) = split;
    $data2{$num}{'star'} = $star;
    $data2{$num}{'end'}  = $end;
}
close $in;

Note that I am skipping (next) any lines which do not start with numbers, e.g. headers and empty lines and other stuff we don't want in the data.
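
To see what that filter keeps and drops, here is a minimal sketch that runs the same `next unless` test over a few lines held in memory instead of a file. The `@lines` array is a hypothetical stand-in for the contents of data2.txt, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical in-memory copy of 'Data 2', including a header and a blank line.
my @lines = (
    " Num      Star_posi    End_posi",
    "",
    "  1          1            10",
    "  2          3            15",
    " ...       ...           ...",
);

my %data2;
for (@lines) {
    next unless /^\s*\d/;              # skip the header, the blank line and the '...' row
    my ($num, $star, $end) = split;    # split $_ on whitespace, ignoring leading spaces
    $data2{$num} = { star => $star, end => $end };
}

# Only the two numeric rows survive.
print join(',', sort { $a <=> $b } keys %data2), "\n";   # prints "1,2"
```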

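Now you will have the values in the hashes and can perform whatever tests you need. For example:

for my $num (sort { $a <=> $b } keys %data1) {    # numeric order for stable output
    my $val = $data1{$num};
    for my $num2 (sort { $a <=> $b } keys %data2) {
        my $min = $data2{$num2}{'star'};
        my $max = $data2{$num2}{'end'};
        # Strict comparison: a value equal to either bound does not match.
        # Use >= and <= if the bounds should count as inside the range.
        if ( ($val > $min) and ($val < $max) ) {
            print "Data 1 at posi $val contained in Data 2 between star_posi $min and end_posi $max.\n";
            last;    # stop at the first matching range
        }
    }
}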
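
The ranges in your sample overlap (1 to 10 and 3 to 15), and the loop above stops at the first match with strict comparisons. If you would rather list every matching row and treat the bounds as part of the range, something along these lines might work. This is only a sketch: the hashes are filled in by hand from the tables in the question, and `ranges_containing` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Sample values copied from the question's tables (assumed, not read from files).
my %data1 = (1 => 2, 2 => 14, 3 => 18, 4 => 19);
my %data2 = (
    1 => { star => 1,  end => 10 },
    2 => { star => 3,  end => 15 },
    3 => { star => 17, end => 21 },
    4 => { star => 23, end => 34 },
);

# Return the Num of every Data 2 row whose range contains $val (bounds inclusive).
sub ranges_containing {
    my ($val, $ranges) = @_;
    return grep { $val >= $ranges->{$_}{star} && $val <= $ranges->{$_}{end} }
           sort { $a <=> $b } keys %$ranges;
}

for my $num (sort { $a <=> $b } keys %data1) {
    my $val  = $data1{$num};
    my @hits = ranges_containing($val, \%data2);
    for my $hit (@hits) {
        print "Data 1 at posi $val contained in Data 2 between "
            . "star_posi $data2{$hit}{star} and end_posi $data2{$hit}{end}.\n";
    }
    print "Data 1 at posi $val is not contained in any Data 2 range.\n" unless @hits;
}
```

This compares every value against every range, which is fine for small files. For large inputs you would probably want to sort the ranges and search them instead.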

Good luck!

仙女 2024-12-08 08:24:05

You should have a look at the CPAN module called Tie::RangeHash, which is for exactly this sort of problem.

use Tie::RangeHash;
my $hour_name = new Tie::RangeHash Type => Tie::RangeHash::TYPE_NUMBER;

$hour_name->add(' 0, 5', 'EARLY');
$hour_name->add(' 6,11', 'MORNING');
$hour_name->add('12,17', 'AFTERNOON');
$hour_name->add('18,23', 'EVENING');

# and in a loop elsewhere...
my $name = $hour_name->fetch($hour) || "UNKNOWN";
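
If installing a CPAN module is not an option, the same kind of lookup can be sketched in plain Perl. The code below does not use Tie::RangeHash and makes no claim about how the module works internally; it is a simple linear scan over hand-written ranges, and `hour_name` and `@hour_ranges` are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

# Each entry is [low, high, label]; both bounds are treated as inclusive.
my @hour_ranges = (
    [  0,  5, 'EARLY'     ],
    [  6, 11, 'MORNING'   ],
    [ 12, 17, 'AFTERNOON' ],
    [ 18, 23, 'EVENING'   ],
);

# Hypothetical helper: return the label of the first range containing $hour,
# or 'UNKNOWN' when no range matches.
sub hour_name {
    my ($hour) = @_;
    for my $range (@hour_ranges) {
        my ($low, $high, $label) = @$range;
        return $label if $hour >= $low && $hour <= $high;
    }
    return 'UNKNOWN';
}

print hour_name($_), "\n" for (3, 9, 14, 20, 24);
```

For the question's data you would store Star_posi and End_posi as the bounds and the row's Num as the label.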