How do I determine which range a given value belongs to?
I have two data sets: 'Data 1' and 'Data 2'. Could you please help me to find, for each value of posi in 'Data 1', the ranges in 'Data 2' where posi lies between Star_posi and end_posi.
Data 1
Num posi
1 2
2 14
3 18
4 19
... ...
Data 2
Num Star_posi End_posi
1 1 10
2 3 15
3 17 21
4 23 34
... ... ...
Output
- Data 1 at posi 2 contained in Data 2 between star_posi 1 and end_posi 10.
- Data 1 at posi 14 contained in Data 2 between star_posi 3 and end_posi 15.
I want to identify the rows in Data 2 where the value in Data 1 is contained in the range of the row in Data 2. I made the script below, but I did not get far.
#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dump qw(dump);
#Sort the position**************
my (@posi1, $Num2, @Num2, @Num1);
open(POS1,"<posi.txt");
@posi1=<POS1>;
@Num1=@posi1;
open(LIST,">list.txt"); {
@Num2= sort {$a <=> $b} @Num1;
$Num2 = join( '', @Num2);
print $Num2;
print LIST $Num2."\n";
}
close(LIST);
I would appreciate it if you could give some pointers.
Comments (2)
Your code is a mess. Also, it does not address your problem in any way.
What you want to do is split the lines from the file in a while loop, storing them in a hash. Once you have the values, you can easily compare them with the < and > operators to see in what ranges they fall.
Note that I am skipping (next) any lines which do not start with numbers, e.g. headers, empty lines and other stuff we don't want in the data.
Now you will have the values in the hashes, and can perform what tests you need.
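A minimal sketch of the approach described above, not the answerer's original code. It assumes whitespace-separated columns as in the question and treats both range ends as inclusive (>= and <=). The helper names parse_positions, parse_ranges and find_matches are made up for illustration, and the sample rows are passed in as lists so the sketch runs without any input files. With real files you would read the lines in a while (<$fh>) loop instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse "Num posi" lines into a hash: Num => posi.
# Lines that do not start with a digit (headers, blank lines, "...") are skipped.
sub parse_positions {
    my @lines = @_;
    my %posi;
    for my $line (@lines) {
        next unless $line =~ /^\d/;
        my ($num, $pos) = split ' ', $line;
        $posi{$num} = $pos;
    }
    return %posi;
}

# Parse "Num Star_posi End_posi" lines into a hash: Num => [start, end].
sub parse_ranges {
    my @lines = @_;
    my %range;
    for my $line (@lines) {
        next unless $line =~ /^\d/;
        my ($num, $start, $end) = split ' ', $line;
        $range{$num} = [ $start, $end ];
    }
    return %range;
}

# Compare every position against every range and report each hit.
# Assumption: a position equal to the start or end counts as inside.
sub find_matches {
    my ($posi, $range) = @_;
    my @out;
    for my $p (sort { $a <=> $b } keys %$posi) {
        my $value = $posi->{$p};
        for my $r (sort { $a <=> $b } keys %$range) {
            my ($start, $end) = @{ $range->{$r} };
            if ($value >= $start && $value <= $end) {
                push @out, "Data 1 at posi $value contained in Data 2 "
                         . "between star_posi $start and end_posi $end.";
            }
        }
    }
    return @out;
}

# Sample rows copied from the question.
my %posi  = parse_positions("Num posi", "1 2", "2 14", "3 18", "4 19");
my %range = parse_ranges("Num Star_posi End_posi",
                         "1 1 10", "2 3 15", "3 17 21", "4 23 34");
print "$_\n" for find_matches(\%posi, \%range);
```

This checks every position against every range, which is fine for small files. If the ranges can overlap, as rows 1 and 2 of Data 2 do, a position may be reported more than once.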
Good luck!
You should have a look at the CPAN module called Tie::RangeHash, which is for exactly this sort of problem.