How can we sort user data on column 1 and column 2 using Perl?
I am new to Perl programming.
I want to read data from a file, sort the records on column 1 and then on column 2 (removing repeated records), and store the sorted records in another file. The following is my data.
The first and second columns are separated by a tab.
user1 name user2 name
abc xyz
adc xyz
abc xyz
pqr tyu
xyz abc
tyu pqr
abc pqr
In this example, I want to sort the records first on user1 name and then on user2 name, and remove repeated records while sorting.
The output should be as follows:
user1 name user2 name
abc pqr
abc xyz
adc xyz
pqr tyu
tyu pqr
xyz abc
Please let me know how we can implement this in Perl.
It all depends on how you store your data. I'm not sure how you plan to store your information, since you're in a class and may or may not have learned about references yet. For example, if you don't know references, you might do something like this:
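A minimal sketch of the single-string approach, assuming each input line holds two tab-separated names. The helper name lines_to_keys and the sample lines are hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn tab-separated lines into "user1:user2" strings.
sub lines_to_keys {
    my @lines = @_;                 # copy, so chomp never touches the caller's data
    my @keys;
    for my $line (@lines) {
        chomp $line;
        my ($user1, $user2) = split /\t/, $line;
        push @keys, "$user1:$user2";
    }
    return @keys;
}

my @records = lines_to_keys("abc\txyz\n", "adc\txyz\n");
print "$_\n" for @records;
```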
This stores both values as a single string, which is quite common if you don't know about references.
If you do know about references, you'd probably do this:
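A sketch of the reference-based variant, under the same assumption of two tab-separated names per line. Each record becomes a reference to a two-element array; lines_to_records is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: each record becomes a reference to [user1, user2].
sub lines_to_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        chomp $line;
        my ($user1, $user2) = split /\t/, $line;
        push @records, [ $user1, $user2 ];
    }
    return @records;
}

my @records = lines_to_records("abc\txyz\n", "adc\txyz\n");
print "$_->[0] / $_->[1]\n" for @records;
```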
What I can tell you is that the sort subroutine lets you supply a function to compare and order values. When you use your own function in sort, you get two values, $a and $b, which represent the values being compared. You can manipulate these, and then return -1 if $a is less than $b, 1 if $a is greater than $b, or 0 if they're equal. Perl gives you two operators, <=> (for numbers) and cmp (for strings), to make this a bit easier.
Let's assume you're storing the values as $user1:$user2, since you haven't learned about references yet. Your sort routine, and the sort call that uses it, might look something like this:
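As a sketch of the piece-by-piece comparison described above, assuming records stored as user1:user2 strings. The routine name by_user1_then_user2 is hypothetical, and the string operators lt and gt are used because the names are text rather than numbers.

```perl
use strict;
use warnings;

# Hypothetical comparison routine for "user1:user2" strings.
# sort sets the package variables $a and $b before each call.
sub by_user1_then_user2 {
    my ($a_user1, $a_user2) = split /:/, $a;
    my ($b_user1, $b_user2) = split /:/, $b;

    return -1 if $a_user1 lt $b_user1;
    return  1 if $a_user1 gt $b_user1;

    # First columns are equal, so the second column decides.
    return -1 if $a_user2 lt $b_user2;
    return  1 if $a_user2 gt $b_user2;
    return  0;
}

my @unsorted = ('pqr:tyu', 'abc:xyz', 'abc:pqr');
my @sorted   = sort by_user1_then_user2 @unsorted;
print "$_\n" for @sorted;
```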
Note: This is not the way I'd personally do it. I'd probably use the built-in cmp operator and take some shortcuts. However, I wanted to take this apart piece by piece so you can understand it.
By the way, if the teacher decides you should sort the second column before the first, you can easily modify your sort subroutine by just swapping the less-than and greater-than signs around.
Here's my test program:
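A possible end-to-end sketch, assuming tab-separated input whose first line is a header. It uses the cmp shortcut mentioned in the note, removes duplicates with a hash, and reads and writes in-memory filehandles so it runs without any files. The names sort_unique_records and by_user1_then_user2 are hypothetical.

```perl
use strict;
use warnings;

# Comparison routine for "user1:user2" strings, using cmp as a shortcut.
sub by_user1_then_user2 {
    my ($a_user1, $a_user2) = split /:/, $a;
    my ($b_user1, $b_user2) = split /:/, $b;
    return $a_user1 cmp $b_user1 || $a_user2 cmp $b_user2;
}

# Hypothetical helper: read tab-separated records from $in, drop duplicates,
# sort them, and write them to $out. The first line is assumed to be a header.
sub sort_unique_records {
    my ($in, $out) = @_;

    my $header = <$in>;
    return unless defined $header;

    my %seen;
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($user1, $user2) = split /\t/, $line;
        $seen{"$user1:$user2"} = 1;      # a hash key can only exist once
    }

    print {$out} $header;
    for my $record (sort by_user1_then_user2 keys %seen) {
        my ($user1, $user2) = split /:/, $record;
        print {$out} "$user1\t$user2\n";
    }
    return;
}

# Demonstration with in-memory filehandles; real code would open files instead.
my $input = join("\n",
    "user1 name\tuser2 name", "abc\txyz", "adc\txyz", "abc\txyz",
    "pqr\ttyu", "xyz\tabc", "tyu\tpqr", "abc\tpqr") . "\n";
open my $in,  '<', \$input     or die "Cannot read input: $!";
open my $out, '>', \my $output or die "Cannot write output: $!";
sort_unique_records($in, $out);
close $out;
print $output;
```

To work on real files, the two open calls would take file names instead of scalar references.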
Maybe not production code worthy, but here's an approach:
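One way such an approach might look, as a sketch only. It assumes the header is the only line starting with "user", and the pattern is loosened to /^\s*user/ so that it matches with or without leading whitespace. The helper name sort_with_header_first is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return unique lines, sorted so that the header
# (assumed to be the only line starting with "user") comes first.
sub sort_with_header_first {
    my %seen;
    my @unique = grep { !$seen{$_}++ } @_;   # keep the first copy of each line
    return sort {
        ($a !~ /^\s*user/) <=> ($b !~ /^\s*user/)   # header gives 0, other lines 1
            or $a cmp $b                            # then plain string comparison
    } @unique;
}

my @lines = (
    "user1 name\tuser2 name\n", "abc\txyz\n", "adc\txyz\n", "abc\txyz\n",
    "pqr\ttyu\n", "xyz\tabc\n", "tyu\tpqr\n", "abc\tpqr\n",
);
print sort_with_header_first(@lines);
```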
Output:
The most unconventional part here is the $a !~ /^ user/ <=> $b !~ /^ user/ sort condition. $a !~ /^ user/ evaluates to 1 (true) for all lines except the first, where it evaluates to 0 (false). The header is therefore put first, and the remaining lines fall through to the second sort condition, which produces the desired result.
Or it could be as simple as:
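A sketch of what a very simple version could look like, assuming whole lines can be compared as plain strings and that any header line is handled separately (a plain sort would not keep it on top). The helper name simple_unique_sort is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: plain string sort of whole lines, duplicates removed.
sub simple_unique_sort {
    my %seen;
    my @unique = grep { !$seen{$_}++ } @_;
    return sort @unique;
}

my @lines = ("abc\txyz\n", "adc\txyz\n", "abc\txyz\n", "pqr\ttyu\n",
             "xyz\tabc\n", "tyu\tpqr\n", "abc\tpqr\n");
print simple_unique_sort(@lines);
```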
But only if your data is as simple as this. If the data in each column varies in length, each column must be as wide as its longest item, like so:
In this case the simple sort still works, but note that these columns are padded out with spaces, not tabs.
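If the input is tab-separated, one possible way to get such space-padded columns is sprintf with a left-justified width. This is a sketch under the assumption of two columns per line and lines without trailing newlines; pad_columns and the sample names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: pad both columns with spaces to the width of the
# longest entry in each column, separated by a single space.
sub pad_columns {
    my @rows  = map { [ split /\t/, $_ ] } @_;
    my @width = (0, 0);
    for my $row (@rows) {
        for my $i (0, 1) {
            my $len = length $row->[$i];
            $width[$i] = $len if $len > $width[$i];
        }
    }
    my @padded;
    for my $row (@rows) {
        # %-*s left-justifies the value in a field whose width is passed as an argument
        push @padded, sprintf("%-*s %-*s", $width[0], $row->[0], $width[1], $row->[1]);
    }
    return @padded;
}

my @padded = pad_columns("alexander\txyz", "al\tabc");
print "$_\n" for sort @padded;
```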