awk command to combine lines and sum them up

Posted on 2024-10-30 22:44:32

This is the format I have:

Source IP       Destination IP    Received Sent
192.168.0.1     10.10.10.1        3412     341
192.168.0.1     10.10.10.1        341      43
192.168.0.1     10.22.22.2        34       334
192.168.0.1     192.168.9.3       34       243

But the file is very large. I basically want the total bandwidth for each source IP. So I need to group the rows by unique source IP, then sum the Received column and the Sent column within each group. The end result would be:

source ip - total received packets - total sent packets

It would also be nice to group by source and destination IP together, so I could also get:

source ip - destination ip - total received packets - total sent packets

Any help would be greatly appreciated.

入怼 2024-11-06 22:44:32

Just looking at the source IP:

awk '
    NR == 1 {next}                # skip the header row
    {
        recv[$1] += $3            # total received per source IP
        sent[$1] += $4            # total sent per source IP
    }
    END {for (ip in recv) printf("%s - %d - %d\n", ip, recv[ip], sent[ip])}
' filename

For source/destination pairs, only a slight modification is needed:

awk '
    NR == 1 {next}
    {
        key = $1 " - " $2
        recv[key] += $3
        sent[key] += $4
    }
    END {for (key in recv) printf("%s - %d - %d\n", key, recv[key], sent[key])}
' filename
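
One thing to keep in mind with both scripts: awk does not define the order in which `for (key in array)` visits the keys, so the report lines can come out in any order. A minimal sketch, assuming a stable, sorted report is wanted, is to pipe the result through sort. The function name sum_by_source and the file name traffic.txt are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: per-source totals, sorted for stable output.
sum_by_source() {
    awk '
        NR == 1 { next }                      # skip the header row
        { recv[$1] += $3; sent[$1] += $4 }    # accumulate per source IP
        END {
            for (ip in recv)
                printf("%s - %d - %d\n", ip, recv[ip], sent[ip])
        }
    ' "$1" | LC_ALL=C sort                    # byte-order sort, independent of locale
}

# Usage: sum_by_source traffic.txt
```

The sort is plain text order, not numeric IP order; that is usually enough to make two runs comparable with diff.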

死开点丶别碍眼 2024-11-06 22:44:32

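Ruby (1.9+):

#!/usr/bin/env ruby
# Hashes default to 0 so totals can be accumulated with +=.
hash_recv = Hash.new(0)          # source IP => total received
hash_sent = Hash.new(0)          # source IP => total sent
hash_src_dst_recv = Hash.new(0)  # [source, destination] => total received
hash_src_dst_sent = Hash.new(0)  # [source, destination] => total sent
File.open("file") do |f|
    f.readline                   # skip the header line
    f.each do |line|
        s = line.split
        hash_recv[s[0]] += s[2].to_i
        hash_sent[s[0]] += s[-1].to_i
        hash_src_dst_recv[s[0, 2]] += s[2].to_i
        hash_src_dst_sent[s[0, 2]] += s[-1].to_i
    end
end
p hash_recv
p hash_sent
p hash_src_dst_recv
p hash_src_dst_sent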

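Test run (judging by the output, the input used here had 192.168.168.0.1 as the source on its last row, unlike the sample in the question):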

$ ruby test.rb
{"192.168.0.1"=>3787, "192.168.168.0.1"=>34}
{"192.168.0.1"=>718, "192.168.168.0.1"=>243}
{["192.168.0.1", "10.10.10.1"]=>3753, ["192.168.0.1", "10.22.22.2"]=>34, ["192.168.168.0.1", "192.168.9.3"]=>34}
{["192.168.0.1", "10.10.10.1"]=>384, ["192.168.0.1", "10.22.22.2"]=>334, ["192.168.168.0.1", "192.168.9.3"]=>243}

我三岁 2024-11-06 22:44:32

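I would do something like this (spread over several lines for readability, but it could be written on one line):

# tail -n +2 drops the header row so it is not summed as data.
tail -n +2 file.txt | sort | awk '
    BEGIN { start = 1 }
    {
        ip = $1
        if (lastip == ip) {
            # same source IP as the previous row: keep accumulating
            sum_r += $3; sum_s += $4
        } else {
            # new source IP: print the finished group, except before the first row
            if (!start) print lastip ": " sum_r ", " sum_s
            else
                start = 0
            lastip = ip; sum_r = $3; sum_s = $4
        }
    }
    END { print lastip ": " sum_r ", " sum_s }'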
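
The same sort-then-stream idea could be extended to source/destination pairs, which the question also asks about. The following is only a sketch under that assumption: the function name sum_sorted_pairs is hypothetical, and the output format simply mirrors the one used above. Because the rows arrive sorted, awk only has to hold one running total at a time instead of one array entry per key.

```shell
#!/bin/sh
# Hypothetical helper: totals per source/destination pair on sorted input.
sum_sorted_pairs() {
    # Drop the header, then sort on the first two fields so that
    # rows belonging to the same pair end up next to each other.
    tail -n +2 "$1" | LC_ALL=C sort -b -k1,1 -k2,2 | awk '
        {
            pair = $1 " " $2
            if (NR > 1 && pair != last) {     # pair changed: print the finished group
                print last ": " sum_r ", " sum_s
                sum_r = 0; sum_s = 0
            }
            last = pair
            sum_r += $3; sum_s += $4
        }
        END { if (NR > 0) print last ": " sum_r ", " sum_s }
    '
}

# Usage: sum_sorted_pairs file.txt
```

The trade-off is the cost of the sort itself; in exchange, awk's memory use stays constant no matter how many distinct pairs the file contains.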

生生漫 2024-11-06 22:44:32

 awk 'FNR > 1 {
       # FNR > 1 skips the header row on both passes
       if (NR==FNR){
         # first pass: accumulate totals per source/destination pair
         Received[$1,$2]+=$3; Sent[$1,$2]+=$4
       }else{
           if(($1,$2) in Received){
             # second pass: print each pair once, then delete it
             print $1 " " $2 " " Received[$1,$2] " " Sent[$1,$2]
             delete Received[$1,$2]
           }
       }
      }' InputFile.txt InputFile.txt

InputFile.txt is given twice on the command line so that awk reads it twice.
The first pass (the NR==FNR branch) builds the two arrays. The second pass (the else branch) prints each source/destination combination and deletes its entry from Received so that it is not printed again.

Glenn's solution is far superior, since it reads the file only once.
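
The comma-separated subscripts used above (`[$1,$2]`) work because awk joins the two values into a single string key, separated by the built-in SUBSEP character. As a rough sketch of how the same pair totals could be produced in a single pass, the key can be taken apart again with split() in the END block. The function name sum_by_pair is hypothetical, and the final sort is only there to give a stable order.

```shell
#!/bin/sh
# Hypothetical helper: pair totals in one pass, using SUBSEP-joined keys.
sum_by_pair() {
    awk '
        FNR == 1 { next }                     # skip the header row
        { recv[$1, $2] += $3; sent[$1, $2] += $4 }
        END {
            for (key in recv) {
                split(key, ip, SUBSEP)        # ip[1] = source, ip[2] = destination
                print ip[1], ip[2], recv[key], sent[key]
            }
        }
    ' "$1" | LC_ALL=C sort
}

# Usage: sum_by_pair InputFile.txt
```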
