Evaluating a command with Awk

Posted on 2024-12-09 14:16:05

The problem is this: I have several txt files that record a timestamp and an IP address for every malware packet that arrives at a server. I want to create another txt file that shows, for each IP, the first time a malware packet arrived from it.

In general, I want to do something like this:

for every  line in file.txt
 if (ip is not present in list.txt)
 copy timestamp and ip in list.txt

I'm using awk to do it. The main problem is the "if ip is not present in list.txt" test.
This is what I'm doing:

 {    a=$( grep -w "$3" list.txt | wc -c );
    if ( a == 0 )
   {
     #copy timestamp and ip in list.txt
   }

(I'm using $3 because the IP address is in the third column of the source file.)

I don't know how to make awk evaluate the grep command. I've also tried backticks, but that didn't work. Could someone give me a hint?

I'm testing my script on a test file like this:

10  192.168.1.1
11  192.168.1.2
12  192.165.2.4
13  122.11.22.11    
13  192.168.1.1
13  192.168.1.2
13  122.11.22.11
14  122.11.22.11
15  122.11.22.11
15  122.11.22.144
15  122.11.2.11
15  122.11.22.111

What I should obtain is:

10  192.168.1.1
11  192.168.1.2
12  192.165.2.4
13  122.11.22.11    
15  122.11.22.144
15  122.11.2.11
15  122.11.22.111

Thanks to your help, I've succeeded in creating a script that fits my needs:

awk '
FILENAME == ARGV[1] {
    ip[$2] = 1
    next
}
! ($2 in ip) {
    print $1, $2 >> ARGV[1]
    ip[$2] = 1
}
' list.txt file.txt 
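
As a quick check, the script above can be run against the sample data. This is only a sketch: the scratch directory and the variable names `workdir`, `src`, and `list` are assumptions made for the demonstration. list.txt is created empty first, because awk has to be able to open it as its first input file.

```shell
# Scratch directory for the demonstration (paths are illustrative).
workdir=$(mktemp -d)
src="$workdir/file.txt"
list="$workdir/list.txt"

# Sample input: timestamp in column 1, IP in column 2.
cat > "$src" <<'EOF'
10  192.168.1.1
11  192.168.1.2
12  192.165.2.4
13  122.11.22.11
13  192.168.1.1
13  192.168.1.2
13  122.11.22.11
14  122.11.22.11
15  122.11.22.11
15  122.11.22.144
15  122.11.2.11
15  122.11.22.111
EOF

# The list file must exist, even if empty.
: > "$list"

awk '
FILENAME == ARGV[1] {      # first file: remember IPs that are already listed
    ip[$2] = 1
    next
}
!($2 in ip) {              # second file: this IP has not been seen yet
    print $1, $2 >> ARGV[1]
    ip[$2] = 1
}
' "$list" "$src"

cat "$list"
```

Testing `FILENAME == ARGV[1]` still works when list.txt is empty, because awk then reads no records from it and every record comes from the second file. Running the command a second time appends nothing, since every IP is already listed.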

Comments (4)

青丝拂面 2024-12-16 14:16:05

Interpreting the question as "How can I evaluate the status of a command from within awk?", just use system.

{
  if( system( "cmd" ) == 0 ) {
    # the command succeeded
  }
}

So, in your case, just do:

{
  if( system( "grep -w \"" $3 "\" list.txt > /dev/null " ) == 0 ) {
    ...
  }
}

You might want to reconsider your approach to the problem, though. Running
grep for every input line starts a new process each time, which is expensive,
and there are better ways to approach the problem. (Read list.txt once into
an array, for example.)

Also, note that you do not need to use wc. grep fails if it doesn't
match the string. Use the return value rather than parsing the output.
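
A minimal sketch of the system() approach on a few sample lines, under some assumptions: the IP is taken from column 2, as in the question's test file, and the `-q` and `-F` grep options and the `list` variable are additions for this illustration. The `close(list)` call matters, because awk buffers its output and grep would otherwise not see lines appended earlier in the same run.

```shell
# Scratch files for the demonstration (paths are illustrative).
workdir=$(mktemp -d)
src="$workdir/file.txt"
list="$workdir/list.txt"
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' '13 192.168.1.1' '15 122.11.22.111' > "$src"
: > "$list"

awk -v list="$list" '
{
    # Build the command by concatenation; $2 here is the awk field, not a shell variable.
    cmd = "grep -qwF -- \"" $2 "\" \"" list "\""
    if (system(cmd) != 0) {      # non-zero exit status: the IP is not listed yet
        print $1, $2 >> list
        close(list)              # flush, so the next grep sees this line
    }
}' "$src"

cat "$list"
```

Because the field value is pasted into a shell command, this is only reasonable for trusted input. Word matching with `-w` is also not an exact field comparison, which is one more reason to prefer an in-memory array.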

野鹿林 2024-12-16 14:16:05

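This saves the command's output in the variable a:

BEGIN {  }
{
    # Build the command by concatenation: inside an awk string, "$3" is not expanded
    cmd = "grep -w \"" $3 "\" list.txt | wc -c"
    cmd | getline a
    close(cmd)
    print a
}
END   {}

谁与争疯 2024-12-16 14:16:05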

But really what you want to do is get awk to read the list.txt file first, then process the other file with the list.txt data in memory. This will allow you to avoid calling system() for each line.

I assume the IP is in the first column of list.txt.

When you say "copy timestamp and ip in list.txt", I assume you want to append some information from the current line of file.txt to list.txt.

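awk '
    FILENAME == ARGV[1] {
        ip[$1] = 1
        next
    }
    ! ($3 in ip) {
        # whatever_column_holds_timestamp is a placeholder for the timestamp field number
        print $3, $(whatever_column_holds_timestamp) >> ARGV[1]
        ip[$3] = 1    # remember the IP so later duplicates in file.txt are skipped
    }
' list.txt file.txt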

Given the sample file and simplified requirements of your question update:

awk '! seen[$2]++' filename

will produce the output you expect. That awk program prints a line only if its IP has not been seen before.
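
To make the one-liner less cryptic, here is a small sketch that runs it next to a spelled-out equivalent on a few sample lines. The scratch file and the variable names `short` and `long` are assumptions for the illustration.

```shell
# Scratch file for the demonstration (path is illustrative).
workdir=$(mktemp -d)
src="$workdir/file.txt"
cat > "$src" <<'EOF'
10  192.168.1.1
11  192.168.1.2
13  192.168.1.1
15  122.11.22.11
15  122.11.22.11
EOF

# Short form: seen[$2]++ evaluates to 0 (false) only the first time an IP appears,
# so the negated pattern is true exactly once per IP and the line is printed.
short=$(awk '!seen[$2]++' "$src")

# Equivalent long form.
long=$(awk '{
    if (!($2 in seen)) print   # first occurrence: print the whole line
    seen[$2] = 1               # remember this IP
}' "$src")

printf '%s\n' "$short"
```

Both forms print the whole input line unchanged, so the original column spacing is kept.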

删除会话 2024-12-16 14:16:05

You want to use getline:

BEGIN {
    "date" | getline current_time
    close("date")
    print "Report printed on " current_time
}

That takes the output of date and puts it into the current_time variable. You should be able to do the same with your grep | wc -l.
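
Applied to the question, the same pattern might look like the sketch below. It is an illustration under assumptions: the helper function `count_ip`, the scratch file, and the use of `grep -c` in place of `grep | wc -l` are not from the original post.

```shell
# Scratch list file for the demonstration (path is illustrative).
workdir=$(mktemp -d)
list="$workdir/list.txt"
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' > "$list"

# count_ip IP: print how many lines of the list contain IP as a whole word.
count_ip() {
    awk -v list="$list" -v ip="$1" 'BEGIN {
        # grep -c prints the number of matching lines, so wc is not needed.
        cmd = "grep -cwF -- \"" ip "\" \"" list "\""
        n = 0
        if ((cmd | getline n) <= 0) n = 0   # getline returns 1 when it read a line
        close(cmd)                          # release the pipe before reusing the command
        print n + 0
    }'
}

count_ip 192.168.1.1
count_ip 10.0.0.1
```

Checking the return value of getline guards against the command producing no output, and closing the command lets the same command string be run again later.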
