Evaluating a command with Awk
The problem is this: I have several txt files, each of which records a timestamp and an IP address for every malware packet that arrives at a server. What I want to do is create another txt file that shows, for every IP, the first time a malware packet arrived.
In general, I want to do something like this:
    for every line in file.txt
        if (ip is not present in list.txt)
            copy timestamp and ip in list.txt
I'm using awk to do it. The main problem is the "if ip is not present in list.txt" check.
I'm doing this:
    { a=$( grep -w "$3" list.txt | wc -c );
    if ( a == 0 )
    {
    #copy timestamp and ip in list.txt
    }
(I'm using $3 because the IP address is in the third column of the source file.)
I don't know how to make awk evaluate the grep command. I've also tried backticks, but that didn't work. Could someone give me a hint?
I'm testing my script on a test file like this:
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
13 192.168.1.1
13 192.168.1.2
13 122.11.22.11
14 122.11.22.11
15 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
What I should obtain is:
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
Thanks to your help, I've succeeded in creating a script that fits my needs:
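awk '
    # First file (list.txt): remember every IP that is already recorded.
    FILENAME == ARGV[1] {
        ip[$2] = 1
        next
    }
    # Second file (file.txt): append each line whose IP has not been seen yet.
    ! ($2 in ip) {
        print $1, $2 >> ARGV[1]
        ip[$2] = 1
    }
' list.txt file.txt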
Comments (4)
Interpreting the question as "How can I evaluate the status of a command from within awk?", just use system.
So, in your case, just do:
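A minimal sketch of that idea, assuming the two-column sample data from the question (timestamp in $1, IP in $2). The temporary directory, the file names and the exact grep flags are assumptions made for illustration, and the input is assumed to be trusted, since the IP is pasted into a shell command:

```shell
# Sketch only: paths and grep flags are illustrative assumptions.
tmp=$(mktemp -d)
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' '13 192.168.1.1' > "$tmp/file.txt"
: > "$tmp/list.txt"    # start with an empty list

awk -v list="$tmp/list.txt" '
{
    # grep -q prints nothing; its exit status is 0 on a match, non-zero otherwise.
    # -F treats the IP as a fixed string, -w requires a whole-word match.
    cmd = "grep -qwF -- \"" $2 "\" \"" list "\""
    if (system(cmd) != 0) {
        print $1, $2 >> list
        close(list)    # flush now so the next grep sees this line
    }
}' "$tmp/file.txt"

cat "$tmp/list.txt"
```

With this input, list.txt ends up holding the lines for 192.168.1.1 and 192.168.1.2 once each. Note that one grep process is started per input line.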
You might want to reconsider your approach to the problem, though. Grepping each time is computationally expensive, and there are better ways to approach the problem. (Read list.txt once into an array, for example.)
Also, note that you do not need to use wc. grep exits with a non-zero status if it doesn't match the string. Use the return value rather than parsing the output.
This will save the result of execution into variable a
But really what you want to do is get awk to read the list.txt file first, then process the other file with the list.txt data in memory. This will allow you to avoid calling system() for each line. I assume the IP is in the 1st column of list.txt.
When you say "copy timestamp and ip in list.txt", I assume you want to append some info from the current line of file.txt to the list.txt file.
Given the sample file and simplified requirements of your question update:
will produce the results you've seen. That awk program will print the line if the IP has not yet been seen.
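Both ideas can be sketched as follows, assuming the two-column layout of the sample data (timestamp in $1, IP in $2) and, for the second command, a list.txt with the IP in its first column. The array names seen and known, the temporary directory and the output file names are placeholders, not taken from the thread:

```shell
tmp=$(mktemp -d)
cat > "$tmp/file.txt" <<'EOF'
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
13 192.168.1.1
13 192.168.1.2
13 122.11.22.11
14 122.11.22.11
15 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
EOF

# seen[$2]++ evaluates to 0 (false) the first time an IP appears, so
# !seen[$2]++ is true only for the first line carrying that IP; with no
# action given, awk prints the matching line.
awk '!seen[$2]++' "$tmp/file.txt" > "$tmp/first_seen.txt"
cat "$tmp/first_seen.txt"

# Variant with IPs already recorded: load list.txt (IP in column 1) into
# memory first, then print only lines from file.txt whose IP is still unknown.
printf '%s\n' '192.168.1.1' '122.11.22.11' > "$tmp/list.txt"
awk '
    FILENAME == ARGV[1] { known[$1] = 1; next }
    !($2 in known)      { print; known[$2] = 1 }
' "$tmp/list.txt" "$tmp/file.txt" > "$tmp/new_only.txt"
cat "$tmp/new_only.txt"
```

The first command reproduces the expected output listed in the question. Neither command starts an external process per line.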
You want to use getline:
That takes the output of date and puts it into the current_time variable. You should be able to do the same with your grep | wc -l.
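A rough sketch of the command | getline form, first with date and then with grep | wc -l. The helper name count_matches, the variable a, the temporary directory and the grep flags are assumptions made for illustration:

```shell
# Read the first line printed by date into current_time.
awk 'BEGIN {
    cmd = "date"
    if ((cmd | getline current_time) > 0)
        print "now: " current_time
    close(cmd)
}'

tmp=$(mktemp -d)
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' > "$tmp/list.txt"

# Hypothetical helper: print how many lines of the list file ($2) contain the IP ($1).
count_matches() {
    awk -v ip="$1" -v list="$2" 'BEGIN {
        cmd = "grep -wF -- \"" ip "\" \"" list "\" | wc -l"
        cmd | getline a    # a receives the line count printed by wc
        close(cmd)         # close so the same command string can be run again
        print a + 0        # adding 0 drops any padding that wc emits
    }'
}

count_matches 192.168.1.1 "$tmp/list.txt"
count_matches 10.0.0.1 "$tmp/list.txt"
```

Calling close(cmd) matters when the same command string is run more than once. Without it, a later getline on that string would continue reading the already exhausted pipe instead of starting the command again.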