How to add only the unique elements from multiple input text files to an array in awk
As the topic suggests, how do I read information from multiple text files and add each element to an array only once, regardless of whether it occurs multiple times across the different text files?
I have started with this script, which reads in and prints out all elements in the order in which they occur in the different documents.
For example, take a look at these 3 different text files containing the following data:
File 1:
2011-01-22 22:12 test1 22 1312 75 13.55 1399
2011-01-23 22:13 test4 22 1112 72 12.55 1499
File 2:
2011-01-24 22:14 test1 21 1322 75 23.55 1599
2011-01-25 22:15 test2 23 2312 77 33.55 1699
File 3:
2011-01-26 22:16 test2 20 1412 79 63.55 1799
2011-01-27 22:17 test5 12 1352 78 43.55 1999
I want to check whether the current element has already been added to the array, but for now my script prints out all elements.
{
    BUILDd[NR-1] = $3; len++
}
END {
    SUBSYSTEM = substr(FILENAME, 1, length(FILENAME)-7)
    LABEL = "\"" toupper(SUBSYSTEM) "\""
    print "#{"
    print "\"buildnames\": {"
    print " \"label\": \"buildnames\","
    print " \"data\": ["
    for (i = 0; i <= len-1; i++) {
        if (i == len-1) { print " [\"" BUILDd[i] "\"]" }
        else
        { print " [\"" BUILDd[i] "\"]," }
    }
    print " ]"
    print " }"
    print "};"
}
This gives the following output:
#{
"buildnames": {
"label": "buildnames",
"data": [
["test1"]
["test4"]
["test1"]
["test2"]
["test2"]
["test5"]
]
}
};
whereas I want it to produce the following:
#{
"buildnames": {
"label": "buildnames",
"data": [
["test1"]
["test2"]
["test4"]
["test5"]
]
}
};
1) In other words, first check whether each element is already in the array and, if not, add it.
2) Sort the array afterwards, if possible.
Thanks =)
Comments (1)
Except for the formatting, is this what you are trying to achieve (a, b, and c are the files that contain your logs)?
using awk
Gives
Note that the correct sort order here is purely accidental. The order in which items are put into an array is not the order in which they are printed.
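A minimal sketch of one common way to get both the de-duplication and a dependable sort order, not necessarily the command used in this answer. It assumes the three log files are named a, b and c (created here in a temporary directory purely for illustration), that the name is always in column 3, and that a plain string sort is what is wanted. The array names seen and names are hypothetical. Because awk leaves the traversal order of "for (key in array)" unspecified, the sketch keeps the names in a numerically indexed array and sorts that explicitly with a small insertion sort, which works in any POSIX awk.

```shell
# Hypothetical sample files a, b, c holding the log lines from the question.
dir=$(mktemp -d)
printf '%s\n' '2011-01-22 22:12 test1 22 1312 75 13.55 1399' \
              '2011-01-23 22:13 test4 22 1112 72 12.55 1499' > "$dir/a"
printf '%s\n' '2011-01-24 22:14 test1 21 1322 75 23.55 1599' \
              '2011-01-25 22:15 test2 23 2312 77 33.55 1699' > "$dir/b"
printf '%s\n' '2011-01-26 22:16 test2 20 1412 79 63.55 1799' \
              '2011-01-27 22:17 test5 12 1352 78 43.55 1999' > "$dir/c"

out=$(awk '
    # seen[] is keyed by the name itself, so "in" tells us whether the
    # name was stored before; only first occurrences go into names[].
    !($3 in seen) { seen[$3] = 1; names[n++] = $3 }

    END {
        # Insertion sort on names[0..n-1], forcing string comparison,
        # so the output order does not depend on for-in traversal.
        for (i = 1; i < n; i++) {
            v = names[i]
            for (j = i - 1; j >= 0 && (names[j] "") > (v ""); j--)
                names[j + 1] = names[j]
            names[j + 1] = v
        }
        # Print one entry per line, with a comma after all but the last.
        for (i = 0; i < n; i++)
            printf "[\"%s\"]%s\n", names[i], (i < n - 1 ? "," : "")
    }
' "$dir/a" "$dir/b" "$dir/c")

printf '%s\n' "$out"
rm -rf "$dir"
```

The same two rules can replace the first block of the script in the question, with the END loop there iterating over names[] instead of BUILDd[]. With GNU awk, the built-in asort() could stand in for the hand-written sort, at the cost of portability.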