How to add only unique elements from multiple input text files to an array in awk

Posted on 2024-11-18 02:55:40

As the topic suggests, how do I read information from multiple text files and add each element to an array only once, regardless of whether it occurs multiple times across the different text files?

I have started with this script that reads in and prints out all elements in the order that they occur in the different documents.

For example, take a look at these 3 different text files containing the following data:

File 1:

2011-01-22 22:12 test1 22 1312 75 13.55 1399 
2011-01-23 22:13 test4 22 1112 72 12.55 1499

File 2:

2011-01-24 22:14 test1 21 1322 75 23.55 1599 
2011-01-25 22:15 test2 23 2312 77 33.55 1699

File 3:

2011-01-26 22:16 test2 20 1412 79 63.55 1799 
2011-01-27 22:17 test5 12 1352 78 43.55 1999

I want to check whether the current element has already been added to the array, but for now my script prints out all elements.

{
    BUILDd[NR-1] = $3; len++
}
END {
    SUBSYSTEM = substr(FILENAME, 1, length(FILENAME)-7)
    LABEL = "\"" toupper(SUBSYSTEM) "\""
    print "#{"
    print "\"buildnames\": {"
    print "        \"label\": \"buildnames\","
    print "        \"data\": ["
    for (i = 0; i <= len-1; i++) {
        if (i == len-1) {
            print "            [\"" BUILDd[i] "\"]"
        } else {
            print "            [\"" BUILDd[i] "\"],"
        }
    }
    print "        ]"
    print " }"
    print "};"
}

This gives the following output:

#{
"buildnames": {
        "label": "buildnames",
        "data": [
            ["test1"]
            ["test4"]
            ["test1"]
            ["test2"]
            ["test2"]
            ["test5"]
        ]
        }
};

But I want it to produce the following:

#{
"buildnames": {
        "label": "buildnames",
        "data": [
            ["test1"]
            ["test2"]
            ["test4"]
            ["test5"]
        ]
        }
};

1) In other words, first check whether an element is already in the array and add it only if it is not.

2) Sort the array afterwards, if possible.

Thanks =)

冬天的雪花 2024-11-25 02:55:40

Except for the formatting, is this what you are trying to achieve (a, b and c are the files that contain your logs)?

$ cut -d" " -f3 a b c | sort | uniq
test1
test2
test4
test5

Using awk:

{
    BUILDd[$3] = 1
}
END {
    for (i in BUILDd) {
        print i
    }
}

This gives:

$ awk -f a.awk a b c
test1
test2
test4
test5

Note that the sorted order here is purely accidental. A for (i in BUILDd) loop visits the elements in an unspecified order, which is not necessarily the order in which they were put into the array.
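
To cover both points of the question (add each name only once, then sort), the same idea can be extended. What follows is a minimal sketch under stated assumptions, not a definitive implementation: the function name unique_builds, the arrays seen and names, and the temporary directory are made up for illustration, and the output layout is copied from the script in the question. It checks membership with the in operator before storing a name, then sorts with a small insertion sort so that nothing beyond POSIX awk is required.

```shell
#!/bin/sh
# unique_builds FILE...  (hypothetical helper name)
# Prints the question's "buildnames" block with each build name
# (field 3) listed once, in sorted order.
unique_builds() {
    awk '
    # Store a name only if it has not been seen in any file so far.
    NF >= 3 && !($3 in seen) {
        seen[$3] = 1
        names[n++] = $3
    }
    END {
        # Insertion sort with forced string comparison (portable awk).
        for (i = 1; i < n; i++) {
            key = names[i]
            for (j = i - 1; j >= 0 && (names[j] "") > (key ""); j--)
                names[j + 1] = names[j]
            names[j + 1] = key
        }
        print "#{"
        print "\"buildnames\": {"
        print "        \"label\": \"buildnames\","
        print "        \"data\": ["
        # Every entry except the last one is followed by a comma.
        for (i = 0; i < n; i++)
            printf "            [\"%s\"]%s\n", names[i], (i < n - 1 ? "," : "")
        print "        ]"
        print " }"
        print "};"
    }' "$@"
}

# Recreate the three sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '2011-01-22 22:12 test1 22 1312 75 13.55 1399' \
              '2011-01-23 22:13 test4 22 1112 72 12.55 1499' > "$tmp/a"
printf '%s\n' '2011-01-24 22:14 test1 21 1322 75 23.55 1599' \
              '2011-01-25 22:15 test2 23 2312 77 33.55 1699' > "$tmp/b"
printf '%s\n' '2011-01-26 22:16 test2 20 1412 79 63.55 1799' \
              '2011-01-27 22:17 test5 12 1352 78 43.55 1999' > "$tmp/c"

unique_builds "$tmp/a" "$tmp/b" "$tmp/c"
rm -r "$tmp"
```

With the sample files, the data entries come out as test1, test2, test4, test5. If only the bare list of names is needed, the cut | sort | uniq pipeline shown earlier is shorter. The awk version is mainly useful when the surrounding output has to be generated in the same pass.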
