How do I print this using AWK?

Posted 2024-09-18 12:14:32

I have a file that looks like this:

1 543423 34354 
2 5654656 3423 xyz_1378,xyz_1379
3 4645656 34234354 xyz_1384,xyz_1385
4 5654 78678 xyz_1390,xyz_1391,xyz_1392
5 54654 76867 xyz_1411,xyz_1412,xyz_1413
6 54654 8678 
7 56546 67867 xyz_1711
8 678 7867 
9 76867 7876 xyz_2940
10 6786 678678 xyz_3101,xyz_3102,xyz_3103,xyz_3104,xyz_3105,xyz_3106,xyz_3107
11 67867 78678

Note that it contains 4 space-separated fields. The last (fourth) field might be empty, or it may contain several values separated by commas.

I would like to print all the values from the last field (the fourth column), one per line. How can I do that (preferably using awk)?

UPDATE:
I need to do this in batch for many files, concatenating the output from all of them.

This works:

for x in *; do awk '{print $4}' $x/filename | awk --field-separator="," '{if ($0 != "") {for (i=1; i<NF+1; i++) print $i}}'; done;

and returns something like

xyz_1378
xyz_1221
xyz_97
xyz_132523
xyz_242

The only thing I am missing now is that I want each of the above lines to begin with an extra field: $x (the one from the for loop).

I tried changing print $i to print $x,$i, but x does not seem to be recognized correctly in this scope. Any ideas?

Thanks!

·深蓝 2024-09-25 12:14:32

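Use awk's -v option to pass the variable into the awk script instead of relying on the shell's substitution. Inside the single-quoted script the shell never expands $x, and awk reads $x as a field reference through its own (unset) variable x. Also, you only need one call to awk: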

for dir in *; do 
    awk -v "dir=$dir" '
        NF==4 {
            n = split($4, a, ",")
            for (i=1; i<=n; i++) {print dir "\t" a[i]}
        }
    ' "$dir/filename"
done
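
To see the -v approach end to end, here is a minimal sketch that builds a throwaway directory layout and runs the loop against it. The directory names runA and runB, the helper name list_values, and the sample rows are assumptions made up for illustration; only the awk body follows the answer above.

```shell
# Throwaway layout: two hypothetical directories, each holding a file named "filename".
tmp=$(mktemp -d)
mkdir "$tmp/runA" "$tmp/runB"
printf '%s\n' '1 543423 34354' '2 5654656 3423 xyz_1378,xyz_1379' > "$tmp/runA/filename"
printf '%s\n' '7 56546 67867 xyz_1711' '8 678 7867' > "$tmp/runB/filename"

# Print "<dir> TAB <value>" for every comma-separated value in field 4.
list_values() {
    for dir in *; do
        [ -f "$dir/filename" ] || continue   # skip entries without the expected file
        awk -v "dir=$dir" '
            NF == 4 {
                n = split($4, a, ",")
                for (i = 1; i <= n; i++) print dir "\t" a[i]
            }
        ' "$dir/filename"
    done
}

# Run inside the sample layout; prints three tab-separated lines:
# runA xyz_1378, runA xyz_1379, runB xyz_1711
(cd "$tmp" && list_values)
rm -rf "$tmp"
```

One caveat with -v: awk processes backslash escape sequences in the assigned value, so a directory name containing a backslash would not be passed through literally.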

Or, if you don't mind seeing "dir/filename" instead of just the directory name:

awk '
    NF==4 {
        n = split($4, a, ",")
        for (i=1; i<=n; i++) {print FILENAME "\t" a[i]}
    }
' */filename

If you have huge numbers of directories, your shell may choke when expanding "*/filename", so use find and xargs:

find . -type f -name filename -print0 | xargs -0 awk '...'

(the -print0/-0 options are extensions provided by GNU and BSD find/xargs; older strictly POSIX tools may lack them)
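
For completeness, one way the find and xargs pipeline could look with a full awk body is sketched below. The helper name find_values and the sample directories are hypothetical, and the awk body is simply the FILENAME variant from earlier in this answer, not necessarily what the elided '...' stood for.

```shell
# Walk the current directory tree and split field 4 of every file named "filename".
# NUL-separated names (-print0 / -0) keep paths with spaces or newlines intact.
find_values() {
    find . -type f -name filename -print0 |
        xargs -0 awk '
            NF == 4 {
                n = split($4, a, ",")
                for (i = 1; i <= n; i++) print FILENAME "\t" a[i]
            }
        '
}

# Small demonstration on throwaway data (hypothetical directory names).
tmp=$(mktemp -d)
mkdir "$tmp/runA" "$tmp/runB"
printf '%s\n' '2 5654656 3423 xyz_1378,xyz_1379' > "$tmp/runA/filename"
printf '%s\n' '7 56546 67867 xyz_1711' '8 678 7867' > "$tmp/runB/filename"

# find does not promise any particular order, so sort the result for display.
(cd "$tmp" && find_values | LC_ALL=C sort)
rm -rf "$tmp"
```

Because find starts from ".", each path printed through FILENAME begins with "./", for example ./runA/filename.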

莫多说 2024-09-25 12:14:32

Probably you can change one of the statements in your command to

awk '{print FILENAME "," $4}' "$x/filename"

and then work on the output of this.

FILENAME is the built-in awk variable that holds the name of the file currently being processed.
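
As a rough sketch of what "working on the output" might involve, the variant below does everything in one awk pass: it trims the last path component from FILENAME so that only the directory remains, then splits the fourth field. The helper name values_with_dir and the sample directories are assumptions for illustration.

```shell
# Print "<dir>,<value>" for each comma-separated value in field 4 of the given files.
values_with_dir() {
    awk '
        FNR == 1 {                      # first line of each input file
            dir = FILENAME
            sub("/[^/]*$", "", dir)     # drop the trailing "/filename" component
        }
        NF == 4 {
            n = split($4, a, ",")
            for (i = 1; i <= n; i++) print dir "," a[i]
        }
    ' "$@"
}

# Small demonstration on throwaway data (hypothetical directory names).
tmp=$(mktemp -d)
mkdir "$tmp/runA" "$tmp/runB"
printf '%s\n' '1 543423 34354' '2 5654656 3423 xyz_1378,xyz_1379' > "$tmp/runA/filename"
printf '%s\n' '9 76867 7876 xyz_2940' > "$tmp/runB/filename"

(cd "$tmp" && values_with_dir */filename)
# runA,xyz_1378
# runA,xyz_1379
# runB,xyz_2940
rm -rf "$tmp"
```

The FNR == 1 rule recomputes dir whenever awk moves on to the next file, so a single invocation can handle any number of files.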

ぽ尐不点ル 2024-09-25 12:14:32

Use NF>=4 as the condition to check whether there is anything in the fourth field. Then split($4,a,/,/) gives you an array a with all the values. Collect them in one large result array:

NF>=4 {
    n = split($4, a, /,/);
    for( i=1; i<=n; i++ ) {
        result[a[i]] = 0;
    }
}

and print it at the end:

END {
    for( val in result ) {
        print val;
    }
}

If you want the output sorted, pipe it through sort(1).
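
Put together, the two fragments and the sort step might look like the sketch below. The helper name unique_values and the array name seen are hypothetical. Note that storing the values as array keys also collapses duplicates, which goes slightly beyond what the question asked for.

```shell
# Collect every comma-separated value from field 4, then print each distinct value once, sorted.
unique_values() {
    awk '
        NF >= 4 {
            n = split($4, a, ",")
            for (i = 1; i <= n; i++) seen[a[i]] = 1   # array keys are unique
        }
        END {
            for (val in seen) print val               # awk iteration order is unspecified
        }
    ' "$@" | LC_ALL=C sort
}

# Demonstration on inline sample rows; with no file arguments awk reads standard input.
printf '%s\n' \
    '2 5654656 3423 xyz_1379,xyz_1378' \
    '6 54654 8678' \
    '7 56546 67867 xyz_1378' | unique_values
# xyz_1378
# xyz_1379
```

For the batch case from the question, the same function could be called as unique_values */filename.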
