How to extract words from multiple lines in Unix?

Posted on 2024-11-15 13:42:15

I want to extract some specific words from the following text:

Exported Layer : missing_hello  
Comment :   
Total Polygons : 20000 (reported 100).

I want to extract the words "missing_hello" and "20000" from the above text and display them as

missing_hello : 20000

How can I do that in Unix?

花海 2024-11-22 13:42:15

Assuming that missing_hello is always a single word, you can:

perl -lane '$el=$F[3] if(/Exported Layer/); print "$el : $F[3]" if(/Total Polygons/);'

灼疼热情 2024-11-22 13:42:15

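Take a look at this guide: http://www.grymoire.com/Unix/Sed.html

sed is definitely a tool worth learning. I would look specifically at the sections titled "Using \1 to keep part of the pattern" and "Working with Multiple Lines".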
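As a rough illustration of those two ideas applied to the question's input, here is a minimal sketch. The function name `layer_counts_sed` is made up for this example, and it assumes each record has an "Exported Layer : ..." line followed later by a "Total Polygons : ..." line whose value starts with digits. It keeps the number with `\1` and uses the hold space to carry the layer name across lines.

```shell
# Hypothetical helper: reads the report from stdin or from the files given as arguments.
layer_counts_sed() {
    sed -n \
        -e '/^Exported Layer : */{s/^Exported Layer : *//;s/ *$//;h;}' \
        -e '/^Total Polygons : *[0-9]/{s/^Total Polygons : *\([0-9][0-9]*\).*/\1/;H;x;s/\n/ : /;p;}' \
        "$@"
}
# First -e: strip the label and trailing spaces, then save the layer name in the hold space.
# Second -e: keep only the digits via \1, append them to the saved name, swap, join, print.
```

Something like `layer_counts_sed report.txt` (the file name is only an example) should then print one "name : count" line per record.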

情丝乱 2024-11-22 13:42:15

If you have perl, you could use this:

use strict;
use warnings;

my $layer;
my $polys;

while (<>) {
    if ($_ =~ m{^Exported \s Layer \s : \s (\S+)}xms) {
        $layer = $1;
        next;
    }
    if ($_ =~ m{^Total \s Polygons \s : \s (\d+)}xms) {
        $polys = $1;
    }
    if (defined $layer && defined $polys) {
        print "$layer : $polys\n";
        $layer = $polys = undef;
    }
}

唐婉 2024-11-22 13:42:15

In awk:

awk -F: '/Exported Layer/ { export_layer = $2 }
         /Total Polygons/ { printf("%s : %s\n", export_layer, $2); }' "$@"

If the input is garbage, the output will be too (GIGO). If the fields can contain colons, life gets messier.

In sed:

sed -n -e '/Exported Layer : *\(.*\)/{s//\1 : /;h;}' \
       -e '/Total Polygons : *\(.*\)/{s//\1/;x;G;s/\n//;p;}' "$@"

Colons in fields are not a problem with this sed version.

Now tested on MacOS X 10.6.7. Both scripts include the commentary after the number in the 'Total Polygons' line. Both scripts can fairly easily be revised to only print the number and ignore the commentary. It would help to have a precise definition of all the format possibilities.

I would probably actually use Perl (or Python) to do this job; the field splitting is just messy enough to benefit from the better facilities in those languages.
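For what it is worth, the revision that prints only the number might look like the sketch below. It is not the tested script from this answer: the function name `layer_counts_awk` is invented for illustration, and it assumes the polygon count is the run of digits right after "Total Polygons :". It works on the whole line instead of splitting on colons, so a colon inside the layer name should not matter.

```shell
# Hypothetical helper: reads the report from stdin or from the files given as arguments.
layer_counts_awk() {
    awk '
        /^Exported Layer *:/ {
            layer = $0
            sub(/^Exported Layer *: */, "", layer)   # drop the label
            sub(/[ \t]+$/, "", layer)                # drop trailing blanks
        }
        /^Total Polygons *:/ {
            count = $0
            sub(/^Total Polygons *: */, "", count)   # drop the label
            sub(/[^0-9].*$/, "", count)              # drop everything after the digits
            print layer " : " count
        }
    ' "$@"
}
```

Called as `layer_counts_awk report.txt` (an example file name), it should print lines of the form "missing_hello : 20000" with the trailing commentary removed.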
