How to extract words from multiple lines in Unix?
I want to extract some specific words from the following text:
Exported Layer : missing_hello
Comment :
Total Polygons : 20000 (reported 100).
I want to extract the words "missing_hello" and "20000" from the above text and display them as
missing_hello : 20000
How can I do that in Unix?
Comments (4)
Assuming that missing_hello is always a single word, you can:
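A minimal sketch of that one-word assumption, using awk's default whitespace splitting. The function name `layer_counts` and the sample input are illustrative only and not part of the original answer:

```shell
# Hypothetical helper: reads the report on stdin (or from file arguments).
# Assumes the layer name is a single word without whitespace, so it is
# field 4 of "Exported Layer : <name>", and the count is field 4 of
# "Total Polygons : <count> ...".
layer_counts() {
    awk '
        /^Exported Layer/ { name = $4 }
        /^Total Polygons/ { print name " : " $4 }
    ' "$@"
}

printf '%s\n' 'Exported Layer : missing_hello' 'Comment :' \
    'Total Polygons : 20000 (reported 100).' | layer_counts
# prints: missing_hello : 20000
```

If a layer name could contain spaces, field 4 would only hold its first word, so this sketch depends on the one-word assumption.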
Take a look at this guide: http://www.grymoire.com/Unix/Sed.html
Sed is certainly a tool worth learning. I would look specifically at the sections titled "Using \1 to keep part of the pattern" and "Working with Multiple Lines".
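As a rough illustration of those two techniques together, the sketch below joins lines with `N` and keeps the interesting parts with `\1` and `\2`. It assumes every record is exactly three consecutive lines in the order shown in the question; the function name `layer_counts_sed_n` is made up for this example:

```shell
# Hypothetical helper: reads the report on stdin (or from file arguments).
# On each "Exported Layer" line, N appends the next two lines to the
# pattern space; the substitution then captures the layer name as \1 and
# the leading digits of the polygon count as \2.
layer_counts_sed_n() {
    sed -n '/^Exported Layer : /{
        N
        N
        s/^Exported Layer : \(.*\)\n.*\nTotal Polygons : \([0-9][0-9]*\).*/\1 : \2/p
    }' "$@"
}

printf '%s\n' 'Exported Layer : missing_hello' 'Comment :' \
    'Total Polygons : 20000 (reported 100).' | layer_counts_sed_n
# prints: missing_hello : 20000
```

If the number of lines between the two fields can vary, the fixed pair of `N` commands no longer fits and a loop or the hold space would be needed instead.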
If you have perl, you could use this:
In awk:
If the input is garbage, the output will be too (GIGO). If the fields can contain colons, life gets messier.
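An awk script of this kind might look like the sketch below, which splits each line on the colon and its surrounding spaces. This is an assumption about the general shape of the approach, and the function name `layer_counts_awk` is invented for illustration:

```shell
# Hypothetical helper: reads the report on stdin (or from file arguments).
# The field separator is a colon with optional spaces on either side, so
# $2 is everything after the first colon up to the next one. A colon
# inside a value would therefore truncate it.
layer_counts_awk() {
    awk -F' *: *' '
        /^Exported Layer/ { name = $2 }
        /^Total Polygons/ { print name " : " $2 }
    ' "$@"
}

printf '%s\n' 'Exported Layer : missing_hello' 'Comment :' \
    'Total Polygons : 20000 (reported 100).' | layer_counts_awk
# prints: missing_hello : 20000 (reported 100).
```

Because the whole second field is printed, the "(reported 100)." commentary is carried through to the output.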
In sed:
Colons in fields are not a problem with this sed version.
Now tested on MacOS X 10.6.7. Both scripts include the commentary after the number in the 'Total Polygons' line. Both scripts can fairly easily be revised to only print the number and ignore the commentary. It would help to have a precise definition of all the format possibilities.
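A sed script with these properties could be sketched as follows, using the hold space to carry the layer name forward. It strips only the leading label from each line, so colons later in a value are left alone. The function name `layer_counts_sed` is hypothetical:

```shell
# Hypothetical helper: reads the report on stdin (or from file arguments).
# "Exported Layer" line: strip the label and save the name in the hold space.
# "Total Polygons" line: strip the label, swap so the name comes first,
# append the value with G, then join the two parts with " : ".
layer_counts_sed() {
    sed -n '
    /^Exported Layer *: */{
        s/^Exported Layer *: *//
        h
    }
    /^Total Polygons *: */{
        s/^Total Polygons *: *//
        x
        G
        s/\n/ : /p
    }' "$@"
}

printf '%s\n' 'Exported Layer : missing_hello' 'Comment :' \
    'Total Polygons : 20000 (reported 100).' | layer_counts_sed
# prints: missing_hello : 20000 (reported 100).
```

To print only the number, the second substitution could instead be `s/^Total Polygons *: *\([0-9][0-9]*\).*/\1/`, which keeps the leading digits and drops the commentary.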
I would probably actually use Perl (or Python) to do this job; the field splitting is just messy enough to benefit from the better facilities in those languages.