How to extract values by name from input lines of the form "name value name value ..."
Consider a stream of lines piped to awk. Each line is a sequence of interleaved field names and field values, as in the following example (the real lines are much longer, with many other attributes listed):
Samples 2978 Min -0.068689 at 1389 Amin 1.0406e-08 at 435 Max 0.0514581 at 1375
Samples 2977 Min -0.100258 at 1293 Amin -1.06743e-08 at 3 Max 0.0989735 at 1282
Samples 2977 Min -0.109783 at 1281 Amin -2.97293e-08 at 10 Max 0.139651 at 1268
Samples 2976 Min -0.116509 at 1269 Amin -1.04306e-09 at 161 Max 0.0985577 at 1255
I'd like to extract a certain value from each line using its name as a guide, for example Min. If awk had a scanf-like function, I would first use ind=index($0, "Min"), then s=substr($0, ind), then sscanf(s, "Min %f", &val) to obtain val. However, awk has no scanf.
How can I extract the value by its name, then?
Comments (3)
You go through each field, check for "Min", and then extract the next field.
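The field-by-field scan described above can be sketched in awk as follows. This is only an illustration, not the answerer's original code: the function name extract_by_name and the awk variable label are made up for the example, and the sketch assumes fields are separated by whitespace and that only the first occurrence of the name on a line is wanted.

```shell
# extract_by_name LABEL: print the field that follows LABEL on each input line.
# (extract_by_name and "label" are hypothetical names chosen for this sketch.)
extract_by_name() {
    awk -v label="$1" '{
        for (i = 1; i < NF; i++) {      # stop at NF-1 so that $(i+1) always exists
            if ($i == label) {          # exact match, so "Amin" is not mistaken for "Min"
                print $(i + 1)
                break                   # first occurrence on the line only
            }
        }
    }'
}

# Usage with one of the sample lines; prints -0.068689
printf '%s\n' 'Samples 2978 Min -0.068689 at 1389 Amin 1.0406e-08 at 435 Max 0.0514581 at 1375' |
    extract_by_name Min
```

Because the value is printed as the original field text, no numeric reformatting takes place; lines that do not contain the name produce no output.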
Ruby(1.9+)
1. Can't you rely on the data elements being in the same column position in each record? Then you can simply print that field to get the Min value, which is the fourth field ($4) in your example.
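A minimal sketch of the fixed-column approach, assuming every record has the same layout as the sample lines, so that the Min value is always the fourth whitespace-separated field. The function name min_by_position is hypothetical.

```shell
# Assumption: each record starts with "Samples N Min VALUE ...", so VALUE is $4.
# (min_by_position is a hypothetical name chosen for this sketch.)
min_by_position() {
    awk '{ print $4 }'
}

# Usage with one of the sample lines; prints -0.100258
printf '%s\n' 'Samples 2977 Min -0.100258 at 1293 Amin -1.06743e-08 at 3 Max 0.0989735 at 1282' |
    min_by_position
```

This is the cheapest option, but it silently returns the wrong field if an attribute is ever missing or reordered.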
2. Kurumi's idea is fine.
3. Here's another method that ensures you match data with its label.
You can do the modifications directly on $0, but because awk "recalculates" the field values every time $0 is edited, it will be (in my experience) a much slower process.
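The code for this third method did not survive on this page, so the following is only one possible reading of it: copy $0 into a variable and edit the copy with sub(), leaving $0 and the fields untouched. The function name min_by_label and the variable line are hypothetical, and the sketch assumes the label is followed by a single space and then the value.

```shell
# Edit a copy of the record instead of $0, so awk never has to re-split the fields.
# (min_by_label and "line" are hypothetical names chosen for this sketch.)
min_by_label() {
    awk '{
        line = $0
        # Remove everything up to and including the label "Min ".
        # The label must start the line or follow a space.
        if (sub(/^(.* )?Min /, "", line)) {
            sub(/ .*$/, "", line)       # remove everything after the value
            print line
        }
    }'
}

# Usage with one of the sample lines; prints -0.109783
printf '%s\n' 'Samples 2977 Min -0.109783 at 1281 Amin -2.97293e-08 at 10 Max 0.139651 at 1268' |
    min_by_label
```

Because the first pattern is greedy, a line containing the label more than once would yield the value after the last occurrence.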
I hope this helps.
This will scan field by field only those lines containing the label.
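The original script is not shown here; a sketch of the same idea, under the assumption that a cheap substring test with index() is used to skip lines that cannot contain the label before any field loop runs, could look like this. The function name scan_labelled_lines and the variable label are hypothetical.

```shell
# scan_labelled_lines LABEL: only lines containing LABEL somewhere are scanned field by field.
# (scan_labelled_lines and "label" are hypothetical names chosen for this sketch.)
scan_labelled_lines() {
    awk -v label="$1" '
        index($0, label) {              # pattern: true only if the line contains the label text
            for (i = 1; i < NF; i++) {
                if ($i == label) {      # confirm it is a whole field, not part of another word
                    print $(i + 1)
                    break
                }
            }
        }'
}

# Usage: the second line has no Max label and is skipped; prints 0.0514581
printf '%s\n' \
    'Samples 2978 Min -0.068689 at 1389 Max 0.0514581 at 1375' \
    'Samples 2977 Min -0.100258 at 1293' |
    scan_labelled_lines Max
```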
Running at ideone