Using sed to extract the header and data from an "x=a1, y=b1, z=c1" type of data file
The data file looks like:
x=a1, y=b1, z=c1
x=a2, y=b2, z=c2
...
I want to parse it to a more useable format:
x y z
a1 b1 c1
a2 b2 c2
...
The header names "x, y, z" and the data values "a, b, c" do not contain "=" or ",".
Using
1 s/=*[^=]*[,$]/ /g
gives me
x y z=c1
Apparently the last item is not matched by "[,$]". Any suggestions?
Many thanks!
Dong
Comments (3)
The bracket expression
[,$]
matches either a comma or a literal dollar sign, not a comma or end of line, because $ is not an anchor inside brackets. It is probably simplest to do two operations on the first line.
The first looks for everything between an equals sign and the next comma (including the delimiters) and deletes it, repeatedly; the second looks for everything from the last (only remaining) equals sign to the end of the line and deletes that.
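The original commands did not survive on this page, so the following is only a sketch of the two operations as described, not necessarily the answerer's exact expressions. The sample input and the variable names input and header are assumptions made for illustration.

```shell
# Hypothetical sample input in the layout shown in the question.
input=$(printf '%s\n' 'x=a1, y=b1, z=c1' 'x=a2, y=b2, z=c2')

# Operation 1: on line 1, delete every "=value," run (equals sign through the next comma).
# Operation 2: on line 1, delete from the one remaining "=" to the end of the line.
# -n plus 1p prints only the rewritten first line, which is the header.
header=$(printf '%s\n' "$input" | sed -n -e '1s/=[^,]*,//g' -e '1s/=.*//' -e '1p')

printf '%s\n' "$header"
```

After the first operation the line reads "x y z=c1", which is exactly where the attempt in the question got stuck; the second operation removes the trailing "=c1".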
You should be able to get all the data into the desired format first.
After that, you can insert a header of your choosing.
If you want, you can also parse the header out of the file using the sed commands in Jonathan Leffler's answer.
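The commands from this answer are missing from the page, so here is one possible way to do what it describes, offered as a sketch rather than the answerer's actual code. It assumes the names and values never contain spaces, "=" or ",", as the question states. The header is prepended with printf instead of a sed insert command to keep the example portable across sed implementations. The variable names are illustrative.

```shell
# Hypothetical sample input in the layout shown in the question.
input=$(printf '%s\n' 'x=a1, y=b1, z=c1' 'x=a2, y=b2, z=c2')

# On every line, delete each "name=" prefix, then delete the commas.
data=$(printf '%s\n' "$input" | sed -e 's/[^ =]*=//g' -e 's/,//g')

# Prepend a hand-written header line of your choosing.
table=$(printf '%s\n' 'x y z' "$data")

printf '%s\n' "$table"
```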
To parse the file to CSV, two separate commands give the header and the data, respectively.
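The two commands themselves were lost from this page. A minimal sketch of what such a pair could look like is shown below; the expressions, the sample input, and the names csv_header and csv_data are assumptions, not the answerer's original code.

```shell
# Hypothetical sample input in the layout shown in the question.
input=$(printf '%s\n' 'x=a1, y=b1, z=c1' 'x=a2, y=b2, z=c2')

# Header: on line 1, delete each "=value" part, then squeeze ", " down to ",".
csv_header=$(printf '%s\n' "$input" | sed -n -e '1s/=[^,]*//g' -e '1s/, */,/g' -e '1p')

# Data: on every line, delete each "name=" prefix, then squeeze ", " down to ",".
csv_data=$(printf '%s\n' "$input" | sed -e 's/[^ =,]*=//g' -e 's/, */,/g')

printf '%s\n' "$csv_header" "$csv_data"
```

Deleting "=value" without requiring a trailing comma also sidesteps the end-of-line problem from the question, since the last field needs no special case.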