(sed/awk) How to convert a field-delimited (e.g. CSV) file into a txt file with fixed-size, tab-delimited columns?
I have something like a CSV file in which the field delimiter is "@":
    ID@Name@Surname@Age@Profession@Address
    1254343123@John@Smith@24@Engineer@Washington
    23@Alexander@Kristofferson-Brown@35@Economic Advisor@Kent
    ...
I want to convert it into something more human-readable, as in:
    1254343123 John      Smith               24 Engineer         Washington
    23         Alexander Kristofferson-Brown 35 Economic Advisor Kent
...where each column starts at a specific character position.
I have tried something that takes advantage of the TAB-size value and adds several TABs to the field:
sed -e "{/@[^@]\{32,\}@/s/@\([^@]*\)@/\t\1\t/g};{/@[^@]\{24,31\}@/s/@\([^@]*\)@/\t\1\t/g};{/@[^@]\{16,23\}@/s/@\([^@]*\)@/\t\1\t\t/g};{/@[^@]\{8,15\}@/s/@\([^@]*\)@/\t\1\t\t/g};{/@[^@]\{2,7\}@/s/@\([^@]*\)@/\t\1\t\t\t/g}"
...which does not work in all cases.
Could somebody give me a hint as to how to proceed?
PS: My intention is to use mainly sed (a one-liner) or if sed just doesn't cut it, awk is OK.
Comments (4)
BSD, Mac OS X and Linux have the column command for this. It produces spaces though, not tabs (and it should, because tabs are obsolete).
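The command line from this answer did not survive on this page, so the following is only a sketch of how column is commonly used for this task, not necessarily what the answerer wrote. It assumes the sample data from the question is saved in a temporary file (the variable names sample and aligned are made up for illustration) and that a column implementation supporting -t and -s is installed.

```shell
# Sample data from the question, written to a temporary file for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID@Name@Surname@Age@Profession@Address
1254343123@John@Smith@24@Engineer@Washington
23@Alexander@Kristofferson-Brown@35@Economic Advisor@Kent
EOF

# -t formats the input as a table; -s '@' makes "@" the only input separator,
# so spaces inside a field such as "Economic Advisor" are preserved.
aligned=$(column -t -s '@' "$sample")
printf '%s\n' "$aligned"
```

Each column is padded with spaces to the width of its widest entry, so no fixed width has to be chosen in advance.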
If you want to make the field width (20 in the code above) a shell variable that can be passed in, you can do something like this:
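The code from this answer is missing on this page. As a hedged sketch of the idea it describes (a fixed field width passed in from the shell), something along these lines should work with any POSIX awk. The names width, w, fmt, sample and padded are assumptions made for illustration, not taken from the original answer.

```shell
# Sample data from the question, written to a temporary file for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID@Name@Surname@Age@Profession@Address
1254343123@John@Smith@24@Engineer@Washington
23@Alexander@Kristofferson-Brown@35@Economic Advisor@Kent
EOF

width=20   # hypothetical shell variable holding the column width

# -v copies the shell value into the awk variable w.
padded=$(awk -F'@' -v w="$width" '
    BEGIN { fmt = "%-" w "s" }     # builds e.g. "%-20s": left-justify, pad to w characters
    {
        for (i = 1; i <= NF; i++)
            printf fmt, $i         # print every field padded to the same width
        print ""                   # terminate the output line
    }' "$sample")
printf '%s\n' "$padded"
```

Fields longer than the chosen width are not truncated, so they push the following columns to the right. The width therefore has to exceed the longest field (19 characters for "Kristofferson-Brown" in the sample).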
My solution for converting the excel default csv (comma delimited, text enclosed in double quotes) is the following awk script:
Try this one, using awk's field-separator option (-F) and printing the columns with \t between each:
awk -F@ '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6}' file.csv > readable.txt
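A slightly more general variant of the same idea, offered only as a sketch: setting the output field separator (OFS) to a tab lets awk rebuild each record without listing the six fields by hand. The variable names sample and tabbed are made up for illustration.

```shell
# Sample data from the question, written to a temporary file for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID@Name@Surname@Age@Profession@Address
1254343123@John@Smith@24@Engineer@Washington
23@Alexander@Kristofferson-Brown@35@Economic Advisor@Kent
EOF

# Assigning $1 to itself forces awk to rebuild the record using OFS (a tab here),
# which works for any number of fields.
tabbed=$(awk -F'@' -v OFS='\t' '{ $1 = $1; print }' "$sample")
printf '%s\n' "$tabbed"
```

Tab-separated output only lines up visually while every field is shorter than the tab stop (commonly 8 characters). A long field such as "Kristofferson-Brown" still shifts the columns after it, which is the problem the question describes.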