How to join every n lines in a file
I am trying to clean up some data, and I would eventually like to put it in CSV form.
I have used some regular expressions to clean it up, but I'm stuck on one step.
I would like to replace all but every third newline (\n) with a comma.
The data looks like this:
field1
field2
field3
field1
field2
field3
etc.
I need it in this form:
field1,field2,field3
field1,field2,field3
Does anyone have a simple way to do this using sed or awk? I could write a program that loops with a mod counter and erases every 1st and 2nd newline character, but I'd rather do it from the command line if possible.
Comments (8)
Awk version:
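A minimal sketch of one way to do this in awk, not necessarily the snippet this answer originally showed. The function name `join3_awk` and the sample input are illustrative assumptions.

```shell
# join3_awk: hypothetical helper that joins every 3 input lines with commas.
join3_awk() {
    # Print each line followed by "," unless its line number is a multiple
    # of 3, in which case finish the row with a newline instead.
    awk '{ printf "%s%s", $0, (NR % 3 ? "," : "\n") }' "$@"
}

# Example: six input lines become two comma-separated rows.
printf 'field1\nfield2\nfield3\nfield1\nfield2\nfield3\n' | join3_awk
```

If the input does not contain a multiple of 3 lines, this sketch leaves a trailing comma and no final newline on the last row.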
A Perl solution that's a little shorter and that handles files that don't have a multiple of 3 lines:
cat file | perl -ne 'chomp(); print $_, !(++$i%3) ? "\n" : ",";'
Use nawk or /usr/xpg4/bin/awk on Solaris:
This might work for you:
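One command-line possibility, offered as a sketch rather than as this answer's original snippet, is the standard `paste` utility. The function name `join3_paste` is a hypothetical label.

```shell
# join3_paste: hypothetical helper that joins every 3 lines of stdin with commas.
join3_paste() {
    # Each "-" makes paste read one line from standard input, so three of
    # them consume three consecutive lines per output row; -d, sets the
    # delimiter between them to a comma.
    paste -d, - - -
}

printf 'field1\nfield2\nfield3\nfield1\nfield2\nfield3\n' | join3_paste
```

When the line count is not a multiple of 3, `paste` pads the last row with empty fields, so a single leftover line `x` comes out as `x,,`.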
or this:
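A sed-based sketch of the same idea, assuming a sed that accepts `\n` inside the `y` command (as POSIX specifies). The function name `join3_sed` is hypothetical, and this is not necessarily the command the answer originally contained.

```shell
# join3_sed: hypothetical helper that joins every 3 input lines with commas.
join3_sed() {
    # $!N appends the next input line to the pattern space unless the last
    # line has already been read; running it twice collects up to three
    # lines. y/\n/,/ then turns the embedded newlines into commas.
    sed '$!N;$!N;y/\n/,/' "$@"
}

printf 'field1\nfield2\nfield3\nfield1\nfield2\nfield3\n' | join3_sed
```

Because `$!N` is skipped on the last line, a short final group is joined without a trailing comma.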
vim version:
awk '{ORS=NR%3?",":"\n";print}' urdata.txt
With awk:
This script saves the last three lines and prints them at every third line. Unfortunately, it only works for files whose line count is a multiple of 3; any leftover lines at the end are never printed.
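A sketch that matches this description, assuming a three-variable sliding window. The function name `join3_last_three` and the variable names `a`, `b`, `c` are illustrative, not taken from the answer.

```shell
# join3_last_three: hypothetical helper that prints each complete group of 3 lines.
join3_last_three() {
    awk '
        # Shift the window so a, b, c always hold the last three lines read.
        { a = b; b = c; c = $0 }
        # On every third line, emit the window as one comma-separated row.
        NR % 3 == 0 { print a "," b "," c }
    ' "$@"
}

printf 'field1\nfield2\nfield3\nfield1\nfield2\nfield3\n' | join3_last_three
```

With four input lines, only the first three are printed; the fourth is silently dropped, which is the limitation noted above.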
A more general script is:
In this case, the lines of each group are concatenated into a single string, with a comma separator appended whenever the line number is not a multiple of 3. At the end of the file, if the string is not empty, it is printed with its trailing comma removed.
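A sketch of a script along these lines, assuming an accumulator variable named `row`. Both that name and the function name `join3_general` are hypothetical.

```shell
# join3_general: hypothetical helper; also handles a short final group.
join3_general() {
    awk '
        # Append the line, followed by a comma unless it completes a group of 3.
        { row = row $0 (NR % 3 ? "," : "") }
        # A complete group: print it and start a new row.
        NR % 3 == 0 { print row; row = "" }
        END {
            # Leftover lines from an incomplete group: strip the trailing
            # comma and print what remains.
            if (row != "") {
                sub(/,$/, "", row)
                print row
            }
        }
    ' "$@"
}

printf 'field1\nfield2\nfield3\nfield1\nfield2\n' | join3_general
```

Here the five input lines yield `field1,field2,field3` followed by `field1,field2`.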