I have a CSV file into which some ^M (DOS) line endings have crept, and I want to get rid of them, along with the 16 spaces and 3 tabs that follow each one. In other words, I have to merge that line with the next one down. Here's an offending record and a good one as a sample of what I mean:
"Mary had a ^M
little lamb", "Nursery Rhyme", 1878
"Mary, Mary quite contrary", "Nursery Rhyme", 1838
I can remove the ^M using sed, as you can see, but I cannot work out how to remove the Unix line ending so that the lines are joined back up.
sed -e "s/^M$ //g" rhymes.csv > rhymes.csv
UPDATE
Then I read "However, the Microsoft CSV format allows embedded newlines within a double-quoted field. If embedded newlines within fields are a possibility for your data, you should consider using something other than sed to work with the data file." from:
http://sed.sourceforge.net/sedfaq4.html
So I am editing my question to ask: which tool should I be using?
Comments (2)
With help from "How can I replace a newline (\n) using sed?", I made this one:
<CR> <LF> <16 spaces> <3 tabs>
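A minimal sketch of one command that matches this sequence, written in the style of the linked question; it is not necessarily the command originally posted here. It assumes GNU sed (for `\r` and `\n` in the pattern) and assumes that only the broken lines contain a CR. The sample file and the `workdir` and `rhymes.fixed.csv` names are illustrative.

```shell
# Sample input: the first record is broken by CR LF + 16 spaces + 3 tabs.
workdir=$(mktemp -d)
printf '"Mary had a \r\n%16s\t\t\tlittle lamb", "Nursery Rhyme", 1878\n' '' > "$workdir/rhymes.csv"
printf '"Mary, Mary quite contrary", "Nursery Rhyme", 1838\n' >> "$workdir/rhymes.csv"

# Slurp the whole file into the pattern space (:a / N / $!ba), then delete
# every CR LF together with any blanks (spaces or tabs) that follow it.
# Assumes GNU sed, which understands \r and \n in the pattern.
sed -e ':a' -e 'N' -e '$!ba' -e 's/\r\n[[:blank:]]*//g' \
    "$workdir/rhymes.csv" > "$workdir/rhymes.fixed.csv"

cat "$workdir/rhymes.fixed.csv"
```

Because the pattern only targets CR LF pairs, lines ending in a plain LF are left alone. If every line in the file ended in CR LF, this would join all of them, so it only suits the case described in the question.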
If you just want to delete the CR, you could use:
(or, if the input and output files are different:
<yourfile tr -d "\r" > output
)
to convert the file, or
to create a new file.
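For the CR-only case above, note that redirecting a command's output onto its own input file (as in `... rhymes.csv > rhymes.csv`) truncates the file before it is read. A small sketch of a safer pattern, writing to a temporary file and then moving it over the original; the `workdir` and `sample.csv` names are illustrative:

```shell
# Sample file with a stray CR at the end of the first line.
workdir=$(mktemp -d)
printf 'line one\r\nline two\n' > "$workdir/sample.csv"

# Write the cleaned copy to a temporary file, then move it over the original.
# Redirecting straight back onto sample.csv would empty it before tr reads it.
tr -d '\r' < "$workdir/sample.csv" > "$workdir/sample.csv.tmp" &&
    mv "$workdir/sample.csv.tmp" "$workdir/sample.csv"

# Count the CR bytes that remain in the file.
tr -cd '\r' < "$workdir/sample.csv" | wc -c
```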