CSV in Python adding an extra carriage return on Windows
    import csv

    with open('test.csv', 'w') as outfile:
        writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(['hi', 'dude'])
        writer.writerow(['hi2', 'dude2'])

The above code generates a file, test.csv, with an extra \r at each row, like so:

    hi,dude\r\r\nhi2,dude2\r\r\n

instead of the expected

    hi,dude\r\nhi2,dude2\r\n

Why is this happening, or is this actually the desired behavior?
Answers (6)
Python 3:

The official csv documentation recommends opening the file with newline='' on all platforms to disable universal newlines translation. The CSV writer terminates each line with the lineterminator of the dialect, which is '\r\n' for the default excel dialect on all platforms because that's what RFC 4180 recommends.

Python 2:

On Windows, always open your files in binary mode ("rb" or "wb") before passing them to csv.reader or csv.writer.

Although the file is a text file, CSV is regarded as a binary format by the libraries involved, with \r\n separating records. If that separator is written in text mode, the Python runtime replaces the \n with \r\n, hence the \r\r\n observed in the file.

See this previous answer.
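
The mechanism described in this answer can be reproduced on any platform. The following is only a minimal sketch for Python 3, not the snippet from the original answer: the helper name write_rows is hypothetical, the file is a throwaway temporary file, and newline='\r\n' is used as a stand-in for Windows text-mode translation so the doubled \r also shows up on Unix. The second call uses newline='' as the csv documentation recommends.

```python
import csv
import os
import tempfile

ROWS = [['hi', 'dude'], ['hi2', 'dude2']]


def write_rows(newline):
    """Write ROWS with csv.writer and return the raw bytes that reach the file."""
    fd, path = tempfile.mkstemp(suffix='.csv')
    os.close(fd)
    try:
        # The csv writer always emits '\r\n' for the default excel dialect.
        # The text layer then translates each '\n' according to `newline`.
        with open(path, 'w', newline=newline) as outfile:
            writer = csv.writer(outfile)
            writer.writerows(ROWS)
        with open(path, 'rb') as raw:
            return raw.read()
    finally:
        os.remove(path)


# Assumption: newline='\r\n' imitates what Windows does by default in text mode.
translated = write_rows('\r\n')
# newline='' disables translation, so the writer's '\r\n' is stored unchanged.
untranslated = write_rows('')

print(translated)
print(untranslated)
```

The first value contains \r\r\n after every row, matching the file in the question; the second contains a single \r\n per row.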
While @john-machin gives a good answer, it's not always the best approach. For example, it doesn't work on Python 3 unless you encode all of your inputs to the CSV writer. Also, it doesn't address the issue if the script wants to use sys.stdout as the stream.
I suggest instead setting the 'lineterminator' attribute when creating the writer:
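
The snippet that belonged here did not survive in this copy. As a rough sketch of the idea (written for Python 3, with a hypothetical helper name write_rows and an in-memory buffer added only so the result can be inspected), the lineterminator argument can be passed straight to csv.writer:

```python
import csv
import io
import sys


def write_rows(stream, rows):
    """Write rows to any text stream, ending each row with a bare '\\n'."""
    # With lineterminator='\n' the writer emits only LF; a text-mode stream
    # is then free to translate that LF to the platform's line ending.
    writer = csv.writer(stream, lineterminator='\n')
    writer.writerows(rows)


rows = [['hi', 'dude'], ['hi2', 'dude2']]

# Works with sys.stdout, which cannot be reopened in binary mode as easily.
write_rows(sys.stdout, rows)

# The same helper against an in-memory text buffer (no newline translation).
buffer = io.StringIO()
write_rows(buffer, rows)
```
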
That example will work on Python 2 and Python 3 and won't produce the unwanted extra \r characters. Note, however, that the resulting line endings depend on the platform: on Unix operating systems each row ends with a bare LF, omitting the CR that RFC 4180 calls for.
In most cases, however, I believe that behavior is preferable and more natural than treating all CSV as a binary format. I provide this answer as an alternative for your consideration.
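In Python 3 (I haven't tried this in Python 2), you can also simply open the file with newline='', as per the documentation.

More on this in the doc's footnote: if newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added.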
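
The other half of that footnote, embedded newlines inside quoted fields, is easy to check. This is a small sketch for Python 3 under the assumption that both the write and the read use newline=''; the sample rows and the temporary file are made up for illustration.

```python
import csv
import os
import tempfile

# The second row carries a line break inside a field.
rows = [['id', 'note'], ['1', 'first line\nsecond line']]

fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)
try:
    # newline='' on write: the csv module controls every line ending itself.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    # newline='' on read: the embedded '\n' reaches the csv reader untouched.
    with open(path, 'r', newline='') as f:
        read_back = list(csv.reader(f))
finally:
    os.remove(path)

# True when the multi-line field survived the round trip unchanged.
print(read_back == rows)
```
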
You can pass the lineterminator='\n' argument when creating the csv writer.
You can add the argument newline="\n" to the open() call, like this:
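
The code for this answer is missing from this copy, so what follows is only a sketch of how it might look in Python 3. It relies on the documented rule that, when writing, neither newline='\n' nor newline='' translates anything. The helper written_bytes is hypothetical and uses an in-memory io.TextIOWrapper, which takes the same newline argument as open(), in place of a real file.

```python
import csv
import io


def written_bytes(newline):
    """Return the bytes produced when one row is written with this newline mode."""
    raw = io.BytesIO()
    # TextIOWrapper applies the same newline rules as open() in text mode.
    text = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    csv.writer(text).writerow(['hi', 'dude'])
    text.flush()
    data = raw.getvalue()
    text.detach()  # leave `raw` open rather than closing it with the wrapper
    return data


with_lf = written_bytes('\n')    # the setting suggested in this answer
with_empty = written_bytes('')   # the setting from the csv documentation

print(with_lf)
print(with_empty)
```

For writing, the two settings behave the same. They differ when reading: newline='\n' splits lines only on \n, whereas newline='' recognises all line endings and passes them through untranslated.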
Note that if you use DictWriter, you get one line ending from the open function's newline translation and another from the writerow function.
You can pass newline='' to the open function to remove the extra one.
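
A short sketch of that DictWriter case for Python 3 is shown here. The field names, the file name people.csv, and the temporary directory are placeholders chosen for illustration.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'people.csv')

    # newline='' keeps the text layer from adding its own '\r' on Windows.
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['greeting', 'name'])
        writer.writeheader()
        writer.writerow({'greeting': 'hi', 'name': 'dude'})

    # Read the raw bytes back to inspect the line endings.
    with open(path, 'rb') as raw:
        data = raw.read()

print(data)
```

Each line, including the header, ends with a single \r\n.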