Writing escape characters to a CSV file in Python

Posted 2024-12-07 08:47:05

I'm using the csv module in Python and escape characters keep messing up my CSV files. For example, if I had the following:

import csv

rowWriter = csv.writer(open('bike.csv', 'w'), delimiter = ",")

text1 = "I like to \n ride my bike"
text2 = "pumpkin sauce"

rowWriter.writerow([text1, text2])
rowWriter.writerow(['chicken','wings'])

I would like my csv to look like:

I like to \n ride my bike,pumpkin sauce
chicken,wings

But instead it turns out as

I like to
ride my bike,pumpkin sauce
chicken,wings

I've tried combinations of quoting, doublequote, escapechar and other parameters of the csv module, but I can't seem to make it work. Does anyone know what's up with this?

*Note - I'm also using codecs encode("utf-8"), so text1 really looks like "I like to \n ride my bike".encode("utf-8")

甜妞爱困 2024-12-14 08:47:05

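The problem is not with writing them to the file. The problem is that \n is a line break when it appears inside '' or "". What you really want is either 'I like to \\n ride my bike' or r'I like to \n ride my bike' (note the r prefix).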
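
A minimal sketch of the two spellings suggested above, assuming Python 3 and an in-memory buffer in place of bike.csv. The helper escape_newlines is hypothetical (it is not part of the csv module) and is meant for data that already contains real newlines, such as text read from another source, where changing the literal is not an option.

```python
import csv
import io

# Both literals hold a backslash followed by 'n' (two characters), not a newline.
raw_text = r"I like to \n ride my bike"
escaped_text = "I like to \\n ride my bike"
assert raw_text == escaped_text


def escape_newlines(value):
    """Hypothetical helper: replace real newlines with a literal backslash-n."""
    return value.replace("\n", "\\n")


def write_rows(rows):
    """Write rows with csv.writer into a string instead of a file."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, delimiter=",").writerows(rows)
    return buffer.getvalue()


# Text that arrives with a real newline is escaped before it is written.
incoming = "I like to \n ride my bike"
output = write_rows([
    [escape_newlines(incoming), "pumpkin sauce"],
    ["chicken", "wings"],
])
print(output)
```

Note that this replacement is one-way: if the data can already contain a literal backslash followed by n, a reader can no longer tell the two cases apart.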

眼角的笑意。 2024-12-14 08:47:05

Firstly, it is not obvious why you want r"\n" (two bytes) to appear in your file instead of "\n" (one byte). What is the consumer of the output file meant to do? Use ast.literal_eval() on each input field? If your actual data contains any of (non-ASCII characters, apostrophes, quotes), then I'd be very wary of serialising it using repr().
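
To make the repr() question concrete, here is a rough sketch of what such a round trip would look like on the reading side, assuming Python 3. The sample string is made up for illustration; it is not the asker's data.

```python
import ast

# Made-up sample containing a newline, double quotes and an apostrophe.
original = 'I like to \n ride my "bike", it\'s fun'

# repr() yields a Python literal in which the newline is spelled backslash-n.
serialised = repr(original)

# The consumer has to parse that literal to get the real text back.
restored = ast.literal_eval(serialised)

print(serialised)
print(restored == original)
```

The cost is that every reader of the file must know the fields are Python literals, which is the concern raised above.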

Secondly, you have misreported either your code or your output (or both). The code that you show actually produces:

"I like to
 ride my bike",pumpkin sauce
chicken,wings
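
A short sketch that reproduces this, assuming Python 3 and an in-memory buffer rather than bike.csv: csv.writer quotes the field that contains a newline, and csv.reader restores it as a single field.

```python
import csv
import io

# Write the same two rows as in the question, into a string buffer.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter=",")
writer.writerow(["I like to \n ride my bike", "pumpkin sauce"])
writer.writerow(["chicken", "wings"])
written = buffer.getvalue()

# Reading it back: the quoted field spans two physical lines but is one value.
rows = list(csv.reader(io.StringIO(written, newline="")))

print(written)
print(rows)
```

So the embedded newline is not lost or corrupted; a CSV-aware reader sees two rows of two fields each.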

Thirdly, about your "I like to \n ride my bike".encode("utf-8"): in Python 2, str_object.encode("utf-8") is absolutely pointless if str_object contains only ASCII bytes -- it does nothing. Otherwise it raises an exception.
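
For comparison, a sketch of how this is usually handled in Python 3, where the encoding is given when the file is opened rather than by calling encode() on each field. The function name write_utf8_csv, the temporary path and the non-ASCII sample text are assumptions for illustration only.

```python
import csv
import os
import tempfile


def write_utf8_csv(path, rows):
    """Write rows as UTF-8; newline="" lets the csv module control line endings."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        csv.writer(handle).writerows(rows)


# Hypothetical sample data containing non-ASCII characters.
rows = [["I like to ride my vélo", "sauce au potiron"], ["chicken", "wings"]]
path = os.path.join(tempfile.mkdtemp(), "bike.csv")
write_utf8_csv(path, rows)

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8", newline="") as handle:
    read_back = list(csv.reader(handle))

print(read_back == rows)
```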

Fourthly, this comment:

I don't need to call encode anymore, now that I'm using the raw
string. There are a lot of unicode characters in the text that I am
using, so before I started using the raw string I was using encode so
that csv could read the unicode text

doesn't make any sense -- as I've said, "ascii string".encode('utf8') is pointless.

Consider taking a step or two backwards, and explain what you are really trying to do: where does your data come from, what's in it, and most importantly, what is the process that reads the file going to do with it?
