Pandas DataFrame to_csv: escaping double quotes
I am trying to export a pandas DataFrame to CSV. Some of the data contains double quotes, and I can't get them escaped properly.
import pandas as pd
from io import StringIO
inp = [{'c1':10, 'c2':'some text'}, {'c1':11,'c2':'some "text"'}]
df = pd.DataFrame(inp)
output = StringIO()
df.to_csv(output, sep='\t', escapechar='\b', header=False, index=False)
As a result, I get double quotes that are escaped with another double quote:
'10\tsome text\n11\t"some ""text"""\n'
But I need it to be:
'10\tsome text\n11\t"some \x08"text\x08""\n'
I tried different combinations of the doublequote, quotechar and quoting arguments of to_csv(), but with no luck.
The closest I got is:
df.to_csv(output, sep='\t', escapechar='\b', header=False, index=False, doublequote=False)
This results in properly escaped double quotes, but the whole cell is not wrapped in double quotes and thus cannot be parsed correctly in later steps:
'10\tsome text\n11\tsome \x08"text\x08"\n'
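The two outputs above can be reproduced without pandas. As far as I know, to_csv passes these options on to Python's built-in csv.writer, so a minimal sketch using only the standard library shows the same behaviour. The helper name write_rows and the explicit lineterminator='\n' are assumptions made for illustration, not part of the original post.

```python
import csv
from io import StringIO

rows = [[10, 'some text'], [11, 'some "text"']]

def write_rows(**fmt):
    """Write the rows as tab-separated text using the given csv options."""
    buf = StringIO()
    writer = csv.writer(buf, delimiter='\t', lineterminator='\n', **fmt)
    writer.writerows(rows)
    return buf.getvalue()

# doublequote defaults to True: the embedded quote is doubled, and doubling
# forces the writer to wrap the whole field in quotes.
doubled = write_rows(escapechar='\b')

# doublequote=False: the quote is escaped with escapechar instead. Under the
# default quoting mode (csv.QUOTE_MINIMAL) an escaped quote does not force
# the field to be wrapped, so the surrounding quotes disappear.
escaped = write_rows(escapechar='\b', doublequote=False)

print(repr(doubled))
print(repr(escaped))
```

This suggests the missing wrapping comes from the quoting mode rather than from the choice of escape character.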
Is there a way to make pandas escape double quotes with the needed escape character?
PS. Currently my only workaround is to manually replace "" with \x08" in the string buffer.
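For reference, the manual workaround mentioned in the PS might look like the sketch below. It works on the output string quoted in the question; the variable names are illustrative. The second half shows one reason the approach is fragile: a plain string replacement cannot tell a wrapping quote from a doubled one.

```python
# Output of the default to_csv call, copied from the question.
raw = '10\tsome text\n11\t"some ""text"""\n'

# Replace each doubled quote with the escape character followed by one quote.
patched = raw.replace('""', '\x08"')
print(repr(patched))

# Caveat: a value that starts with a quote is mangled. The CSV field below
# encodes the value: "quoted" start
tricky = '"""quoted"" start"'
broken = tricky.replace('""', '\x08"')
# The opening wrapping quote is consumed as part of the first pair, so the
# field no longer starts with a quote character.
print(repr(broken))
```

For the sample data the replacement yields exactly the string the question asks for, but values that begin with a double quote would need a real CSV-aware rewrite.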
Comments (1)
I managed to fix this issue with the following setting:
escapechar="\\"
Since you have already specified doublequote=False, this puts a single backslash before each double quote in your values, which makes sure double quotes are not escaped with another double quote.
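The answer does not say whether the cell ends up wrapped in quotes. Going by the question, it probably does not under the default quoting mode. One possible way to get both the escaping and the wrapping is to also pass a quoting mode, sketched below under that assumption. The choice of csv.QUOTE_NONNUMERIC is mine, not the answerer's, and it also wraps text cells that contain no quotes at all, which differs slightly from the output requested in the question.

```python
import csv
from io import StringIO

import pandas as pd

df = pd.DataFrame([{'c1': 10, 'c2': 'some text'},
                   {'c1': 11, 'c2': 'some "text"'}])

output = StringIO()
df.to_csv(
    output,
    sep='\t',
    escapechar='\\',               # the answer's setting; '\b' works the same way
    doublequote=False,             # escape embedded quotes instead of doubling them
    quoting=csv.QUOTE_NONNUMERIC,  # assumption: wrap every non-numeric cell
    header=False,
    index=False,
)

# splitlines() keeps the check independent of the platform line terminator.
lines = output.getvalue().splitlines()
text_cells = [line.split('\t')[1] for line in lines]
print(text_cells)
```

If every cell should be wrapped regardless of type, csv.QUOTE_ALL is the corresponding constant.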