Export UTF-8 BOM to .csv in R
I am reading a file through RJDBC from a MySQL database, and R displays all letters correctly (e.g., נווה שאנן).
However, even when I export it with write.csv and fileEncoding="UTF-8", the output looks like <U+0436>.<U+043A>. <U+041B><U+043E><U+0437><U+0435><U+043D><U+0435><U+0446>
(in this case a Bulgarian string, not the one above). This happens for Bulgarian, Hebrew, Chinese and so on. Other special characters such as ã and ç work fine.
I suspect this is because of the UTF-8 BOM, but I did not find a solution on the net.
My OS is a German Windows 7.
Edit: I tried
con <- file("file.csv", encoding = "UTF-8")
write.csv(x, con, row.names = FALSE)
and the (afaik) equivalent write.csv(x, file = "file.csv", fileEncoding = "UTF-8", row.names = FALSE).
Comments (2)
The accepted answer did not help me in a similar situation (R 3.1 on Windows, where I was trying to open the file in Excel). Anyway, based on this part of the file documentation:
I came up with the following workaround:
Note that df is the data.frame and filename is the path to the csv file.
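The file help page notes that a BOM, if one is needed, has to be written explicitly, for example with writeBin(as.raw(c(0xef, 0xbb, 0xbf)), con) on a binary connection. A minimal sketch along those lines follows. The helper name write_csv_with_bom and the sample data are hypothetical, and the sketch only covers the BOM itself: it assumes the session can already represent the strings (for instance, a UTF-8 native encoding). Otherwise, characters may still be escaped as <U+....> before they reach the file.

```r
# Hypothetical helper (name and arguments are illustrative only).
write_csv_with_bom <- function(df, filename) {
  # Open in binary mode so the three BOM bytes are written unchanged.
  con <- file(filename, open = "wb")
  on.exit(close(con))
  # UTF-8 byte order mark: EF BB BF
  writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
  # write.csv continues writing on the already-open connection.
  write.csv(df, file = con, row.names = FALSE)
}

# Example data; \u escapes keep the script itself ASCII-only.
df <- data.frame(city = "\u05e0\u05d5\u05d5\u05d4 \u05e9\u05d0\u05e0\u05df",
                 n = 1L, stringsAsFactors = FALSE)
out <- tempfile(fileext = ".csv")
write_csv_with_bom(df, out)

# Inspect the first three bytes of the file (the BOM).
readBin(out, what = "raw", n = 3L)
```

Excel uses the BOM to recognise the file as UTF-8, which is why the bytes go in before any CSV content.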
On the help page for Encoding (help("Encoding")) you can read about the special encoding "bytes". Using it, I was able to generate the csv file by:
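A minimal sketch of the marking step, assuming a hypothetical data frame X with one UTF-8 character column v1. It shows only how a column is declared as "bytes" so that R treats its contents as raw bytes rather than text to be re-encoded. Whether write.csv accepts strings marked this way may depend on the R version, so that call is left commented out.

```r
# Hypothetical example data; \u escapes keep the script itself ASCII-only.
v <- "\u05e0\u05d5\u05d5\u05d4 \u05e9\u05d0\u05e0\u05df"
X <- data.frame(v1 = rep(v, 3), v2 = LETTERS[1:3], stringsAsFactors = FALSE)

# Declare the column as "bytes": the stored bytes stay as they are,
# only the encoding mark changes.
Encoding(X$v1) <- "bytes"
Encoding(X$v1)

# Assumption: the export step would then be an ordinary write.csv call.
# write.csv(X, "test.csv", row.names = FALSE)
```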
Take care with the difference between factor and character columns. The following should work: