R write.csv with UTF-16 encoding
I'm having trouble outputting a data.frame using write.csv using UTF-16 character encoding.
Background: I am trying to write out a CSV file from a data.frame for use in Excel. Excel Mac 2011 seems to dislike UTF-8 (if I specify UTF-8 during text import, non-ASCII characters show up as underscores). I've been led to believe that Excel will be happy with UTF-16LE encoding.
Here's the example data.frame:
> foo
a b
1 á 羽
> Encoding(levels(foo$a))
[1] "UTF-8"
> Encoding(levels(foo$b))
[1] "UTF-8"
So I tried to output the data.frame by doing:
f <- file("foo.csv", encoding="UTF-16LE")
write.csv(foo, f)
This gives me an ASCII file that looks like:
"","
If I use encoding="UTF-16", I get a file that only contains the byte-order mark 0xFE 0xFF.
If I use encoding="UTF-16BE", I get an empty file.
This is on a 64-bit version of R 2.12.2 on Mac OS X 10.6.6. What am I doing wrong?
Comments (2)
You could simply save the CSV in UTF-8 and later convert it to UTF-16LE with iconv in a terminal.
If you insist on doing it in R, the following might work, although it seems that iconv in R does have some issues, see: http://tolstoy.newcastle.edu.au/R/e10/devel/10/06/0648.html
As you can see, the patch linked above is really needed (I have not tested it). But if you want to keep it simple (and nasty), just call the third-party iconv program from inside R with a system call after saving the table to CSV.
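The "save as UTF-8, then convert with the external iconv" route could look roughly like the sketch below. It is not the answerer's code: the helper name `convert_to_utf16le` and the temporary file paths are made up for illustration, and it assumes an `iconv` executable is available on the PATH. It uses `system2` rather than `system` only because passing arguments and redirecting output is a little tidier.

```r
# Hypothetical helper: convert a UTF-8 file to UTF-16LE by calling the
# external `iconv` program (assumed to be installed and on the PATH).
convert_to_utf16le <- function(src, dest) {
  status <- system2(
    "iconv",
    args = c("-f", "UTF-8", "-t", "UTF-16LE", shQuote(src)),
    stdout = dest  # iconv writes to stdout; redirect it into the target file
  )
  if (status != 0) {
    stop("iconv failed with exit status ", status)
  }
  invisible(dest)
}

# Example data similar to the question's; \u escapes keep the script ASCII-only.
foo <- data.frame(a = "\u00e1", b = "\u7fbd")

# Illustrative temporary paths, not the question's foo.csv.
utf8_path <- tempfile(fileext = ".csv")
utf16_path <- tempfile(fileext = ".csv")

# Step 1: write plain UTF-8. Step 2: let the external tool re-encode it.
write.csv(foo, utf8_path, fileEncoding = "UTF-8")
convert_to_utf16le(utf8_path, utf16_path)
```

Converting to UTF-16LE this way should not add a byte-order mark, so whether Excel accepts the result without one is something to check separately.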
Something like this might do (write.csv() simply ignores the encoding, so you have to opt for writeLines() or writeBin()) ...