Annoying "feature" (or bug?) in RODBC
RODBC is the main library in R for importing data from a database. RODBC seems to "guess" the data type of each column, which I find particularly annoying.
I have uploaded a file, test.xls, here, or you may create an xls file yourself:
- Create 2 columns, the first named col_a and the second named col_b.
- Type whatever you like in col_a; I typed letters in this column for 92 rows.
- In the 92nd row of col_b, type a number; I typed "1923" without changing the data type (i.e. not using ').
- Try to import the xls file into R using the following script:
library(RODBC)
setwd("C:/Users/hke775/Documents/Enoch/MISC/R_problems/RODBC")
channel <- odbcConnectExcel("test.xls",readOnly=TRUE)
dummy.df <- sqlFetch(channel,"Sheet1")
odbcClose(channel)
You will see that in dummy.df, col_b is all NA; the 1923 in this column is gone.
If you want to see the 1923 again, you can change the 1st row of col_b to a number, and it comes back.
This is very annoying, as I would rather not modify the data manually. I would need another package to import the xls file, but I can't find one that works as smoothly as RODBC (I tried gdata and xlsReadWrite).
Did I miss anything in the sqlFetch command that is causing the trouble? Thanks.
Comments (1)
Please don't blame R or RODBC for Microsoft's bugs... ;)
I tried the fix in KB189897 by setting the TypeGuessRows value to 0, and look what happens!
Please, no up-votes or check marks... just send cash. :)
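To make the idea behind TypeGuessRows a bit more concrete, here is a toy model in plain R of what row-sampled type guessing might look like. It is only a sketch: guess_type() is a hypothetical helper, not part of RODBC or the Excel driver, and the default of 8 sampled rows and the "0 means scan everything" rule are assumptions taken from how the setting is commonly described, not something verified here.

```r
# Toy model of row-sampled type guessing.
# NOTE: guess_type() is a hypothetical helper for illustration only; it is
# NOT the Excel driver's real algorithm and NOT an RODBC function.
guess_type <- function(cells, guess_rows = 8) {
  # Assumption: guess_rows = 0 means "look at every row".
  n <- if (guess_rows == 0) length(cells) else min(guess_rows, length(cells))
  sampled <- cells[seq_len(n)]
  sampled <- sampled[!is.na(sampled)]        # ignore empty cells
  if (length(sampled) == 0) return("unknown")  # nothing to base a guess on
  is_num <- !is.na(suppressWarnings(as.numeric(sampled)))
  if (all(is_num)) "numeric" else "text"
}

# col_b from the question: empty cells, then a single number near the bottom.
col_b <- c(rep(NA_character_, 91), "1923")

guess_default  <- guess_type(col_b)                  # samples only the first 8 rows
guess_all_rows <- guess_type(col_b, guess_rows = 0)  # samples every row
print(c(default = guess_default, all_rows = guess_all_rows))
```

In this toy model the default sample contains nothing but empty cells, so no sensible type can be chosen, while sampling every row finds the 1923. That would be consistent with the question's observation that putting a number in the first row of col_b brings the value back.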