RODBC(有时)不读取数字
我正在使用 ROBDC 将 Excel 工作表读入 R。然而,我遇到了一些奇怪的结果。有时,重新启动 R 控制台会有所帮助,但我现在尝试了 3 次,并且以下错误(?)仍然存在。
Excel 工作表包含数字块,我想将其读入数组中。
这是我的 sqlQuery:
channel <- odbcConnectExcel(paste("~link to file~"))
as.matrix(sqlQuery(channel,paste("select F13,F14,F15 from \"",land,i,"$\"",sep="")))
这是(部分)结果:
32 NA NA NA
33 NA NA "Pf7"
34 "2.079975e+02" "6.762166e+01" NA
35 "1.721265e+02" "5.595971e+01" NA
36 "1.373673e+02" "4.465921e+01" NA
37 "1.200835e+02" "3.904013e+01" NA
38 "1.411489e+02" "4.588864e+01" NA
39 NA NA NA
40 NA NA NA
41 NA NA NA
42 NA NA "Pf7"
43 "1.443768e+00" "4.693807e-01" NA
44 "1.194778e+00" "3.884318e-01" NA
45 "9.535044e-01" "3.099920e-01" NA
46 "8.335332e-01" "2.709883e-01" NA
47 "9.797535e-01" "3.185257e-01" NA
48 NA NA NA
49 NA NA NA
在 Excel 工作表中,每个数字块前面都有 Pf5:Pf7,每个描述 Pf5:Pf7 后面都跟有 5 个数字。所有单元格均采用数字格式。 Excel 表格可以在此处下载。
在 F13 和 F14 列中,RODBC 删除了描述(这很奇怪,但对我来说没问题),在 F15 列中它删除了数字(这对我来说根本不合适!)
奇怪的是,完全相同的代码运行得很好昨天,我没有更新任何 R 文件或以任何方式更改 Excel 工作表,所以我对可能导致错误的原因感到困惑。
我很感激任何可能有帮助的评论。
谢谢, 马丁
更新: 我尝试使用 odbcConnectExcel2007 而不是 odbcConnectExcel,结果如下:
22 NA NA NA
23 NA NA NA
24 NA NA "Pf7"
25 "7.730711e+01" "1.958429e+01" "12,638675305"
26 "4.126044e+01" "1.045255e+01" "6,7455279987"
27 "4.293419e+01" "1.087656e+01" "7,0191635751"
28 "4.278764e+01" "1.083944e+01" "6,9952057822"
29 "4.073317e+01" "1.031897e+01" "6,6593269202"
30 NA NA NA
31 NA NA NA
32 NA NA NA
33 NA NA "Pf7"
34 "9.001111e+01" "2.280260e+01" "14,715610176"
35 "4.804083e+01" "1.217023e+01" "7,854032014"
36 "4.998963e+01" "1.266392e+01" "8,1726345871"
37 "4.981901e+01" "1.262070e+01" "8,1447397695"
38 "4.742692e+01" "1.201471e+01" "7,7536653665"
39 NA NA NA
40 NA NA NA
“,”而不是“。”可能是我使用德语版 Excel 造成的。我可以使用 sub() 来替换逗号,但这 (1) 不是一个优雅的解决方案,并且 (2) 没有解释为什么它适用于前两列,但不适用于第三列。 (也不是为什么它昨天有效但今天不起作用......)
I am reading Excel sheets into R using ROBDC. However, I encountered some strange results. Sometimes, restarting the R console will help, but I tried 3 times now, and the following bug (?) persists.
The Excel sheet contains blocks of numbers, which I want read into an array.
This is my sqlQuery:
channel <- odbcConnectExcel(paste("~link to file~"))
as.matrix(sqlQuery(channel,paste("select F13,F14,F15 from \"",land,i,"$\"",sep="")))
Here is (a part of) the result:
32 NA NA NA
33 NA NA "Pf7"
34 "2.079975e+02" "6.762166e+01" NA
35 "1.721265e+02" "5.595971e+01" NA
36 "1.373673e+02" "4.465921e+01" NA
37 "1.200835e+02" "3.904013e+01" NA
38 "1.411489e+02" "4.588864e+01" NA
39 NA NA NA
40 NA NA NA
41 NA NA NA
42 NA NA "Pf7"
43 "1.443768e+00" "4.693807e-01" NA
44 "1.194778e+00" "3.884318e-01" NA
45 "9.535044e-01" "3.099920e-01" NA
46 "8.335332e-01" "2.709883e-01" NA
47 "9.797535e-01" "3.185257e-01" NA
48 NA NA NA
49 NA NA NA
In the Excel sheet, every block of numbers is preceded by Pf5:Pf7, and every description Pf5:Pf7 is followed by 5 numbers. All cells are formatted as number. The Excel sheet can be downloaded here.
In columns F13 and F14, RODBC drops the descriptions (which is strange but fine with me), in column F15 it drops the numbers (which is not fine with me at all!)
The strange thing is that exactly the same code worked just fine yesterday, and I did not update any R files or change the Excel sheets in any way, so I am puzzled regarding what might cause the error.
I am grateful for any comments that might help.
Thanks,
Martin
UPDATE:
I tried using odbcConnectExcel2007 instead of odbcConnectExcel, here are the results:
22 NA NA NA
23 NA NA NA
24 NA NA "Pf7"
25 "7.730711e+01" "1.958429e+01" "12,638675305"
26 "4.126044e+01" "1.045255e+01" "6,7455279987"
27 "4.293419e+01" "1.087656e+01" "7,0191635751"
28 "4.278764e+01" "1.083944e+01" "6,9952057822"
29 "4.073317e+01" "1.031897e+01" "6,6593269202"
30 NA NA NA
31 NA NA NA
32 NA NA NA
33 NA NA "Pf7"
34 "9.001111e+01" "2.280260e+01" "14,715610176"
35 "4.804083e+01" "1.217023e+01" "7,854032014"
36 "4.998963e+01" "1.266392e+01" "8,1726345871"
37 "4.981901e+01" "1.262070e+01" "8,1447397695"
38 "4.742692e+01" "1.201471e+01" "7,7536653665"
39 NA NA NA
40 NA NA NA
The "," instead of the "." could be caused by me using a German version of Excel. I could use sub() to replace the commas, but this (1) would not be an elegant solution, and (2) does not explain why it works with the first two cols but not with the third. (And also not why it worked yesterday but not today...)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
查看 Excel 文件。单元格的内容实际上是数字吗? (如果您查看公式栏,类似
'123
的内容会显示单元格内容是字符串,而不是数字。)您还可以考虑使用,例如
read.xls< /code> 在
xlsReadWrite
包中。这比使用 SQL 查询要简单一些。 (同样,对于 Excel 2007 及以上版本,xlsx
包中的read.xlsx
。)Take a look in the Excel file. Are the contents of the cell actually numbers? (If you look in the formula bar, something like
'123
reveals the cell contents to be a string, not a number.)You could also look into using, e.g.,
read.xls
in thexlsReadWrite
package. It's a little simpler than having to use SQL queries. (Likewise,read.xlsx
in thexlsx
package for Excel 2007 onwards.)不要将
as.matrix
用于可能具有不同列数据类型的数据框。您的代码不是独立的,但是如果您将代码分解为可理解和可重现的内容,那么发生的事情应该非常明显。
使用问题中提供的 zip 中的文件“test2.xls”:
F13
和F14
对我来说看起来像数字。如果我们用
as.matrix
丢弃所有结构会发生什么?如果我们对任何旧数据框执行此操作会发生什么?
如果存在数字和字符或因子列,则矩阵将被强制转换为字符。
这与 RODBC 完全无关。
Do not use
as.matrix
for a dataframe that may have different column data types.Your code is not self-contained, but it should be pretty obvious what is going on if you decompose your code into something understandable and reproducible.
Using the file "test2.xls" in the zip provided in the question:
F13
andF14
look like numbers to me.What happens if we trash all that structure with
as.matrix
?What happens if we do that to any old dataframe?
The matrix is coerced to character if there are columns of numeric and character or factor.
This has absolutely nothing to do with
RODBC
.我无法使用 R-2.13.0 和 RODBC_1.3-2 复制您的问题。
由于我无法复制该问题,我不知道如何提供帮助...您说您没有更新任何 R 文件;这是否意味着您没有更改任何程序?您是否升级了 R 安装和/或升级了任何软件包?
I can't replicate your issue using R-2.13.0 and RODBC_1.3-2.
Since I can't replicate the issue, I'm not sure how to help... You say you didn't update any R files; does that mean you didn't change any of your programs? Have you upgraded your R installation and/or upgraded any packages?