"Tricking" SQL Server into accepting a file from R
I have a file (roughly 1,000,000 rows and 15 columns) in my local R environment, and I am trying to upload it to a SQL server:
library(odbc)
library(DBI)
library(RODBC)
dbWriteTable(my_connection, SQL("my_table"), my_table)
But I am not able to upload the entire table to the server; when I run this code I get the error:
Error in result_insert_dataframe(rs@prt, values): nanodbc/nanodbc.cpp:1587 : HY008 : Operation canceled
However, I noticed that I am able to successfully upload "parts" of this table onto the server:
#this works fine
my_table_1 = my_table[1:1000,]
dbWriteTable(my_connection, SQL("my_table_1"), my_table_1)
This leads me to believe that somewhere in "my_table" there are rows that the server is not accepting, but I am not sure how to identify those rows.
- Is there some way to find out which rows are causing the problem, or is there perhaps a way to "trick"/"force" the server into accepting these rows?
Here is a summary ("str" command) of the file in R (global environment):
- var1: chr
- var2: chr
- var3: chr
- var4: chr
- var5: chr
- var6: chr
- var7: chr
- var8: chr
- var9: chr
- var10: chr
- var11: chr
- var12: num
- var13: num
- var14: num
- var15: chr
And here is a summary of the variable types for the first 100 rows of this file, which were successfully uploaded to the SQL server:
- var1: CHARACTER VARYING(255)
- var2: CHARACTER VARYING(255)
- var3: CHARACTER VARYING(255)
- var4: CHARACTER VARYING(255)
- var5: CHARACTER VARYING(255)
- var6: CHARACTER VARYING(255)
- var7: CHARACTER VARYING(255)
- var8: CHARACTER VARYING(255)
- var9: CHARACTER VARYING(255)
- var10: CHARACTER VARYING(255)
- var11: CHARACTER VARYING(255)
- var12: DOUBLE PRECISION
- var13: DOUBLE PRECISION
- var14: DOUBLE PRECISION
- var15: CHARACTER VARYING(255)
Based on these summaries:
- Is there some way to find out which rows are causing the problem, or is there perhaps a way to "trick"/"force" the server into accepting these rows?
Thank you!
2 Answers
It might be that your dataset is too big to write to the database in a single operation. I remember having similar issues a while ago, and it would make sense, since 1,000 rows at a time seems to work for you. Splitting the dataset into chunks and writing them one by one might work.
This is untested, but maybe this works:
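A minimal sketch of that chunked approach (assuming the my_connection handle and my_table data frame from the question; chunk_size is an illustrative value):

library(DBI)

chunk_size <- 1000
starts <- seq(1, nrow(my_table), by = chunk_size)

for (i in seq_along(starts)) {
  rows <- starts[i]:min(starts[i] + chunk_size - 1, nrow(my_table))
  if (i == 1) {
    # the first chunk creates (or overwrites) the target table
    dbWriteTable(my_connection, SQL("my_table"), my_table[rows, ], overwrite = TRUE)
  } else {
    # subsequent chunks are appended to the existing table
    dbWriteTable(my_connection, SQL("my_table"), my_table[rows, ], append = TRUE)
  }
}

If one chunk fails, the loop stops at that chunk, which also narrows down roughly where the problem rows sit.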
If you do have access to the database console or an admin UI, it might be faster to import the data that way.
A similar solution is suggested here in SQL.
Your database columns all have a max length of 255 characters. dbWriteTable will fail if any of your character vectors contain strings longer than that. Use lapply(my_table, function(x) max(nchar(x))) to find out.
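A slightly expanded sketch of that check (assuming some columns may contain NA values, which would otherwise make max(nchar(x)) return NA):

# longest string per character column; non-character columns are skipped
max_lengths <- sapply(my_table, function(x) {
  if (is.character(x)) max(nchar(x), na.rm = TRUE) else NA_integer_
})
# columns whose longest string exceeds the 255-character limit
max_lengths[!is.na(max_lengths) & max_lengths > 255]

Rows with over-long strings in a given column can then be located with, for example, which(nchar(my_table$var1) > 255).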
If that doesn't help, split the data.frame into chunks and use purrr::map in combination with purrr::possibly to loop over them. Then check the output to see which chunks failed, cut those into smaller chunks and try again.
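A sketch of that chunk-and-retry pattern (untested; it assumes the target table already exists on the server, so every chunk is appended):

library(DBI)
library(purrr)

# possibly() turns an error into a NULL result instead of stopping the loop
safe_write <- possibly(
  function(chunk) dbWriteTable(my_connection, SQL("my_table"), chunk, append = TRUE),
  otherwise = NULL
)

# split the data frame into chunks of 1,000 rows each
chunks <- split(my_table, ceiling(seq_len(nrow(my_table)) / 1000))

results <- map(chunks, safe_write)

# chunks that came back NULL failed; their names index back into `chunks`
names(results)[map_lgl(results, is.null)]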