"Tricking" SQL Server into accepting a file from R

Posted 2025-01-13 07:57:21

I have a file (roughly 1000000 rows and 15 columns) in my local environment in R, and I am trying to upload this file onto a SQL server:

library(odbc)
library(DBI)
library(RODBC)

dbWriteTable(my_connection,  SQL("my_table"), my_table)

But I am not able to upload the entire table to the server; when I run this code I get the error:

Error in result_insert_dataframe(rs@prt, values):  nanodbc/nanodbc.cpp:1587 : HY008 : Operation canceled

However, I noticed that I am able to successfully upload "parts" of this table onto the server:

#this works fine

my_table_1 = my_table[1:1000,]

dbWriteTable(my_connection,  SQL("my_table_1"), my_table_1)

This is leading me to believe that somewhere in "my_table" there are some rows which are not being accepted by the server - but I am not sure how to correctly identify these rows.

  • Is there some way to either find out which rows are causing the problems, or perhaps is there a way to "trick"/"force" the server to accept these rows?

Here is a summary (the str() output) of the file in the R global environment:

- var1: chr
- var2: chr
- var3: chr
- var4: chr
- var5: chr
- var6: chr
- var7: chr
- var8: chr
- var9: chr
- var10: chr
- var11: chr
- var12: num
- var13: num
- var14: num
- var15: chr

And here is a summary of the variable types for the first 100 rows of this file, which were successfully uploaded to SQL Server:

- var1: CHARACTER VARYING (255)
- var2: CHARACTER VARYING (255)
- var3: CHARACTER VARYING (255)
- var4: CHARACTER VARYING (255)
- var5: CHARACTER VARYING (255)
- var6: CHARACTER VARYING (255)
- var7: CHARACTER VARYING (255)
- var8: CHARACTER VARYING (255)
- var9: CHARACTER VARYING (255)
- var10: CHARACTER VARYING (255)
- var11: CHARACTER VARYING (255)
- var12: DOUBLE PRECISION
- var13: DOUBLE PRECISION
- var14: DOUBLE PRECISION
- var15: CHARACTER VARYING (255)

Based on these summaries:

  • Is there some way to either find out which rows are causing the problems, or perhaps is there a way to "trick"/"force" the server to accept these rows?

Thank you!

Comments (2)

依 靠 2025-01-20 07:57:21

It might be that your dataset is too big to write to the database in a single operation. I remember running into similar issues a while ago, and it would make sense, since 1,000 rows at a time seem to work for you. Splitting the dataset into chunks and writing them one by one might work.

This is untested, but maybe this works:

# split the data frame into chunks of 1,000 rows
my_table_chunks <-
  split(my_table, ceiling(seq_len(nrow(my_table)) / 1000))

# write each chunk, appending to the same table
lapply(my_table_chunks,
       function(x) dbWriteTable(my_connection, SQL("my_table"), x, append = TRUE))

If you do have access to the database console or an admin UI, it might be faster to import the data that way.

A similar solution is suggested here in SQL.
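
For that bulk-import route, a minimal R-side sketch could simply export the data frame to a flat file for the server to load; the file name and the use of data.table::fwrite here are my assumptions, not part of the original answer:

library(data.table)

# write the data frame to a CSV that can then be bulk-loaded on the server side,
# for example with SQL Server's import wizard or bcp (the file name is just an example)
fwrite(my_table, "my_table.csv")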

灰色世界里的红玫瑰 2025-01-20 07:57:21

Your database columns all have a max length of 255 characters. dbWriteTable will fail if any of your character vectors contain strings longer than that. Use lapply(my_table, function(x) max(nchar(x))) to find out.
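
Untested, but a minimal base-R sketch of that check might look like this (the 255-character limit comes from the column types shown in the question):

# names of the columns that are character vectors in R
chr_cols <- names(Filter(is.character, my_table))

# maximum string length in each character column
sapply(my_table[chr_cols], function(x) max(nchar(x), na.rm = TRUE))

# row numbers where any character column is longer than 255 characters
too_long <- Reduce(`|`, lapply(my_table[chr_cols],
                               function(x) !is.na(x) & nchar(x) > 255))
which(too_long)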

If that doesn't help, split the data.frame into chunks and use purrr::map in combination with purrr::possibly to loop over them. Then check the output to see which chunks failed, cut those into smaller chunks and try again.
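
Untested, but a sketch of that chunk-by-chunk approach could look like the following; it reuses my_connection and the table name from the question and assumes a chunk size of 1,000 rows, as in the other answer:

library(purrr)

# split the data frame into chunks of 1,000 rows
chunks <- split(my_table, ceiling(seq_len(nrow(my_table)) / 1000))

# wrap dbWriteTable() so a failing chunk returns NA instead of stopping the loop
safe_write <- possibly(
  function(x) dbWriteTable(my_connection, SQL("my_table"), x, append = TRUE),
  otherwise = NA
)

results <- map(chunks, safe_write)

# dbWriteTable() returns TRUE invisibly on success, so the chunks that failed
# are the entries that came back as NA
names(results)[map_lgl(results, is.na)]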
