How to import a very large csv file into an existing SQL Server table?
I have a very large csv file with ~500 columns, ~350k rows, which I am trying to import into an existing SQL Server table.
I have tried BULK INSERT and I get: Query executed successfully, 0 rows affected. Interestingly, BULK INSERT worked, in a matter of seconds, for a similar operation but for a much smaller csv file, less than 50 cols., ~77k rows.
I have also tried bcp and I get: Unexpected EOF encountered in BCP data-file. BCP copy in failed.
The task is simple; it shouldn't be this frustrating. Any ideas or suggestions? Are there any other tools or utilities that you have successfully used to accomplish a bulk import operation or something similar? Thanks.
-- BULK INSERT
USE myDb
BULK INSERT myTable
FROM 'C:\Users\myFile.csv'
WITH
(
FIRSTROW = 2,
-- DATAFILETYPE = 'char',
-- MAXERRORS = 100,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);
-- bcp
bcp myDb.dbo.myTable in 'C:\Users\myFile.csv' -T -t, -c
UPDATE
I have now changed course. I've decided to join the csv files, which was my goal to begin with, outside of SQL Server so that I don't have to upload the data to a table for now. However, it'll be interesting to try to upload (BULK INSERT or 'bcp') only 1 record (~490 cols.) from the csv file, which otherwise failed, and see if it works.
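The one-record test described above needs a small file to feed to BULK INSERT or bcp. A minimal sketch of cutting one out with Python's standard `csv` module follows; the function name `write_sample` and the file names are hypothetical, and the sketch assumes the file is UTF-8 with a single header row. Note that `csv.writer` re-serializes each row with CRLF terminators, so quoting may differ slightly from the source file.

```python
import csv
from itertools import islice


def write_sample(src_path, dst_path, n_rows=1, encoding="utf-8"):
    """Copy the header line plus the first n_rows data rows to dst_path.

    Returns the number of lines written (header included).
    """
    written = 0
    with open(src_path, newline="", encoding=encoding) as src, \
            open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src)
        # Write explicit CRLF terminators so the sample has one known row ending.
        writer = csv.writer(dst, lineterminator="\r\n")
        for row in islice(reader, n_rows + 1):  # +1 accounts for the header
            writer.writerow(row)
            written += 1
    return written


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # e.g. python sample.py myFile.csv myFile_sample.csv
        print(write_sample(sys.argv[1], sys.argv[2]))
```

If the one-record sample loads and the full file does not, the problem is more likely in the data (row terminators, stray delimiters, oversized values) than in the command itself.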
Comments (4)
Check your file for an EOF character where it shouldn't be - BCP is telling you there is a problem with the file.
Notepad++ may be able to load the file for you to view and search.
Most likely the last line lacks a \n. Also, SQL Server limits the row size to 8060 bytes, although T-SQL should have mentioned this. However, check this link:
My advice: start with one row and get it to work. Then do the rest.
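The row-terminator guess can be checked without opening the file in an editor. The sketch below (the function name `inspect_line_endings` is hypothetical) reads the file in binary chunks, counts CRLF and bare-LF line endings, and reports whether the last byte is a newline. It is relevant because, as far as Microsoft's documentation describes it, specifying '\n' as the row terminator makes BULK INSERT and bcp expect a CRLF pair, so a file with bare LF endings may need ROWTERMINATOR = '0x0a' instead.

```python
def inspect_line_endings(path, chunk_size=1 << 20):
    """Count CRLF and bare-LF line endings and check for a final newline."""
    crlf = 0
    lf_total = 0
    last_byte = b""
    with open(path, "rb") as f:
        # Read fixed-size binary chunks until read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            lf_total += chunk.count(b"\n")
            crlf += chunk.count(b"\r\n")
            # A CRLF pair split across two chunks is missed by count() above.
            if last_byte == b"\r" and chunk.startswith(b"\n"):
                crlf += 1
            last_byte = chunk[-1:]
    return {
        "crlf": crlf,
        "lf_only": lf_total - crlf,
        "ends_with_newline": last_byte == b"\n",
    }


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        print(inspect_line_endings(sys.argv[1]))
```

A nonzero `lf_only` count alongside a '\n' terminator, or `ends_with_newline` being false, would both be consistent with the symptoms in the question, though neither is proof of the cause.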
How are you mapping the fields in the file to the columns in the table? Is the number of columns in the table the same as the number of fields in the file? Or are you using a format file to specify the column mapping? If so, is the format file formatted correctly?
If you are using a format file and the "Number of columns" parameter is wrong, it will cause the error "Unexpected end of file". See this for some other errors/issues with bulk uploading.
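One way to answer the column-count question for the file itself is to tally how many fields each row has. The following is a sketch only, with a hypothetical function name `field_count_histogram` and an assumed UTF-8 encoding. With `honor_quotes=False` every delimiter counts as a separator, which approximates how a plain field terminator splits a line; with `honor_quotes=True` a quoted value containing commas counts as one field.

```python
import csv
from collections import Counter


def field_count_histogram(path, delimiter=",", honor_quotes=True, encoding="utf-8"):
    """Return a Counter mapping 'fields per row' to the number of such rows."""
    # QUOTE_NONE makes the reader treat quote characters as ordinary data.
    quoting = csv.QUOTE_MINIMAL if honor_quotes else csv.QUOTE_NONE
    counts = Counter()
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f, delimiter=delimiter, quoting=quoting):
            counts[len(row)] += 1
    return counts


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        for n_fields, n_rows in sorted(field_count_histogram(sys.argv[1]).items()):
            print(n_fields, "fields:", n_rows, "rows")
```

A clean file produces a single entry equal to the table's column count. More than one entry, or a difference between the two modes, points at rows whose fields will not line up with the table.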
It is probably not the solution you're expecting, but with Python you could create a table out of the csv very easily (just uploaded a 1GB CSV file):
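The code that accompanied this answer is not preserved here, so what follows is a separate sketch of the same idea, not the answerer's script. It assumes a DB-API 2.0 connection that uses `?` placeholders (pyodbc and the standard-library `sqlite3` both do), a header row that supplies the column names, and a single text type for every column. The names `load_csv` and `quote_ident` and the default `NVARCHAR(255)` type are illustrative assumptions. With around 500 columns the 8060-byte row limit mentioned earlier may still apply, so the column type would need to be chosen with that in mind.

```python
import csv
from itertools import islice


def quote_ident(name):
    """Quote an identifier with standard SQL double quotes."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn, path, table, column_type="NVARCHAR(255)",
             batch_size=1000, encoding="utf-8"):
    """Create `table` from the CSV header and insert all rows in batches.

    `conn` is any DB-API connection using '?' placeholders (an assumption).
    Returns the number of data rows inserted.
    """
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(f"{quote_ident(c)} {column_type}" for c in header)
        cur = conn.cursor()
        cur.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
        placeholders = ", ".join("?" for _ in header)
        insert = f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})"
        total = 0
        while True:
            # Pull rows in batches so the whole file is never held in memory.
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            cur.executemany(insert, batch)
            total += len(batch)
        conn.commit()
    return total


if __name__ == "__main__":
    import sqlite3
    import sys
    if len(sys.argv) == 3:
        # Demonstration against an in-memory SQLite database, not SQL Server.
        demo = sqlite3.connect(":memory:")
        print(load_csv(demo, sys.argv[1], sys.argv[2]), "rows inserted")
```

A row whose field count differs from the header makes the insert raise an error, which at least names the failure instead of silently reporting 0 rows.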