无法将大型平面文件导入 SQL Server 2008 表
我有一个 18GB 的平面文件(40,000,000 条记录),具有固定的列宽(无字段终止符),我想将其读入 SQL Server 2008 R2 表中。除了包含数据的文本文件之外,我还获得了一个包含字段名称和长度的 Excel 文档。每条记录(每行)有 270 个字段,总共 465 个字符。我使用 bcp 创建了一个 fmt 文件,对我来说看起来不错。
10.0
270
1 SQLCHAR 2 1 "" 1 TitleCode SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 2 12 "" 2 FamilyID SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 2 12 "" 3 LocationID SQL_Latin1_General_CP1_CI_AS
在 SQL Server 中,
我想使用此 fmt 文件将数据读入表中:
BULK INSERT dbo.Customer2_noId
FROM 'C:\Uploads\dataFile_MICX\dataFile_MICX_Copy.txt'
WITH (FORMATFILE = 'C:\Users\kriss\SqlScripts\Customer2_noId-n.fmt');
GO
来自 SQL Server 的错误消息:
消息 4866,级别 16,状态 7,第 1 行
批量加载失败。该列是 数据文件中第 1 行太长, 第 1 列。验证该字段 终止符和行终止符是 指定正确。
消息 7399,级别 16,状态 1,第 1 行
用于链接的 OLE DB 提供程序“BULK” 服务器“(null)”报告错误。这 提供商未提供任何信息 关于错误。
消息 7330,级别 16,状态 2,第 1 行
无法从 OLE DB 获取行 链接服务器的提供者“BULK” “(空)”。
我尝试将最后一个字段的归档终止符更改为“\r”和“\r\n” 我尝试在底部添加一个额外的字段:
271 SQLCHAR 0 0 "\r\n" 271 dummy SQL_Latin1_General_CP1_CI_AS
没有区别。
我在网上找不到任何有帮助的东西。 (建议在 fmt 文件中添加额外的空行,但这并不能解决问题。我认为数据文件有行终止符,因为如果我使用 Excel 数据导入工具,我会看到长度一致的行。
有人可以帮忙吗? 谢谢,克里斯
I have an 18GB flat file (40,000,000 records), with fixed column widths (no field terminators), which I would like to read into a SQL Server 2008 R2 table. In addition to the text file with the data, I was given an Excel document with the field names and lengths. There are 270 fields with 465 total characters per record (per row). Using bcp I have created an fmt file, which looks fine to me.
10.0
270
1 SQLCHAR 2 1 "" 1 TitleCode SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 2 12 "" 2 FamilyID SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 2 12 "" 3 LocationID SQL_Latin1_General_CP1_CI_AS
etc.
In SQL Server I want to use this fmt file to read the data into the table:
BULK INSERT dbo.Customer2_noId
FROM 'C:\Uploads\dataFile_MICX\dataFile_MICX_Copy.txt'
WITH (FORMATFILE = 'C:\Users\kriss\SqlScripts\Customer2_noId-n.fmt');
GO
Error Messages from SQL Server:
Msg 4866, Level 16, State 7, Line 1
The bulk load failed. The column is
too long in the data file for row 1,
column 1. Verify that the field
terminator and row terminator are
specified correctly.Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked
server "(null)" reported an error. The
provider did not give any information
about the error.Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB
provider "BULK" for linked server
"(null)".
I have tried changing the filed terminator for the last field to "\r" and "\r\n"
I have tried adding an extra field at the bottom:
271 SQLCHAR 0 0 "\r\n" 271 dummy SQL_Latin1_General_CP1_CI_AS
Makes no difference.
I have not been able to find anything on line which helps. (Extra blank line in fmt file is suggested, but that doesn't fix it. I think the data file has line terminators, because if I use the Excel data import tool, I see consistent-length lines.
Can someone help?
Thanks, Kriss
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
由于没有动手,我只能提供一些想法和建议:
首先,你需要判断是否有行终止符——可能没有 /r /n、char(13)+char(10) 、空终止符或文件中的任何内容。
尝试获取该文件的副本,该副本仅包含要使用的前 10、100、也许 1000 行。 (创建这个将使查找行终止符是否存在变得更加容易)。这样的编辑器有很多。我过去使用过 LTFViewr,不知道他们现在是什么版本。
我记得 BCP 对行和列分隔符非常挑剔。如果有些东西不匹配,它就会失败。 (我记得必须调整文件中奇怪的第一行或最后一行才能使 BCP 正常工作。)
您是否考虑过使用集成服务 (SSIS)?它更加灵活和适应性强,尽管它确实代表了显着不同的学习曲线。它的一大优势是微软对 2005 版本进行了改进,我怀疑他们在过去六年里是否对 BCP 进行过代码审查。
Without hands-on work, I can only offer some ideas and suggestions:
First off, you need to determine whether there is a row terminator or not -- there may be no /r /n, char(13)+char(10), null-terminator, or whatever in the file.
Try and get a copy of the file, one that only contains the first 10, 100, maybe 1000 rows to work with. (Creating this would make it much easier to find the presence or absence of row terminators). There are a large number of such editors out there; I have used LTFViewr in the past, not sure what version they're up to now.
I recall BCP being very fussy about it's row and column delimiters. If something didn't match up, it'd just fail. (I recall having to tweak the odd first or last line in a file to get BCP to work.)
Have you considered using Intergration Services (SSIS)? It is much more flexible and adaptable, though it does represent a significantly different learning curve. It has the big advantage that Microsoft has improved it over the 2005 version, where I doubt they've even done a code review on BCP in the past six years.