逗号导致使用 BULK INSERT 和格式文件出现问题
我正在尝试使用 SQL Server 上的 BULK INSERT 和格式文件导入 .CSV 文件。我已经创建了格式文件并为每个值设置了正确的字段终止符。格式文件已正确验证,并且我已成功导入一些测试数据。
但是,当我在引用文本中间遇到逗号时遇到问题,如下所示:
“SB”,“7Y4386429Y0351805”,“B5503”,“”,“Chris”,1,10,“Paypal”,“Hi抱歉由于付款延迟,我会尽快付款。”
(示例数据)
我已设置格式文件以正确识别带引号和不带引号的字段,但最后一个字段中间有一个逗号,这会在导入时产生错误。
我将 FieldTerminator 设置为“,”,并尝试将其完全保留,但是是否应该忽略终止符,而使用格式文件中的终止符?
我不明白为什么逗号会导致字段结束,因为该字段被引号包围。这是设计使然吗?当然应该忽略逗号,因为它与格式文件中的任何字段终止符都不匹配?
我在这里有什么选择?有问题的数据来自 Paypal,因此他们改变数据导出方式的可能性不大。
有什么想法吗?
如果重要的话,我在下面显示了格式文件的第一部分(实际文件包含 56 行)。所有 56 行都具有相同的 SQLCHAR 数据类型、相同的长度和相同的结束值。
10.0
56
1 SQLCHAR 0 0 "\"" 0 DUMMY SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\",\"" 1 CH SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 100 "\",\"" 2 TransactionID SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 100 "\",\"" 3 InvoiceID SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 100 "\",\"" 4 PayPalReferenceID SQL_Latin1_General_CP1_CI_AS
任何帮助将不胜感激。
谢谢,
克里斯。
I'm trying to import .CSV files using BULK INSERT on SQL Server and a Format File. I have created the format file and set the correct field terminators for each value. The format file has been correctly validated and I have successfully imported some test data.
However I have a problem when encountering comma's in the middle of quoted text, such as the following:
"SB","7Y4386429Y0351805","B5503","","Chris",1,10,"Paypal","Hi sorry for the delay in payment, I will be sending payment soon."
(example data)
I've set the format file to correctly identify fields which are quoted and non-quoted, but the last field has a comma in the middle, and this is creating an error whilst importing.
I have the FieldTerminator set to "," and have tried leaving it out completely too, but shouldn't the terminator be ignored and instead the ones in the format file used?
I don't understand why the comma would cause the field to end, as the field is surrounded by quotation marks. Is this by design? Surely the comma should be ignored as it doesnt match any of the field terminators in the format file?
What are my options here? The data in question is coming from Paypal so there's not much chance of them changing the way it's exported.
Any ideas?
If it matters, I have shown the first part of my format file below (the real file contains 56 rows). All 56 rows have the same SQLCHAR datatype, the same length and the same end value.
10.0
56
1 SQLCHAR 0 0 "\"" 0 DUMMY SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\",\"" 1 CH SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 100 "\",\"" 2 TransactionID SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 100 "\",\"" 3 InvoiceID SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 100 "\",\"" 4 PayPalReferenceID SQL_Latin1_General_CP1_CI_AS
Any help would be much appreciated.
Thanks,
Chris.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
据我所知,你做得对 - 从这里获取知识:
http://www.sqlservercentral.com/Forums/Topic18289-8-1.aspx#bm87418
本质上,将分隔符更改为“\”,\“”应该足够了,因为最后一个字段中间的逗号是不是“,”,但是,
尝试设置链接中的第一个和最后一个分隔符(“\”,\“”)和(“\”\r”),看看是否有帮助?
或者,预处理文件并替换“, " 用诸如 ##$## 之类的垃圾内容并将 , 替换为 . (或其他字符),然后将 ##$## 替换为“,”,然后导入?除非 , 在最后一个字段中至关重要,否则点通常会起作用窍门。
You've done it right as far as I can see - taking knowledge from here:
http://www.sqlservercentral.com/Forums/Topic18289-8-1.aspx#bm87418
Essentially, changing the separator to "\",\"" should be enough as the comma mid the last field is not "," but ,
Try setting the first and last separators as in the link ("\",\"") and ("\"\r") and see if that helps?
Or, preprocess the files and replace "," with some junk like ##$## and replace , with . (or some other character) and then ##$## with "," and then import? Unless the , is vital in the last field, a dot usually does the trick.
除此之外,FIELDTERMINATOR 值似乎没有任何效果,因为无论我是否在 BULK INSERT 的 SQL 中包含该值,结果仍然相同。因此,从现在起,随着格式文件终止符的接管,我将把它省略。
但问题仍然存在,对我来说,为什么引用文本中间的单个逗号会导致字段结束,这似乎很奇怪。
Further to this, the FIELDTERMINATOR value doesn't seem to have any effect as the result is still the same whether I include this value in the SQL for BULK INSERT. So I will leave it out from now on as the Format File terminators take over.
The problem still exists though, and it seems very odd to me why a single comma in the middle of quoted text would cause the field to end.