I'm exporting data programmatically from Excel to SQL Server 2005 using SqlBulkCopy. It works great; the only problem is that it doesn't preserve the row sequence I have in the Excel file. I don't have a column to order by. I just want the records to be inserted in the same order they appear in the Excel spreadsheet.
I can't modify the Excel file and have to work with what I've got. Sorting by any of the existing columns will break the sequence.
Please help.
P.S. I ended up inserting an ID column into the spreadsheet; it looks like there's no way to keep the order during export/import.
Comments (4)
I don't think row ordering is specified or guaranteed by SQL unless you use an ORDER BY clause.
From a post by Bill Vaughn (http://betav.com/blog/billva/2008/08/sql_server_indexing_tips_and_t.html):
Another link with info:
http://sqlblogcasts.com/blogs/simons/archive/2007/08/21/What-is-the-position-of-a-row--.aspx
After a lot of research, it seems clear that there is no way to retain row order with the BULK INSERT command as Microsoft provides it. You either have to add an ID column directly to the import file yourself, use a shell or other external script, or do without. It seems like a needed (and easy) feature for Microsoft to add, but after more than a decade of nothing from them, it's not going to happen.
Yet I needed to preserve the actual record order of the import file after importing, because records higher up in the file supersede those lower down when a given column has the same value.
So I went a different route. My constraints were:
I liked the logic of using PowerShell to create ordered INSERT statements for each row and then running them in SQL. That essentially queues each record up for an individual insert rather than a bulk insert. Yes, it would work, but it would also be very slow. I often have files with 500K+ rows in them. I needed something fast.
So I turned to XML. Bulk load the file directly into a single XML variable. This retains the order of the records, because each one is added to the XML in the order it appears in the file. Then parse the XML variable and insert the results into a table, adding an incrementing ID column at the same time.
This assumes that the import file is a standard text file, with each record ending in a carriage return plus line feed (Char(13)+Char(10)).
My approach has 2 steps:
Execute the import SQL statement (using OPENROWSET), encapsulating each record in XML tags. Capture the result in an XML variable.
Parse the variable by its XML tags into a table, adding an incrementing [ID] column.
```sql
-- Holds the whole file as one XML value.
DECLARE @X xml;

-- Step 1: load the file as a single CLOB and wrap each record in <X>...</X>
-- by replacing every CR+LF with a closing tag followed by an opening tag.
SELECT @X = CAST('<X>' + REPLACE([BulkColumn], CHAR(13) + CHAR(10), '</X><X>') + '</X>' AS xml)
FROM OPENROWSET(BULK N'\\FileServer\ImportFolder\ImportFile_20170120.csv', SINGLE_CLOB) T;

-- Step 2: shred the XML into one row per <X> node and number the rows.
SELECT [Record].[X].query('.').value('.', 'varchar(max)') [Record]
      ,ROW_NUMBER() OVER (ORDER BY (SELECT 100)) [ID]
--INTO #TEMP
FROM @X.nodes('X') [Record](X);
```
The XML tags replace each CR+LF line break.
If the file ends with a line break, this adds a blank row at the end. Simply delete that last row.
I wrote this into my procedure using dynamic SQL so I could pass in the file name and set the ID to begin at 1 or at 0 (in case there's a header row).
I was able to run this against a file of 300K records in about 5 seconds.
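To make the wrap-and-number logic easier to follow outside SQL Server, here is a rough Python sketch of the same idea. It is only an illustration, not the procedure described above: the function name `number_records` and the `first_id` parameter are made up for this example, a `<root>` element is added because Python's parser needs a single root, and the escaping step is an extra that the T-SQL does not perform.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def number_records(file_text, first_id=1):
    """Wrap each CRLF-terminated record in <X> tags, parse the XML,
    and number the records in document order."""
    # escape() protects '&' and '<' in the data, which would otherwise make
    # the text invalid XML (an addition; the T-SQL version does not do this).
    body = escape(file_text).replace("\r\n", "</X><X>")
    # ElementTree needs one root element, so the <X> nodes go inside <root>.
    root = ET.fromstring("<root><X>" + body + "</X></root>")
    records = [
        (first_id + offset, node.text or "")
        for offset, node in enumerate(root.findall("X"))
    ]
    # A file that ends with CRLF produces one empty trailing record; drop it.
    if records and records[-1][1] == "":
        records.pop()
    return records


if __name__ == "__main__":
    sample = "alpha,1\r\nbeta & co,2\r\ngamma,3\r\n"
    for record_id, record in number_records(sample):
        print(record_id, record)
```

Passing `first_id=0` mirrors the option of starting the ID at 0 when the file has a header row, so that the first data row gets ID 1.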
You might also be able to define an identity column in your table that auto-increments during data load. That way, you can sort on it later when you want the records in the same order again.
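As a small illustration of the identity-column idea, the sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, purely so it runs anywhere. The table name `import_target` and its columns are hypothetical, SQLite's `AUTOINCREMENT` column plays the role of an IDENTITY column, and rows are inserted one at a time, so it says nothing about how SqlBulkCopy itself assigns identity values.

```python
import sqlite3


def load_with_identity(rows):
    """Insert rows in the given order into a table with an auto-incrementing
    id, then read them back sorted by that id."""
    conn = sqlite3.connect(":memory:")
    # SQLite's AUTOINCREMENT column stands in for a SQL Server IDENTITY column.
    conn.execute(
        "CREATE TABLE import_target ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, city TEXT)"
    )
    # executemany() runs one INSERT per row, in sequence order.
    conn.executemany(
        "INSERT INTO import_target (name, city) VALUES (?, ?)", rows
    )
    conn.commit()
    # Without ORDER BY the database is free to return rows in any order.
    result = conn.execute(
        "SELECT id, name, city FROM import_target ORDER BY id"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    spreadsheet_rows = [("Zed", "Oslo"), ("Adams", "Boston"), ("Moore", "Austin")]
    for row in load_with_identity(spreadsheet_rows):
        print(row)
```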
If you can save the Excel spreadsheet as a CSV, it is very easy to generate a list of INSERT statements with any scripting language, and those statements will be executed in exactly the same order as the rows in the spreadsheet. Here's a quick example in Groovy, but any scripting language will do it just as easily, if not more so:
You can then execute "yourInsertScript.sql" against your database, and your order will be the same as in your spreadsheet.
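The Groovy example did not survive on this page, so here is a rough Python sketch of the same idea. It is not the answerer's code. The function names and the table name `dbo.ImportTarget` are placeholders, and it assumes the first CSV row holds the column names and that every value can be inserted as a string.

```python
import csv
import io


def sql_literal(value):
    """Quote a value as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def generate_inserts(csv_text, table):
    """Return one INSERT statement per CSV data row, in file order.

    The first row is treated as the header and supplies the column names.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Bracket-quote column names, doubling any closing bracket inside a name.
    columns = ", ".join("[" + name.replace("]", "]]") + "]" for name in header)
    statements = []
    for row in reader:
        values = ", ".join(sql_literal(cell) for cell in row)
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements


if __name__ == "__main__":
    sample = "Name,City\r\nO'Brien,Dublin\r\nAdams,Boston\r\n"
    # Writing these lines to a file would give a script like yourInsertScript.sql.
    for statement in generate_inserts(sample, "dbo.ImportTarget"):
        print(statement)
```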