Faster SQL inserts?
I'm dealing with chunks of data that are 50k rows each.
I'm inserting them into an SQL database using LINQ:
for (int i = 0; i < 50000; i++)
{
    DB.TableName.InsertOnSubmit
    (
        new TableName
        {
            Value1 = Array[i, 0],
            Value2 = Array[i, 1]
        }
    );
}
DB.SubmitChanges();
This takes about 6 minutes, and I want it to take much less if possible. Any suggestions?
3 Answers
There are a lot of things you need to check/do.
How much disk space is allocated to the database? Is there enough free space to do all of the inserts without it auto-growing? If not, increase the database file size, because otherwise it has to stop every so many inserts to auto-resize the db itself.
Do NOT do individual inserts. They take way too long. Instead, use table-valued parameters (SQL 2008), SqlBulkCopy, or a single insert statement (in that order of preference).
Drop any indexes on that table before the load and recreate them after. With that many inserts they are probably going to be fragmented to hell anyway.
If you have any triggers, consider dropping them until the load is complete.
Do you have enough RAM available on the database server? Check the server itself to see if it's consuming ALL the available RAM. If so, you might consider a reboot prior to the load... SQL Server has a tendency to consume and hold on to everything it can get its hands on.
Along the RAM lines, we like to keep enough RAM in the server to hold the entire database in memory. I'm not sure if that is feasible for you or not.
How is its disk speed? Is the queue depth pretty long? Other than hardware replacement there's not much to be done here.
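The "drop indexes before the load" step can be done without fully dropping them by disabling and rebuilding instead. A minimal T-SQL sketch, assuming a hypothetical nonclustered index named IX_TableName_Value1 (do not disable the clustered index, or the table becomes unreadable):

```
-- Disable the nonclustered index so inserts skip index maintenance
ALTER INDEX IX_TableName_Value1 ON dbo.TableName DISABLE;

-- ... perform the bulk load here ...

-- Rebuild re-enables the index and defragments it in one step
ALTER INDEX IX_TableName_Value1 ON dbo.TableName REBUILD;
```

Rebuilding afterward also addresses the fragmentation concern mentioned above, since a rebuild produces a freshly packed index.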
As you are doing a simple insert and not gaining much from the use of LinqToSql, have a look at SqlBulkCopy; it will remove most of the round trips and reduce the overhead on the SQL Server side as well. You will have to make very few coding changes to use it.
Also look at pre-sorting your data by the column the table is indexed on, as this will lead to better cache hits when SQL Server is updating the table.
Also consider whether you should upload the data to an unindexed temp staging table, then use a stored proc to insert into the main table with a single SQL statement. That may let SQL Server spread the indexing work over all your CPUs.
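A minimal sketch of the SqlBulkCopy approach, reusing the question's 2D Array and table/column names; connectionString is a placeholder for your own connection string:

```
using System.Data;
using System.Data.SqlClient;

// Stage the rows in a DataTable matching the destination schema
var table = new DataTable();
table.Columns.Add("Value1", typeof(string));
table.Columns.Add("Value2", typeof(string));
for (int i = 0; i < 50000; i++)
    table.Rows.Add(Array[i, 0], Array[i, 1]);

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.TableName";
    bulk.BatchSize = 5000;   // commit in batches rather than one huge transaction
    bulk.WriteToServer(table);
}
```

This replaces 50k individual INSERT round trips with a single bulk stream, which is where most of the 6 minutes is likely going.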
If you are reading from a file, you'd be better off using BULK INSERT (Transact-SQL); and if you are writing that much (50K rows) at one time from memory, you might be better off writing to a flat file first and then using BULK INSERT on that file.
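A minimal BULK INSERT sketch, assuming a hypothetical comma-delimited file path; the path, delimiters, and table name are placeholders for your own setup:

```
-- TABLOCK takes a bulk-update lock, enabling minimally logged inserts
BULK INSERT dbo.TableName
FROM 'C:\data\chunk.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK
);
```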