Detect LINQ inserts before SubmitChanges and use BulkInsert instead
I've got a bunch of code that is set up to add 10k to 20k inserts to a LINQ to SQL database, in addition to a few updates/deletes. All of these changes are committed through a single db.SubmitChanges() call. The problem is that it's too slow :( The performance has been thoroughly tested in this great post on Stack Overflow:
Very slow insert process using Linq to Sql
The issue is that I do not want to rewrite all of the code that prepares my DataContext. I've attempted to pseudocode what I'd like to do below.
Disclaimer: This is not complete code! It won't compile as-is... but it sketches the idea :)
OLD WAY:
SchedDB.Data.SchedulerDBDataContext db = new SchedDB.Data.SchedulerDBDataContext();
//Do a bunch of stuff to populate "db"
db.SubmitChanges();
NEW WAY:
SchedDB.Data.SchedulerDBDataContext db = new SchedDB.Data.SchedulerDBDataContext();
//Do a bunch of stuff to populate "db"

//NEW: Detect all of the pending inserts in the "db" object, remove them,
//and insert them with a batched data adapter instead. For example:

DataTable dtProducts = new DataTable();
dtProducts.Columns.Add("ProductID", typeof(int)); //yada yada, all of the columns here

DataTable dtCustomers = new DataTable();
dtCustomers.Columns.Add("CustomerID", typeof(int)); //yada yada, all of the columns here

//GetChangeSet() is the real LINQ to SQL API for this; the casting below is where I need help
foreach (object insertItem in db.GetChangeSet().Inserts)
{
    Product product = insertItem as Product;
    if (product != null)
    {
        //Each pending insert is typed as its own entity, so cast before reading columns
        DataRow dr = dtProducts.NewRow();
        dr["ProductID"] = product.ProductID; //copy the rest of the Product columns the same way
        dtProducts.Rows.Add(dr);
        //IMPORTANT: cancel the pending insert so SubmitChanges() doesn't insert it anymore
        db.Products.DeleteOnSubmit(product);
    }

    Customer customer = insertItem as Customer;
    if (customer != null)
    {
        //similar code: copy all customer columns into dtCustomers, then DeleteOnSubmit, yada yada
    }
}

//Make a SqlDataAdapter for each DataTable, set its UpdateBatchSize,
//and call Update() to commit each table's inserts (see the sketch below).

db.SubmitChanges(); //If there are any updates or deletes they are still in here and still need to happen.
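To make the "batch data adapter" step concrete, here is a minimal sketch for the Products table. It assumes just the ProductID column from the pseudocode above and a hypothetical connection string; real code would map every column and its parameter:

using System.Data;
using System.Data.SqlClient;

//Sketch only: ProductID is the lone column carried over from the pseudocode above.
static void BatchInsertProducts(DataTable dtProducts, string connectionString)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlDataAdapter adapter = new SqlDataAdapter())
    {
        SqlCommand insert = new SqlCommand(
            "INSERT INTO Products (ProductID) VALUES (@ProductID)", conn);
        insert.Parameters.Add("@ProductID", SqlDbType.Int, 0, "ProductID");
        insert.UpdatedRowSource = UpdateRowSource.None; //required for batched updates

        adapter.InsertCommand = insert;
        adapter.UpdateBatchSize = 1000; //rows per round trip; 0 = largest batch the server allows

        conn.Open();
        adapter.Update(dtProducts); //issues the INSERTs in batches
    }
}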
If there are any errors in the mass inserts then it would be nice to know that, so I can roll back the LINQ changes and potentially run some other code to clear out the data adapters' inserts. (I can handle this; all of my inserts have an extra int column that is unique to the batch insert.)
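One way to get that all-or-nothing behavior is to wrap both the batched inserts and SubmitChanges() in a System.Transactions.TransactionScope. A minimal sketch, with the adapters and tables from the pseudocode above as hypothetical stand-ins:

using System.Data;
using System.Data.SqlClient;
using System.Transactions;

//Sketch only: the adapters and tables are the hypothetical ones built above.
//Connections opened inside the scope auto-enlist; note that using several
//connections can escalate the transaction to MSDTC.
static void CommitAll(SchedDB.Data.SchedulerDBDataContext db,
                      SqlDataAdapter adapterProducts, DataTable dtProducts,
                      SqlDataAdapter adapterCustomers, DataTable dtCustomers)
{
    using (TransactionScope scope = new TransactionScope())
    {
        adapterProducts.Update(dtProducts);  //batched inserts
        adapterCustomers.Update(dtCustomers);

        db.SubmitChanges();                  //remaining updates/deletes

        //Only reached if nothing threw; disposing the scope without
        //calling Complete() rolls back every statement above.
        scope.Complete();
    }
}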
2 Answers
OK, I found the answer to my question. A sweet class you can download from here:
http://code.msdn.microsoft.com/LinqEntityDataReader
This class is perfect for me; I can just pass a collection of "Customer" objects generated by LINQ to SqlBulkCopy!
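A rough sketch of how that wires up, assuming the sample exposes an AsDataReader() extension method that wraps an IEnumerable<T> in an IDataReader (check the exact name in the download):

using System.Collections.Generic;
using System.Data.SqlClient;
//plus the EntityDataReader source from the sample linked above

//Sketch only: AsDataReader() is assumed to expose one reader column
//per mapped property of the Customer entity.
static void BulkInsertCustomers(IEnumerable<Customer> customers, string connectionString)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "Customers";
        bulk.BatchSize = 5000; //tune for the workload
        conn.Open();
        bulk.WriteToServer(customers.AsDataReader());
    }
}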
Instead of doing all of this, why not just insert db.SubmitChanges(); calls after each individual item, or after blocks of X items? That's not ideal either, but it's better than submitting massive changes all at once, and it's probably a very easy change to make in the code.
Feed the DataContext moderately sized updates, and it doesn't actually do a bad job of getting them done for you. Then move on to more productive things.
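A minimal sketch of that chunking idea; "items", AddItemToContext(), and the block size of 500 are all hypothetical stand-ins for whatever the real population code does per item:

//Sketch: commit pending work every N items instead of all at once.
const int blockSize = 500; //illustrative; tune for the workload

int pending = 0;
foreach (var item in items)
{
    AddItemToContext(db, item); //stand-in for the per-item population code
    pending++;
    if (pending % blockSize == 0)
    {
        db.SubmitChanges(); //flush this block before starting the next one
    }
}
db.SubmitChanges(); //flush the final partial block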