Open XML SDK v2.0 删除 20,000 行中的第一行时出现性能问题行 Excel 文件
使用 OpenXML SDK v2.0 删除 20,000 多行 Excel 文件中的第一行时,有人遇到性能问题吗?
我正在使用 Open XML SDK 文档中建议的删除行编码。使用 Open XML SDK 删除第一行需要几分钟的时间,但在 Excel 应用程序中只需要一秒钟。
我最终发现瓶颈实际上是处理行删除时的冒泡方法。删除行后有许多行更新。因此,就我而言,大约有 20,000 行需要更新,逐行上移数据。
我想知道是否有更快的方法来删除行。
有人有想法吗?
Do anyone come across a performance issue when deleting a first row in a 20,000+ rows Excel file using OpenXML SDK v2.0?
I am using the delete row coding suggested in the Open XML SDK document. It takes me several minutes just to delete the first row using Open XML SDK, but it only takes just a second in Excel applicaton.
I eventually found out that the bottle-neck is actually on the bubble-up approach in dealing with row deletion. There are many rows updating after the deleted row. So in my case, there are around 20,000 rows to be updated, shifting up the data row by row.
I wonder if there is any faster way to do the row deletion.
Do anybody have an idea?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
好吧,坏消息是:是的,就是这样。
将 SDK 本身移到 System.IO.Packaging 中并在 Linq 中创建一个 IEnumerable/List 可能会获得更好的性能所有行的 to-XML,将其复制到新的
IEnumerable
/List
(不含第一行),重写的|
是它在索引中的位置,并将其写回到r
属性
内覆盖现有的子项。您需要对 sharedStrings.xml 文件中的任何字符串执行相同的操作 - 即删除其中的
.
元素被删除的行,但在这种情况下,它们现在已隐式索引,因此您可以直接删除它们。Well, the bad news here is: yep, that's the way it is.
You may get slightly better performance moving outside of the SDK itself into
System.IO.Packaging
and just creating anIEnumerable
/List
in like Linq-to-XML of all the rows, copy that to a newIEnumerable
/List
without the first row, rewrite ther
attribute of<row r="?"/>
to be it's place in the index, and the write that back inside<sheetData/>
over existing children.You'd need to kind of do the same for any strings in the sharedStrings.xml file - i.e. removing the
<ssi>.<si>
elements that were in the row that was deleted, but in this case they are now implicitly indexed, so you'd be able to get away with just outright removing them.解压缩文件、对其进行操作并重新打包的方法非常容易出错。
怎么样:如果您说它在 Excel 中运行良好:您尝试过使用 Interop 吗?这将启动一个新的 Excel 实例(可见或不可见),然后您可以打开文件、删除该行、保存并再次关闭应用程序。
Range 对象适用于多种用途。也用于删除元素。看一下:MSDN 范围描述< /a>.还有一个提示:Interop 使用 Excel,因此所有对象都必须使用从 1 开始的索引来寻址!
如需更多资源,请查看此 StackOverflow-thread。
The approach of unzipping the file, manipulating it and repacking it is very error-prune.
How about this: If you say, that it works fine in Excel: Have you tried to use the Interop? This starts a new instance of Excel (either visible or invisible), then you can open the File, delete the line, save and close the application again.
The Range-object is qualified for many purposes. Also for deleting elements. Have a look at: MSDN Range-Description. One more hint: Interop uses Excel, so all Objects have to be adressed with a 1-based index!
For more resources take a look at this StackOverflow-thread.