是否有可用于处理(读取)Excel 2003 XML 文档的库
我们遇到的情况是,Excel 的 XML 2003 格式广泛用于将文件上传到第三方系统。需要从这些文件中提取一些信息以进行其他处理。是否有任何库可以处理 Excel 2003 XML 格式?
进一步详细信息:这些 XML 文件的格式已由第三方定义,我们无法更改。此外,数据本身不是表格形式的,也不是任何一致的格式。
我以前曾使用 System.Xml.Linq 从这些 XML 电子表格的非常简单的版本中提取数据(基本上是行、列索引方法)。现在所需的数据来自更复杂的版本,其中包括合并单元格、命名范围等。
此处理是在服务器上完成的,因此 Interop 不是基于此的选项 有关服务器上 Excel 的臭名昭著的 MS 知识库
有人还可以建议一种处理这些文件的方法吗?
We have a situation where Excel's XML 2003 format is widely used for file uploads to a third party system. There is a requirement that some information needs to be extracted from these files for other processing. Are there any libraries available that process Excel's 2003 XML format?
Further detail: the format of these XML files have been defined by the third party and we cannot change it. In addition, the data itself is not tabular or in any consistent format.
I have previously used System.Xml.Linq
to extract data (basically a row, column index approach) from very simple versions of these XML spreadsheets. The data required now is from more complex versions which include merged cells, named ranges etc.
This processing is done on a server, thus Interop is not an option based on this infamous MS KB about Excel on a server
Can someone also suggest an approach to processing these files?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
最终的解决方案要求我创建一个 XSLT 以从文件中提取所需的数据,并将其转换为非常简单的数据 XML 表示形式。我还创建了一些类,这些类表示使用
XmlSerializer.Deserialize()
反序列化生成的 XML 的对象模型。然而,为了使其有效工作,需要对源 Excel XML 文件进行更新,以包含需要提取的单元格的命名范围。使用名称范围可以实现更简单的 XSLT,但最大的代码味道是依赖于我无法控制的文件中命名范围的存在。
代码的高级概述
The eventual solution required me to create an XSLT to extract the required data from the file and transform to a very simple XML representation of the data. I also created classes that represented the object model for the deserialzation of the generated XML using
XmlSerializer.Deserialize()
.However, for this to work effectively, an update was required to the source Excel XML files to include Named Ranges for the cells that needed to be extracted.Having the Name Ranges allowed for a more simpler XSLT, however the biggest code-smell is the dependency on the existence of the Named Ranges in a file I don't control.
A high level overview of the code
您是否考虑过使用 xsd.exe 您的工作室是否有工具来生成类以便轻松读取 xml 文件?
当然,它不会包含任何用于组合字段的智能逻辑,但好处是您不需要构建自己的阅读逻辑或安装互操作库。
Have you considered using the xsd.exe tool with your studio to generate classes for easily reading the xml files?
It won't include any smart logic for combining fields of course-- but the benefit is that you won't need to build your own reading logic or install interop libraries.
检查 CodePlex 上的 Excel 数据阅读器是否满足您的要求。我使用它将简单的数据列表从 Excel 导入到我们的一个应用程序中。
Check if Excel Data Reader on CodePlex does what you require. I'm using it to import simple lists of data from Excel into one of our applications.