是否有可用于处理(读取)Excel 2003 XML 文档的库

发布于 2024-11-01 10:01:20 字数 458 浏览 2 评论 0原文

我们遇到的情况是,Excel 的 XML 2003 格式广泛用于将文件上传到第三方系统。需要从这些文件中提取一些信息以进行其他处理。是否有任何库可以处理 Excel 2003 XML 格式?

进一步详细信息:这些 XML 文件的格式已由第三方定义,我们无法更改。此外,数据本身不是表格形式的,也不是任何一致的格式。

我以前曾使用 System.Xml.Linq 从这些 XML 电子表格的非常简单的版本中提取数据(基本上是行、列索引方法)。现在所需的数据来自更复杂的版本,其中包括合并单元格、命名范围等。

此处理是在服务器上完成的,因此 Interop 不是基于此的选项 有关服务器上 Excel 的臭名昭著的 MS 知识库

有人还可以建议一种处理这些文件的方法吗?

We have a situation where Excel's XML 2003 format is widely used for file uploads to a third party system. There is a requirement that some information needs to be extracted from these files for other processing. Are there any libraries available that process Excel's 2003 XML format?

Further detail: the format of these XML files have been defined by the third party and we cannot change it. In addition, the data itself is not tabular or in any consistent format.

I have previously used System.Xml.Linq to extract data (basically a row, column index approach) from very simple versions of these XML spreadsheets. The data required now is from more complex versions which include merged cells, named ranges etc.

This processing is done on a server, thus Interop is not an option based on this infamous MS KB about Excel on a server

Can someone also suggest an approach to processing these files?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

空名 2024-11-08 10:01:20

最终的解决方案要求我创建一个 XSLT 以从文件中提取所需的数据,并将其转换为非常简单的数据 XML 表示形式。我还创建了一些类,这些类表示使用 XmlSerializer.Deserialize() 反序列化生成的 XML 的对象模型。

然而,为了使其有效工作,需要对源 Excel XML 文件进行更新,以包含需要提取的单元格的命名范围。使用名称范围可以实现更简单的 XSLT,但最大的代码味道是依赖于我无法控制的文件中命名范围的存在。

代码的高级概述

   XPathDocument doc = new XPathDocument("path to Excel xml file");
   XslCompiledTransform xslt = new XslCompiledTransform();

   StringReader sr = new StringReader(Resources.XSLT); // embedded resource, the xslt is read in as a string
   XmlTextReader xs = new XmlTextReader(sr);
   xslt.Load(xs);
   XmlWriterSettings settings = new XmlWriterSettings()
                                    {
                                        Indent = true,
                                        Encoding = Encoding.UTF8,
                                        OmitXmlDeclaration = false
                                    };

   MemoryStream memStream = new MemoryStream();
   using (XmlWriter writer = XmlWriter.Create(memStream, settings))
   {
       xslt.Transform(doc, writer); // the simple xml..almost there 
   }

   MyCustomClass curve;
   {
       XmlSerializer deSerializer = new XmlSerializer(typeof(MyCustomClass));
       // reset needed to beginning of mem stream since current position is the last write position
       memStream.Position = 0; 
       curve = (MyCustomClass)deSerializer.Deserialize(memStream);
    }

The eventual solution required me to create an XSLT to extract the required data from the file and transform to a very simple XML representation of the data. I also created classes that represented the object model for the deserialzation of the generated XML using XmlSerializer.Deserialize().

However, for this to work effectively, an update was required to the source Excel XML files to include Named Ranges for the cells that needed to be extracted.Having the Name Ranges allowed for a more simpler XSLT, however the biggest code-smell is the dependency on the existence of the Named Ranges in a file I don't control.

A high level overview of the code

   XPathDocument doc = new XPathDocument("path to Excel xml file");
   XslCompiledTransform xslt = new XslCompiledTransform();

   StringReader sr = new StringReader(Resources.XSLT); // embedded resource, the xslt is read in as a string
   XmlTextReader xs = new XmlTextReader(sr);
   xslt.Load(xs);
   XmlWriterSettings settings = new XmlWriterSettings()
                                    {
                                        Indent = true,
                                        Encoding = Encoding.UTF8,
                                        OmitXmlDeclaration = false
                                    };

   MemoryStream memStream = new MemoryStream();
   using (XmlWriter writer = XmlWriter.Create(memStream, settings))
   {
       xslt.Transform(doc, writer); // the simple xml..almost there 
   }

   MyCustomClass curve;
   {
       XmlSerializer deSerializer = new XmlSerializer(typeof(MyCustomClass));
       // reset needed to beginning of mem stream since current position is the last write position
       memStream.Position = 0; 
       curve = (MyCustomClass)deSerializer.Deserialize(memStream);
    }
给我一枪 2024-11-08 10:01:20

您是否考虑过使用 xsd.exe 您的工作室是否有工具来生成类以便轻松读取 xml 文件?

当然,它不会包含任何用于组合字段的智能逻辑,但好处是您不需要构建自己的阅读逻辑或安装互操作库。

Have you considered using the xsd.exe tool with your studio to generate classes for easily reading the xml files?

It won't include any smart logic for combining fields of course-- but the benefit is that you won't need to build your own reading logic or install interop libraries.

空城缀染半城烟沙 2024-11-08 10:01:20

检查 CodePlex 上的 Excel 数据阅读器是否满足您的要求。我使用它将简单的数据列表从 Excel 导入到我们的一个应用程序中。

Check if Excel Data Reader on CodePlex does what you require. I'm using it to import simple lists of data from Excel into one of our applications.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文