What is the best way to read XML files when there is no standard schema?
I am working on an application that has to read XML files whose set of nodes differs each time. Only a limited set of nodes appears across all the files, but the combination in which they appear keeps changing. The XML files are generated by another system that I cannot control. I am looking into LINQ to XML and XML serialization, but I guess serialization is not an option, since it needs pre-built classes to create objects.
Example XML data
<Employee>
<PersonalInfo>
<FirstName>Vamsi</FirstName>
<LastName>Krishna</LastName>
</PersonalInfo>
<EmploymentInfo>
<Department>
<Id>101</Id>
<Position>SD</Position>
</Department>
</EmploymentInfo>
</Employee>
Another Format
<Employee>
<PersonalInfo>
<FirstName>Vamsi</FirstName>
<LastName>Krishna</LastName>
</PersonalInfo>
</Employee>
You can see that the EmploymentInfo node is completely missing in the second example. The XML data can be presented to the application in many different combinations. I have to read the XML file, validate it, and insert it into a SQL Server database through my C# code.
Comments (2)
I'd say it depends.
If you just want to communicate with another system in a strongly typed way, and you can expect the XML schemas not to change very frequently, you might be OK with XML serialization. Just encapsulate the deserialization in a separate component and write a different version of it for each schema version (yes, you'll need to be able to determine which schema version is currently in use). That is, each version would have its own set of classes targeted by the serializer.
But if you really cannot infer a system from the schemas used by the external app and need a more intelligent parser, you'd be better off using XPath, LINQ to XML, or some other XML-level API to handle the XML manually.
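As a rough illustration of the manual approach with LINQ to XML, the sketch below reads values by path and returns null when any element along the way is missing, so the same code copes with both sample documents from the question. The EmployeeReader class and the ValueAt helper are hypothetical names made up for this example; only XElement.Parse and XElement.Element come from System.Xml.Linq.

```csharp
using System;
using System.Xml.Linq;

public static class EmployeeReader
{
    // Hypothetical helper: walks the child-element path below root and
    // returns the trimmed text of the last element, or null if any step is missing.
    public static string ValueAt(XElement root, params string[] path)
    {
        XElement current = root;
        foreach (string name in path)
        {
            if (current == null) return null;
            current = current.Element(name); // null when the child does not exist
        }
        return current?.Value.Trim();
    }

    public static void Main()
    {
        // Second sample from the question: no EmploymentInfo node at all.
        const string xml =
            "<Employee><PersonalInfo><FirstName>Vamsi</FirstName>" +
            "<LastName>Krishna</LastName></PersonalInfo></Employee>";

        XElement employee = XElement.Parse(xml);

        string firstName = ValueAt(employee, "PersonalInfo", "FirstName");
        string departmentId = ValueAt(employee, "EmploymentInfo", "Department", "Id");

        Console.WriteLine(firstName);
        Console.WriteLine(departmentId ?? "(no department)");
    }
}
```

A missing node simply shows up as null, which the validation step can then accept or reject before anything is written to the database.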
BTW, both of your samples are pretty easy for XmlSerializer. In the second case it will just set EmploymentInfo to null.
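To illustrate the point about XmlSerializer, here is a minimal sketch. The class names and property layout are assumptions derived from the sample XML in the question, not classes from any real system.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Assumed model, shaped after the sample XML in the question.
[XmlRoot("Employee")]
public class Employee
{
    public PersonalInfo PersonalInfo { get; set; }
    public EmploymentInfo EmploymentInfo { get; set; } // stays null when the element is absent
}

public class PersonalInfo
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class EmploymentInfo
{
    public Department Department { get; set; }
}

public class Department
{
    public int Id { get; set; }
    public string Position { get; set; }
}

public static class SerializerDemo
{
    // Deserializes one Employee document from a string.
    public static Employee Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Employee));
        using (var reader = new StringReader(xml))
        {
            return (Employee)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Employee partial = Load(
            "<Employee><PersonalInfo><FirstName>Vamsi</FirstName>" +
            "<LastName>Krishna</LastName></PersonalInfo></Employee>");

        Console.WriteLine(partial.PersonalInfo.FirstName);
        Console.WriteLine(partial.EmploymentInfo == null
            ? "EmploymentInfo missing"
            : "EmploymentInfo present");
    }
}
```

If several schema versions have to be supported, each version could get its own set of classes like these, with a small component choosing which one to deserialize into.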
You could write a parser class that uses the .NET XPath implementation. The parser should test for the presence of specific child nodes before processing the data.
Visit MSDN for the complete syntax.
Update
A small example of how I would solve the problem. First, some model classes to hold the data.
Then your "parser".
In your production code you can use this parser class to get a model of your XML data.
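The listings from this answer are not included here, so the following is only a sketch of what such model classes and an XPath-based parser might look like, not the original code. Employee, Department, EmployeeParser, and the rule that PersonalInfo is mandatory are all assumptions made for illustration; XPathDocument and XPathNavigator.SelectSingleNode are from System.Xml.XPath.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// Hypothetical model classes that hold the parsed data.
public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Department Department { get; set; } // null when EmploymentInfo/Department is absent
}

public class Department
{
    public int Id { get; set; }
    public string Position { get; set; }
}

// Hypothetical parser: tests for each node before reading it.
public class EmployeeParser
{
    public Employee Parse(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            XPathNavigator root = new XPathDocument(reader).CreateNavigator();

            // Assumption: PersonalInfo is required in every file.
            XPathNavigator person = root.SelectSingleNode("/Employee/PersonalInfo");
            if (person == null)
                throw new FormatException("PersonalInfo is required.");

            var employee = new Employee
            {
                FirstName = Text(person, "FirstName"),
                LastName = Text(person, "LastName")
            };

            // Optional part: only processed when the node exists.
            XPathNavigator dept = root.SelectSingleNode("/Employee/EmploymentInfo/Department");
            if (dept != null)
            {
                int id;
                if (!int.TryParse(Text(dept, "Id"), out id))
                    throw new FormatException("Department/Id must be an integer.");
                employee.Department = new Department { Id = id, Position = Text(dept, "Position") };
            }
            return employee;
        }
    }

    // Returns the trimmed text of a child element, or null if it does not exist.
    private static string Text(XPathNavigator parent, string childName)
    {
        XPathNavigator child = parent.SelectSingleNode(childName);
        return child == null ? null : child.Value.Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        var parser = new EmployeeParser();
        Employee partial = parser.Parse(
            "<Employee><PersonalInfo><FirstName>Vamsi</FirstName>" +
            "<LastName>Krishna</LastName></PersonalInfo></Employee>");

        Console.WriteLine(partial.FirstName + " " + partial.LastName);
        Console.WriteLine(partial.Department == null ? "no department" : partial.Department.Position);
    }
}
```

The resulting Employee object can then be validated and mapped to SQL parameters in a separate step, which keeps the XML handling in one place.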