What is the best way to read XML files when there is no standard schema?

Posted on 2024-12-18 13:38:55

I am working on an application in which I have to read XML files that have a different set of nodes each time. Only a fixed set of nodes can appear across all the files, but the combination in which they appear keeps changing. The XML files are generated by another system that I cannot control. I am looking into LINQ to XML and XML serialization, but I guess serialization is not an option, since it needs pre-built classes to create objects.

Example XML data

<Employee>
  <PersonalInfo>
    <FirstName>Vamsi</FirstName>
    <LastName>Krishna</LastName>
  </PersonalInfo>
  <EmploymentInfo>
    <Department>
      <Id>101</Id>
      <Position>SD</Position>
    </Department>
  </EmploymentInfo>
</Employee>

Another format

<Employee>
  <PersonalInfo>
    <FirstName>Vamsi</FirstName>
    <LastName>Krishna</LastName>
  </PersonalInfo>      
</Employee>

You can see that the EmploymentInfo node is completely missing in the second example. There are many combinations in which the XML data can be presented to the application. I have to read the XML file, validate it, and insert it into a SQL Server database through my C# code.

Comments (2)

[浮城] 2024-12-25 13:38:55

I'd say it depends.

If you just want to communicate with another system in a strongly-typed way, and you can expect the XML schemas not to change very frequently, you might be OK with XML serialization. Just encapsulate the deserialization in a separate component and write a different version of it for each schema (yes, you'll need to be able to determine which schema version is currently in use). I mean, each version would have its own set of classes that are targeted by the serializer.

But if you really cannot infer a system from the schemas used by the external app and need some intelligent parser, you'd better use XPath, LINQ to XML, or some other XML-level API to handle the XML manually.

BTW, both of your samples are pretty easy for the XmlSerializer. In the second case it will just set EmploymentInfo to null.
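To illustrate the last point, here is a minimal sketch of what such classes could look like. The class and property names simply mirror the element names in the question's samples, and the `EmployeeReader` helper is a made-up name for illustration, not something from the question:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes mirroring the sample XML from the question.
[XmlRoot("Employee")]
public class Employee
{
    public PersonalInfo PersonalInfo { get; set; }

    // Left as null when the <EmploymentInfo> element is absent.
    public EmploymentInfo EmploymentInfo { get; set; }
}

public class PersonalInfo
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class EmploymentInfo
{
    public Department Department { get; set; }
}

public class Department
{
    public int Id { get; set; }
    public string Position { get; set; }
}

// Illustrative helper (assumed name) wrapping the deserialization call.
public static class EmployeeReader
{
    public static Employee Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Employee));
        using (var reader = new StringReader(xml))
        {
            return (Employee)serializer.Deserialize(reader);
        }
    }
}
```

With the second sample, `EmployeeReader.Read(xml).EmploymentInfo` should come back as null, so the calling code only needs a null check before touching the department data.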

生生漫 2024-12-25 13:38:55

You could write a parser class which uses the .NET XPath implementation. The parser should check whether specific child elements are present before processing the data.

Visit MSDN for the complete syntax.
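A rough sketch of that "test before processing" idea, assuming the element paths from the question's samples. `EmployeeXPath`, `Load`, and `ValueOrNull` are hypothetical names chosen for this illustration:

```csharp
using System;
using System.Xml;

// Illustrative helper (assumed names): look a node up by XPath
// and report its absence as null instead of failing.
public static class EmployeeXPath
{
    public static XmlDocument Load(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    // Returns the text of the first node matching the XPath expression,
    // or null when no such node exists in the document.
    public static string ValueOrNull(XmlDocument doc, string xpath)
    {
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

For example, `EmployeeXPath.ValueOrNull(doc, "/Employee/EmploymentInfo/Department/Id")` yields the department id for the first sample and null for the second, so the parser can decide per node whether there is anything to process.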

Update

A little example of what I would do to solve the problem. First, some model classes to hold the data:

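using System.Collections.Generic;

public class PersonalInfo
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    // more properties
}

public class EmploymentInfo
{
    // properties for the department data, e.g. id and position
}

public class EmployeeModel
{
    // use a single PersonalInfo instead of a List<> if there is always just one PersonalInfo child element
    // (the property names below are illustrative)
    public List<PersonalInfo> PersonalInfos { get; set; }
    public List<EmploymentInfo> EmploymentInfos { get; set; }
    // more properties
}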

Now your "Parser":

public class MyParser
{
    // load the XML string or XML file in the constructor
    public MyParser(string xmlSource) { /* .. */ }

    public EmployeeModel GetEmployeeModel()
    {
        var result = new EmployeeModel();
        // use whatever you want to select nodes from your XML
        // and set the data on result

        return result;
    }
}

In your production code you can use this parser class to get a model of your XML data.
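As one possible way to fill in such a parser, here is a self-contained sketch that uses LINQ to XML instead of XPath. Everything in it is an assumption made for illustration: the `EmployeeRecord` and `EmployeeXmlParser` names, the flattened model, and the choice to treat every element as optional:

```csharp
using System;
using System.Xml.Linq;

// Hypothetical flat model; nullable members represent optional nodes.
public class EmployeeRecord
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int? DepartmentId { get; set; }
    public string Position { get; set; }
}

// Illustrative parser (assumed name) that tolerates missing elements.
public class EmployeeXmlParser
{
    private readonly XElement _root;

    public EmployeeXmlParser(string xml)
    {
        _root = XDocument.Parse(xml).Root;
    }

    public EmployeeRecord GetEmployee()
    {
        // Element() returns null when the child does not exist,
        // and ?. carries that null through the rest of the lookup.
        XElement personal = _root.Element("PersonalInfo");
        XElement department = _root.Element("EmploymentInfo")?.Element("Department");

        return new EmployeeRecord
        {
            // The explicit XElement conversions return null for a null element.
            FirstName = (string)personal?.Element("FirstName"),
            LastName = (string)personal?.Element("LastName"),
            DepartmentId = (int?)department?.Element("Id"),
            Position = (string)department?.Element("Position"),
        };
    }
}
```

Because missing nodes end up as null values on the record, validation and the database insert can work from one object regardless of which combination of nodes the other system sent.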
