Parsing XML with C#

Posted on 2024-10-17 16:00:41

I have a machine-generated XML file, which I uploaded here: http://dl.dropbox.com/u/10773282/2011/result.xml . You may need an XML viewer or editor to read it.

I use this C# code to get the elements in CoverageDSPriv/Module/*.

using System;
using System.Xml;
using System.Xml.Linq;

namespace HIR {
  class Dummy {

    static void Main(String[] argv) {

      XDocument doc = XDocument.Load("result.xml");

      var coveragePriv = doc.Descendants("CoverageDSPriv"); //.First();
      var cons = coveragePriv.Elements("Module");

      foreach (var con in cons)
      {
        var id = con.Value;
        Console.WriteLine(id);
      }
    }
  }
}

Running the code, I get this result.

hello.exe6144008016161810hello.exehello.exehello.exe81061hello.exehello.exe!17main_main40030170170010180180011190190012200200013hello.exe!107testfunctiontestfunction(int)40131505001460600158080216120120017140140018AA

I expect to get

hello.exe
61440
...

However, I get just one long line.

  • Q1: What might be wrong?
  • Q2: How do I get the number of elements in cons? I tried cons.Count, but it doesn't work.
  • Q3: To get the nested value of <CoverageDSPriv><Module><ModuleName>, I use this code:

    var coveragePriv = doc.Descendants("CoverageDSPriv"); //.First();
    var cons = coveragePriv.Elements("Module").Elements("ModuleName");

I can live with this, but if the elements are deeply nested, I would like a more direct way to get at them. Is there another way to do that?

ADDED

var cons = coveragePriv.Elements("Module").Elements();

solves this issue, but for NamespaceTable it again prints all the elements on one line.

hello.exe
61440
0
8
0
1
6
1
61810hello.exehello.exehello.exe81061hello.exehello.exe!17main_main40030170170010180180011190190012200200013hello.exe!107testfunctiontestfunction(int)40131505001460600158080216120120017140140018

Alternatively, LINQ to XML may be a better solution, as suggested in this post.

Comments (2)

得不到的就毁灭 2024-10-24 16:00:42

It looks to me like you only have one element named Module, so .Value simply returns the concatenated text of that entire element. Did you intend this instead?

// Assumes coveragePriv is a single XElement, i.e. the commented-out .First() is restored.
coveragePriv.Element("Module").Elements();

This would return all the child elements of the Module element, which seems to be what you're after.
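
To make the difference concrete, here is a minimal sketch run against a small hypothetical stand-in for result.xml. Only CoverageDSPriv, Module, ModuleName and NamespaceTable are names from this thread; ImageSize, NamespaceName and BlocksCovered are assumptions made for illustration. It also touches on Q2: IEnumerable<T> has no Count property, so the LINQ extension method Count() (from System.Linq) is the usual way to count the elements.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ValueVersusElements
{
    // Hypothetical stand-in for result.xml (element names other than
    // CoverageDSPriv, Module, ModuleName and NamespaceTable are assumed).
    const string SampleXml =
        "<CoverageDSPriv>" +
          "<Module>" +
            "<ModuleName>hello.exe</ModuleName>" +
            "<ImageSize>61440</ImageSize>" +
            "<NamespaceTable>" +
              "<NamespaceName>hello.exe</NamespaceName>" +
              "<BlocksCovered>8</BlocksCovered>" +
            "</NamespaceTable>" +
          "</Module>" +
        "</CoverageDSPriv>";

    static XElement LoadModule() =>
        XDocument.Parse(SampleXml)
                 .Descendants("CoverageDSPriv")
                 .Elements("Module")
                 .First();

    // Value on a container element concatenates the text of all descendants.
    public static string ConcatenatedValue() => LoadModule().Value;

    // Count() is an extension method, so it needs parentheses and System.Linq.
    public static int ChildCount() => LoadModule().Elements().Count();

    public static void Main()
    {
        Console.WriteLine(ConcatenatedValue()); // hello.exe61440hello.exe8
        Console.WriteLine(ChildCount());        // 3

        // Iterating the direct children gives one line per child, but a child
        // that has children of its own (NamespaceTable) is still concatenated.
        foreach (XElement child in LoadModule().Elements())
            Console.WriteLine($"{child.Name}: {child.Value}");
    }
}
```

The last loop prints NamespaceTable as a single concatenated line, which matches the behaviour described in the ADDED part of the question.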

Update:

<NamespaceTable> is a child of <Module> but you appear to want to handle it similarly to <Module> in that you want to write out each child element. Thus, one brute-force approach would be to add another loop for <NamespaceTable>:

foreach (var con in cons)
{
    if (con.Name == "NamespaceTable") 
    {
        foreach (var nsElement in con.Elements()) 
        {
            var nsId = nsElement.Value;
            Console.WriteLine(nsId);
        }
    }
    else
    {
        var id = con.Value;
        Console.WriteLine(id);
    }
}

Alternatively, perhaps you'd rather just flatten them altogether via .Descendants():

var cons = coveragePriv.Element("Module").Descendants();

foreach (var con in cons)
{
    var id = con.Value;
    Console.WriteLine(id);
}
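
One caveat with walking all descendants: container elements such as NamespaceTable are returned too, and their Value is still the concatenated text of their children. A possible refinement, sketched below on a hypothetical stand-in for result.xml (ImageSize, NamespaceName and BlocksCovered are assumed names), is to keep only elements without child elements. The same sketch shows two ways to reach a deeply nested element directly, as asked in Q3: Descendants(name) searches at any depth, and XPathSelectElement (an extension method from System.Xml.XPath) takes a path.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

static class LeafValues
{
    // Hypothetical stand-in for result.xml; ImageSize, NamespaceName and
    // BlocksCovered are assumed names, not taken from the real file.
    const string SampleXml =
        "<CoverageDSPriv>" +
          "<Module>" +
            "<ModuleName>hello.exe</ModuleName>" +
            "<ImageSize>61440</ImageSize>" +
            "<NamespaceTable>" +
              "<NamespaceName>hello.exe</NamespaceName>" +
              "<BlocksCovered>8</BlocksCovered>" +
            "</NamespaceTable>" +
          "</Module>" +
        "</CoverageDSPriv>";

    public static XElement LoadRoot() => XElement.Parse(SampleXml);

    // Text of every descendant that has no child elements (leaves only).
    public static List<string> LeafTexts(XElement root) =>
        root.Descendants()
            .Where(e => !e.HasElements)
            .Select(e => e.Value)
            .ToList();

    // Q3, option A: find the element by name at any depth below root.
    public static string ModuleNameByDescendants(XElement root) =>
        root.Descendants("ModuleName").First().Value;

    // Q3, option B: XPath expression evaluated relative to root.
    public static string ModuleNameByXPath(XElement root) =>
        root.XPathSelectElement("Module/ModuleName").Value;

    public static void Main()
    {
        XElement root = LoadRoot();

        // Prints hello.exe, 61440, hello.exe and 8, each on its own line.
        foreach (string text in LeafTexts(root))
            Console.WriteLine(text);

        Console.WriteLine(ModuleNameByDescendants(root)); // hello.exe
        Console.WriteLine(ModuleNameByXPath(root));       // hello.exe
    }
}
```

Descendants(name) is convenient but matches the name anywhere in the subtree, so the XPath form is the safer choice when the same element name can appear at several levels.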
若沐 2024-10-24 16:00:42

XElement.Value can give unexpected results. With XML in .NET you are largely responsible for traversing the tree yourself. If the element contains only text, Value returns what you want; if it contains other elements, you get the concatenated text of all of them.

I have done a lot of XML parsing, and I find there are often better ways to handle XML, depending on what you are doing with the data.

1) You can look into XSLT transforms if you plan on outputting this data as text, more XML, or HTML. This is a great way to convert the data to some other readable format. We use this when we want to display our metadata on our website in HTML.

2) Look into XML serialization. C# makes this very easy, and it is pleasant to use because you can work with regular C# objects when consuming the data. Microsoft even provides tools (such as xsd.exe) to generate the serialization classes from the XML. I usually start with that, clean it up, and add my own tweaks to make it work as I wish. A good check is to serialize an object back to XML and see whether it matches what you have.

3) Try LINQ to XML. It lets you query the XML as if it were a database. It is generally a little slower, but unless you need absolute performance it works very well for minimizing your work.
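
As a rough illustration of option 2, the sketch below maps a small hypothetical stand-in for result.xml onto plain C# classes with XmlSerializer. The class names CoverageData and ModuleInfo and the ImageSize element are assumptions made for this example; only CoverageDSPriv, Module, ModuleName and NamespaceTable come from the thread. Elements that have no matching property (here NamespaceTable) are skipped by the serializer by default.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes mirroring part of the document structure.
[XmlRoot("CoverageDSPriv")]
public class CoverageData
{
    // Each repeated <Module> element becomes one list entry.
    [XmlElement("Module")]
    public List<ModuleInfo> Modules { get; set; } = new List<ModuleInfo>();
}

public class ModuleInfo
{
    public string ModuleName { get; set; } = "";

    // ImageSize is an assumed element name, used only for illustration.
    public int ImageSize { get; set; }
}

public static class SerializationSketch
{
    // Hypothetical stand-in for result.xml.
    public const string SampleXml =
        "<CoverageDSPriv>" +
          "<Module>" +
            "<ModuleName>hello.exe</ModuleName>" +
            "<ImageSize>61440</ImageSize>" +
            "<NamespaceTable>" +
              "<NamespaceName>hello.exe</NamespaceName>" +
            "</NamespaceTable>" +
          "</Module>" +
        "</CoverageDSPriv>";

    // Deserializes the XML text into the typed object graph above.
    public static CoverageData Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(CoverageData));
        using (var reader = new StringReader(xml))
        {
            return (CoverageData)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        CoverageData data = Load(SampleXml);
        foreach (ModuleInfo module in data.Modules)
            Console.WriteLine($"{module.ModuleName} ({module.ImageSize} bytes)");
    }
}
```

For the real file, Load would more likely read from a FileStream over result.xml, and the classes would need one property per element you care about.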
