How to extract information deep inside XML using C# and LINQ?

Posted on 2024-11-03 07:09:28

This is my first post on StackOverflow, so please bear with me. And I apologize upfront if my code example is a bit long.

Using C# and LINQ, I'm trying to identify a series of third level id elements (000049 in this case) in a much larger XML file. Each third level id is unique, and the ones I want are based on a series of descendant info for each. More specifically, if type == A and location type(old) == vault and location type(new) == out, then I want to select that id. Below is the XML and C# code that I'm using.

In general my code works. As written below it will return an id of 000049 twice, which is correct. However, I have found a glitch. If I remove the first history block that contains type == A, my code still returns an id of 000049 twice when it should only return it once. I know why it is happening, but I can't figure out a better way to run the query. Is there a better way to run my query to get the output I want and still use LINQ?

My XML:

<?xml version="1.0" encoding="ISO8859-1" ?>
<data type="historylist">
    <date type="runtime">
        <year>2011</year>
        <month>04</month>
        <day>22</day>
        <dayname>Friday</dayname>
        <hour>15</hour>
        <minutes>24</minutes>
        <seconds>46</seconds>
    </date>
    <customer>
        <id>0001</id>
        <description>customer</description>
        <mediatype>
            <id>kit</id>
            <description>customer kit</description>
            <volume>
                <id>000049</id>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>03</hour>
                        <minutes>00</minutes>
                        <seconds>02</seconds>
                    </date>
                    <userid>batch</userid>
                    <type>OD</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>43</minutes>
                        <seconds>33</seconds>
                    </date>
                    <userid>vaultred</userid>
                    <type>A</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>43</minutes>
                        <seconds>33</seconds>
                    </date>
                    <userid>vaultred</userid>
                    <type>S</type>
                    <location type="old">
                        <repository>vault</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>06</hour>
                        <minutes>45</minutes>
                        <seconds>00</seconds>
                    </date>
                    <userid>batch</userid>
                    <type>O</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>11</hour>
                        <minutes>25</minutes>
                        <seconds>59</seconds>
                    </date>
                    <userid>ihcmdm</userid>
                    <type>A</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
                <history>
                    <date type="optime">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                        <hour>11</hour>
                        <minutes>25</minutes>
                        <seconds>59</seconds>
                    </date>
                    <userid>ihcmdm</userid>
                    <type>S</type>
                    <location type="old">
                        <repository>out</repository>
                        <slot>0</slot>
                    </location>
                    <location type="new">
                        <repository>site</repository>
                        <slot>0</slot>
                    </location>
                    <container>0001.kit.000049</container>
                    <date type="movedate">
                        <year>2011</year>
                        <month>04</month>
                        <day>22</day>
                        <dayname>Friday</dayname>
                    </date>
                </history>
            </volume>
            ...

My C# code:

IEnumerable<XElement> caseIdLeavingVault =
    from volume in root.Descendants("volume")
    where
        (from type in volume.Descendants("type")
         where type.Value == "A"
         select type).Any() &&
        (from locationOld in volume.Descendants("location")
         where
             ((String)locationOld.Attribute("type") == "old" &&
              (String)locationOld.Element("repository") == "vault") &&
             (from locationNew in volume.Descendants("location")
              where
                  ((String)locationNew.Attribute("type") == "new" &&
                   (String)locationNew.Element("repository") == "out")
              select locationNew).Any()
         select locationOld).Any()
    select volume.Element("id");

    ...

foreach (XElement volume in caseIdLeavingVault)
{
    Console.WriteLine(volume.Value.ToString());
}

Thanks.


OK guys, I'm stumped again. Given this same situation and @Elian's solution below (which works great), I need the "optime" and "movedate" dates for the history used to select the id. Does that make sense? I was hoping to end with something like this:

select new { 
    id = volume.Element("id").Value, 

    // this is from "optime"
    opYear = <whatever>("year").Value, 
    opMonth = <whatever>("month").Value, 
    opDay = <whatever>("day").Value, 

    // this is from "movedate"
    mvYear = <whatever>("year").Value, 
    mvMonth = <whatever>("month").Value, 
    mvDay = <whatever>("day").Value 
} 

I have tried so many different combinations, but the Attributes for <date type="optime"> and <date type="movedate"> keep getting in my way and I can't seem to get what I want.


OK. I found a solution that works well:

select new {
    caseId = volume.Element("id").Value,

    // this is from "optime"
    opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
    opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
    opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,

    // this is from "movedate"
    mvYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value,
    mvMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value,
    mvDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value
};

However, it does fail when it finds an id with no "movedate". A few of these exist, so now I am working on that.


Well, late yesterday afternoon I finally figured out the solution I had been wanting:

var caseIdLeavingSite =
    from volume in root.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
        h.Elements("location").Any(l => l.Attribute("type").Value == "old" && ((l.Element("repository").Value == "site") ||
                                                                               (l.Element("repository").Value == "init"))) &&
        h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "toVault")
        )
    select new {
        caseId = volume.Element("id").Value,
        opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
        opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
        opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
        mvYear = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
                 (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value) : "0",
        mvMonth = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
                  (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value) : "0",
        mvDay = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
                (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value) : "0"
   };

This satisfies the requirements that @Elian helped with and grabs the additional date info necessary. It also accounts for those few instances when there is no element for "movedate" by using the ternary operator ?:.

Now, if anyone knows how to make this more efficient, I'm still interested. Thanks.
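
One possible way to cut the repeated lookups, offered as a sketch rather than a definitive answer: pick the matching <history> once with `let`, look up each <date> once, and rely on the explicit `(string)` conversion of `XElement`, which returns null for a missing element instead of throwing. Note that this reads the dates from the history that satisfied the filter, which differs from the query above (that one reads the first matching <date> anywhere in the volume). The helper names `HasLocation` and `FindDate` and the trimmed sample XML are assumptions made up for illustration, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed-down stand-in for the real file (an assumption for illustration):
// the first volume matches and has no "movedate"; the second must not match.
var root = XElement.Parse(@"
<data>
  <volume>
    <id>000049</id>
    <history>
      <date type='optime'><year>2011</year><month>04</month><day>22</day></date>
      <type>A</type>
      <location type='old'><repository>site</repository></location>
      <location type='new'><repository>toVault</repository></location>
    </history>
  </volume>
  <volume>
    <id>000050</id>
    <history>
      <date type='optime'><year>2011</year><month>04</month><day>21</day></date>
      <type>S</type>
      <location type='old'><repository>site</repository></location>
      <location type='new'><repository>toVault</repository></location>
      <date type='movedate'><year>2011</year><month>04</month><day>21</day></date>
    </history>
  </volume>
</data>");

// Hypothetical helper: true if the history has a <location> with the given
// type attribute whose <repository> is one of the allowed values.
bool HasLocation(XElement history, string type, params string[] repositories) =>
    history.Elements("location").Any(l =>
        (string)l.Attribute("type") == type &&
        repositories.Contains((string)l.Element("repository")));

// Hypothetical helper: first <date> child with the given type, or null.
XElement FindDate(XElement history, string type) =>
    history.Elements("date").FirstOrDefault(d => (string)d.Attribute("type") == type);

var caseIdLeavingSite =
    (from volume in root.Descendants("volume")
     // Find the one history that satisfies every condition, once.
     let match = volume.Elements("history").FirstOrDefault(h =>
         (string)h.Element("type") == "A" &&
         HasLocation(h, "old", "site", "init") &&
         HasLocation(h, "new", "toVault"))
     where match != null
     // Each date element is looked up once instead of once per field.
     let op = FindDate(match, "optime")
     let mv = FindDate(match, "movedate")
     select new
     {
         caseId = (string)volume.Element("id"),
         opYear = (string)op?.Element("year") ?? "0",
         opMonth = (string)op?.Element("month") ?? "0",
         opDay = (string)op?.Element("day") ?? "0",
         // Same "0" fallback as the ternary version when "movedate" is absent.
         mvYear = (string)mv?.Element("year") ?? "0",
         mvMonth = (string)mv?.Element("month") ?? "0",
         mvDay = (string)mv?.Element("day") ?? "0"
     }).ToList();

foreach (var c in caseIdLeavingSite)
{
    // prints: 000049 op=2011-04-22 mv=0-0-0
    Console.WriteLine($"{c.caseId} op={c.opYear}-{c.opMonth}-{c.opDay} mv={c.mvYear}-{c.mvMonth}-{c.mvDay}");
}
```

If the dates should still come from the first <date> in the whole volume, as in the query above, replacing `match` with `volume` and `Elements("date")` with `Descendants("date")` inside `FindDate` should keep that behaviour.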


Comments (2)

┾廆蒐ゝ 2024-11-10 07:09:28

I think you want something like this:

IEnumerable<XElement> caseIdLeavingVault =
    from volume in document.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "old" && l.Element("repository").Value == "vault") &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "out")
        )
    select volume.Element("id");

Your code independently checks if a volume has a <history> element of type A and a (not necessarily the same) <history> element which has the required <location> elements.

The code above checks if there exists a <history> element that is both of type A and contains the required <location> elements.
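
To make the difference concrete, here is a small sketch with the first type A block removed, as described in the question. The sample XML is a cut-down assumption, not the real file, and the top-level statements assume C# 9 or later. The independent checks still select the volume because an OD history supplies the locations while a different history supplies type A; the per-history check does not.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Cut-down sample (an assumption for illustration): no single <history>
// is both type A and a vault -> out move.
var root = XElement.Parse(@"
<mediatype>
  <volume>
    <id>000049</id>
    <history>
      <type>OD</type>
      <location type='old'><repository>vault</repository></location>
      <location type='new'><repository>out</repository></location>
    </history>
    <history>
      <type>A</type>
      <location type='old'><repository>out</repository></location>
      <location type='new'><repository>site</repository></location>
    </history>
  </volume>
</mediatype>");

// Independent checks: each condition may be met by a different <history>.
var independent = root.Descendants("volume")
    .Where(v =>
        v.Descendants("type").Any(t => t.Value == "A") &&
        v.Descendants("location").Any(l =>
            (string)l.Attribute("type") == "old" &&
            (string)l.Element("repository") == "vault") &&
        v.Descendants("location").Any(l =>
            (string)l.Attribute("type") == "new" &&
            (string)l.Element("repository") == "out"))
    .Select(v => (string)v.Element("id"))
    .ToList();

// Per-history check: one <history> must satisfy all three conditions.
var perHistory = root.Descendants("volume")
    .Where(v => v.Elements("history").Any(h =>
        (string)h.Element("type") == "A" &&
        h.Elements("location").Any(l =>
            (string)l.Attribute("type") == "old" &&
            (string)l.Element("repository") == "vault") &&
        h.Elements("location").Any(l =>
            (string)l.Attribute("type") == "new" &&
            (string)l.Element("repository") == "out")))
    .Select(v => (string)v.Element("id"))
    .ToList();

Console.WriteLine(independent.Count); // 1 (false positive)
Console.WriteLine(perHistory.Count);  // 0
```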

Update: Abatishchev suggested a solution that uses an xpath query instead of LINQ to XML, but his query is too simple and doesn't return exactly what you asked for. The following xpath query will do the trick, but it's also a little bit longer:

data/customer/mediatype/volume[history[type = 'A' and location[@type = 'old' and repository = 'vault'] and location[@type = 'new' and repository = 'out']]]/id
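
If you want to stay with `XDocument` rather than switch to `XmlDocument`, the same expression can be run through the `XPathSelectElements` extension method from `System.Xml.XPath`. This is only a sketch: the sample document below is a reduced assumption of the real file, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings in the XPathSelectElements extension method

// Reduced sample (an assumption for illustration): only 000049 has a single
// <history> that is type A and moves vault -> out.
var document = XDocument.Parse(@"
<data>
  <customer>
    <mediatype>
      <volume>
        <id>000049</id>
        <history>
          <type>A</type>
          <location type='old'><repository>vault</repository></location>
          <location type='new'><repository>out</repository></location>
        </history>
      </volume>
      <volume>
        <id>000050</id>
        <history>
          <type>OD</type>
          <location type='old'><repository>vault</repository></location>
          <location type='new'><repository>out</repository></location>
        </history>
        <history>
          <type>A</type>
          <location type='old'><repository>out</repository></location>
          <location type='new'><repository>site</repository></location>
        </history>
      </volume>
    </mediatype>
  </customer>
</data>");

// The XPath query from above, unchanged.
string xpath =
    "data/customer/mediatype/volume[history[type = 'A'" +
    " and location[@type = 'old' and repository = 'vault']" +
    " and location[@type = 'new' and repository = 'out']]]/id";

var ids = document.XPathSelectElements(xpath)
    .Select(e => e.Value)
    .ToList();

foreach (var id in ids)
{
    Console.WriteLine(id); // 000049
}
```
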
孤独难免 2024-11-10 07:09:28

Why use such a complex and expensive LINQ to XML query when you can use a simple XPath query:

using System.Xml;

string xml = @"...";
string xpath = "data/customer/mediatype/volume/history/type[text()='A']/../location[@type='old' or @type='new']/../../id";

var doc = new XmlDocument();
doc.LoadXml(xml); // or use Load(path);

var nodes = doc.SelectNodes(xpath);

foreach (XmlNode node in nodes)
{
    Console.WriteLine(node.InnerText); // 000049
}

Or, if you don't need the XML DOM model:

using System.IO; // StringReader
using System.Xml.XPath;

XPathDocument doc = null;
using (var stream = new StringReader(xml))
{
    doc = new XPathDocument(stream); // specify just path to file if you have such one
}
var nav = doc.CreateNavigator();
XPathNodeIterator nodes = (XPathNodeIterator)nav.Evaluate(xpath);
foreach (XPathNavigator node in nodes)
{
    Console.WriteLine(node.Value);
}