How do I use C# and LINQ to extract information deep inside XML?
This is my first post on StackOverflow, so please bear with me. And I apologize upfront if my code example is a bit long.
Using C# and LINQ, I'm trying to identify a series of third-level id elements (000049 in this case) in a much larger XML file. Each third-level id is unique, and the ones I want are based on a series of descendant info for each. More specifically, if type == A and location type(old) == vault and location type(new) == out, then I want to select that id. Below is the XML and C# code that I'm using.
In general my code works. As written below it will return an id of 000049 twice, which is correct. However, I have found a glitch. If I remove the first history block that contains type == A, my code still returns an id of 000049 twice when it should only return it once. I know why it is happening, but I can't figure out a better way to run the query. Is there a better way to run my query to get the output I want and still use LINQ?
My XML:
<?xml version="1.0" encoding="ISO8859-1" ?>
<data type="historylist">
  <date type="runtime">
    <year>2011</year>
    <month>04</month>
    <day>22</day>
    <dayname>Friday</dayname>
    <hour>15</hour>
    <minutes>24</minutes>
    <seconds>46</seconds>
  </date>
  <customer>
    <id>0001</id>
    <description>customer</description>
    <mediatype>
      <id>kit</id>
      <description>customer kit</description>
      <volume>
        <id>000049</id>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>03</hour>
            <minutes>00</minutes>
            <seconds>02</seconds>
          </date>
          <userid>batch</userid>
          <type>OD</type>
          <location type="old">
            <repository>vault</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>06</hour>
            <minutes>43</minutes>
            <seconds>33</seconds>
          </date>
          <userid>vaultred</userid>
          <type>A</type>
          <location type="old">
            <repository>vault</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>06</hour>
            <minutes>43</minutes>
            <seconds>33</seconds>
          </date>
          <userid>vaultred</userid>
          <type>S</type>
          <location type="old">
            <repository>vault</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>06</hour>
            <minutes>45</minutes>
            <seconds>00</seconds>
          </date>
          <userid>batch</userid>
          <type>O</type>
          <location type="old">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>site</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>11</hour>
            <minutes>25</minutes>
            <seconds>59</seconds>
          </date>
          <userid>ihcmdm</userid>
          <type>A</type>
          <location type="old">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>site</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
        <history>
          <date type="optime">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
            <hour>11</hour>
            <minutes>25</minutes>
            <seconds>59</seconds>
          </date>
          <userid>ihcmdm</userid>
          <type>S</type>
          <location type="old">
            <repository>out</repository>
            <slot>0</slot>
          </location>
          <location type="new">
            <repository>site</repository>
            <slot>0</slot>
          </location>
          <container>0001.kit.000049</container>
          <date type="movedate">
            <year>2011</year>
            <month>04</month>
            <day>22</day>
            <dayname>Friday</dayname>
          </date>
        </history>
      </volume>
...
My C# code:
IEnumerable<XElement> caseIdLeavingVault =
    from volume in root.Descendants("volume")
    where
        (from type in volume.Descendants("type")
         where type.Value == "A"
         select type).Any() &&
        (from locationOld in volume.Descendants("location")
         where
             ((String)locationOld.Attribute("type") == "old" &&
              (String)locationOld.Element("repository") == "vault") &&
             (from locationNew in volume.Descendants("location")
              where
                  ((String)locationNew.Attribute("type") == "new" &&
                   (String)locationNew.Element("repository") == "out")
              select locationNew).Any()
         select locationOld).Any()
    select volume.Element("id");
...
foreach (XElement volume in caseIdLeavingVault)
{
    Console.WriteLine(volume.Value.ToString());
}
Thanks.
OK guys, I'm stumped again. Given this same situation and @Elian's solution below (which works great), I need the "optime" and "movedate" dates for the history used to select the id. Does that make sense? I was hoping to end with something like this:
select new {
    id = volume.Element("id").Value,
    // this is from "optime"
    opYear = <whatever>("year").Value,
    opMonth = <whatever>("month").Value,
    opDay = <whatever>("day").Value,
    // this is from "movedate"
    mvYear = <whatever>("year").Value,
    mvMonth = <whatever>("month").Value,
    mvDay = <whatever>("day").Value
}
I have tried so many different combinations, but the Attributes for <date type="optime"> and <date type="movedate"> keep getting in my way and I can't seem to get what I want.
OK. I found a solution that works well:
select new {
    caseId = volume.Element("id").Value,
    // this is from "optime"
    opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
    opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
    opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
    // this is from "movedate"
    mvYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value,
    mvMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value,
    mvDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value
};
However, it does fail when it finds an id with no "movedate". A few of these exist, so now I am working on that.
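The failure comes from First(), which throws when no <date type="movedate"> exists. A minimal sketch of one way around it, assuming null is an acceptable result for a missing date: FirstOrDefault returns null instead of throwing, and the explicit (string) conversions on XElement and XAttribute return null for a null operand. The class and method names (DateParts, DatePart) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class DateParts
{
    // Hypothetical helper: text of <part> inside the first <date type="dateType">
    // under scope, or null when the date or the part is missing.
    public static string DatePart(XElement scope, string dateType, string part)
    {
        XElement date = scope.Descendants("date")
            .FirstOrDefault(d => (string)d.Attribute("type") == dateType);
        // The (string) conversion returns null for a null element instead of throwing.
        return (string)date?.Element(part);
    }

    static void Main()
    {
        // Cut-down volume with an "optime" date but no "movedate".
        XElement volume = XElement.Parse(
            "<volume><id>000049</id><history>" +
            "<date type='optime'><year>2011</year><month>04</month><day>22</day></date>" +
            "</history></volume>");

        Console.WriteLine(DatePart(volume, "optime", "year"));
        Console.WriteLine(DatePart(volume, "movedate", "year") ?? "(no movedate)");
    }
}
```

The caller decides what a missing value should become, for example with ?? "0".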
Well, late yesterday afternoon I finally figured out the solution I had been wanting:
var caseIdLeavingSite =
    from volume in root.Descendants("volume")
    where volume.Elements("history").Any(
        h => h.Element("type").Value == "A" &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "old" && ((l.Element("repository").Value == "site") ||
                (l.Element("repository").Value == "init"))) &&
            h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "toVault")
        )
    select new {
        caseId = volume.Element("id").Value,
        opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value,
        opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value,
        opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value,
        mvYear = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value) : "0",
        mvMonth = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value) : "0",
        mvDay = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ?
            (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value) : "0"
    };
This satisfies the requirements that @Elian helped with and grabs the additional date info necessary. It also accounts for those few instances when there is no element for "movedate" by using the ternary operator ?:.
Now, if anyone knows how to make this more efficient, I'm still interested. Thanks.
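On the efficiency question, one possible tightening, offered as a sketch rather than a benchmarked improvement: iterate over the <history> elements themselves and use let so each <date> is looked up once instead of once per field. The names CaseDates, VolumeQuery, HasLocation, DateOfType, and Part are invented here. Note the behavioral difference: this version yields one row per matching <history> and reads the dates from that history, whereas the query above yields one row per volume and takes the first "optime" and "movedate" found anywhere in the volume.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical result type for illustration.
sealed class CaseDates
{
    public string CaseId, OpYear, OpMonth, OpDay, MvYear, MvMonth, MvDay;
}

static class VolumeQuery
{
    // True when the history has a <location type="..."> whose repository is one of the given names.
    static bool HasLocation(XElement history, string type, params string[] repositories) =>
        history.Elements("location").Any(l =>
            (string)l.Attribute("type") == type &&
            repositories.Contains((string)l.Element("repository")));

    // First <date type="..."> directly under the history, or null when absent.
    static XElement DateOfType(XElement history, string type) =>
        history.Elements("date").FirstOrDefault(d => (string)d.Attribute("type") == type);

    // Date part as text, or "0" when the date or the part is absent.
    static string Part(XElement date, string name) => (string)date?.Element(name) ?? "0";

    public static IEnumerable<CaseDates> Find(XElement root) =>
        from volume in root.Descendants("volume")
        from h in volume.Elements("history")
        where (string)h.Element("type") == "A"
              && HasLocation(h, "old", "site", "init")
              && HasLocation(h, "new", "toVault")
        let op = DateOfType(h, "optime")     // looked up once per history
        let mv = DateOfType(h, "movedate")   // may be null
        select new CaseDates
        {
            CaseId = (string)volume.Element("id"),
            OpYear = Part(op, "year"), OpMonth = Part(op, "month"), OpDay = Part(op, "day"),
            MvYear = Part(mv, "year"), MvMonth = Part(mv, "month"), MvDay = Part(mv, "day")
        };

    static void Main()
    {
        // Cut-down sample: one matching history without a "movedate".
        XElement root = XElement.Parse(
            "<data><volume><id>000049</id><history>" +
            "<date type='optime'><year>2011</year><month>04</month><day>22</day></date>" +
            "<type>A</type>" +
            "<location type='old'><repository>site</repository></location>" +
            "<location type='new'><repository>toVault</repository></location>" +
            "</history></volume></data>");

        foreach (CaseDates c in Find(root))
            Console.WriteLine($"{c.CaseId} op={c.OpYear}-{c.OpMonth}-{c.OpDay} mv={c.MvYear}-{c.MvMonth}-{c.MvDay}");
    }
}
```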
Comments (2)
I think you want something like this:
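The query that accompanied this answer is not shown here. A sketch consistent with the description that follows, and with the where clause the asker later adopted, tests all three conditions against the same <history> element. The wrapper class LeavingVault, the method Ids, and the History helper are invented for the example, and the (string) conversions are used in place of .Value so that a missing element compares as null instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LeavingVault
{
    // Ids of volumes having a single <history> that is of type A
    // and moves from "vault" to "out".
    public static IEnumerable<string> Ids(XElement root) =>
        from volume in root.Descendants("volume")
        where volume.Elements("history").Any(h =>
            (string)h.Element("type") == "A" &&
            h.Elements("location").Any(l =>
                (string)l.Attribute("type") == "old" &&
                (string)l.Element("repository") == "vault") &&
            h.Elements("location").Any(l =>
                (string)l.Attribute("type") == "new" &&
                (string)l.Element("repository") == "out"))
        select (string)volume.Element("id");

    // Builds a cut-down <history> block for the demo.
    static string History(string type, string oldRepo, string newRepo) =>
        "<history><type>" + type + "</type>" +
        "<location type='old'><repository>" + oldRepo + "</repository></location>" +
        "<location type='new'><repository>" + newRepo + "</repository></location>" +
        "</history>";

    static void Main()
    {
        // Type A and vault-to-out occur only in different histories: no match expected.
        XElement split = XElement.Parse(
            "<data><volume><id>000049</id>" +
            History("S", "vault", "out") + History("A", "out", "site") +
            "</volume></data>");
        Console.WriteLine(Ids(split).Count());

        // One history satisfies everything: the id is returned once.
        XElement same = XElement.Parse(
            "<data><volume><id>000049</id>" +
            History("A", "vault", "out") + History("A", "out", "site") +
            "</volume></data>");
        Console.WriteLine(string.Join(",", Ids(same)));
    }
}
```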
Your code independently checks if a volume has a <history> element of type A and a (not necessarily the same) <history> element which has the required <location> elements.
The code above checks if there exists a <history> element that is both of type A and contains the required <location> elements.
Update: Abatishchev suggested a solution that uses an xpath query instead of LINQ to XML, but his query is too simple and doesn't return exactly what you asked for. The following xpath query will do the trick, but it's also a little bit longer:
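The XPath expression itself is missing from this copy. The following is a reconstruction under assumptions, not the answerer's text: it nests the three conditions in a predicate on a single <history>, and runs through the XPathSelectElements extension method from System.Xml.XPath. The class name LeavingVaultXPath and the constant Query are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

static class LeavingVaultXPath
{
    // Assumed expression: a volume qualifies when one <history> is of type A
    // and has both the old=vault and the new=out location.
    public const string Query =
        "//volume[history[type='A'" +
        " and location[@type='old' and repository='vault']" +
        " and location[@type='new' and repository='out']]]/id";

    public static IEnumerable<string> Ids(XDocument doc) =>
        doc.XPathSelectElements(Query).Select(id => id.Value);

    static void Main()
    {
        // Cut-down sample with one history that meets all three conditions.
        XDocument doc = XDocument.Parse(
            "<data><volume><id>000049</id><history><type>A</type>" +
            "<location type='old'><repository>vault</repository></location>" +
            "<location type='new'><repository>out</repository></location>" +
            "</history></volume></data>");

        foreach (string id in Ids(doc))
            Console.WriteLine(id);
    }
}
```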
Why use such a complex and expensive LINQ to XML query when you can use a simple XPath query:
or, if you don't need the XML DOM model:
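Neither snippet from this answer is shown here, so the following is a guess at the intent rather than the answerer's code. It assumes the first variant used XmlDocument.SelectNodes and that "without the XML DOM model" meant the read-only XPathDocument with an XPathNavigator. The XPath expression is the longer per-history one, since the other answer notes that a simpler expression does not return what was asked for. The names XPathOnly, WithXmlDocument, and WithXPathDocument are invented.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

static class XPathOnly
{
    // Assumed expression; all three conditions must hold for one <history>.
    const string Query =
        "//volume[history[type='A'" +
        " and location[@type='old' and repository='vault']" +
        " and location[@type='new' and repository='out']]]/id";

    // DOM variant: loads the whole document into an editable XmlDocument.
    public static List<string> WithXmlDocument(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var ids = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(Query))
            ids.Add(node.InnerText);
        return ids;
    }

    // Read-only variant: XPathDocument is built for XPath queries and cannot be edited.
    public static List<string> WithXPathDocument(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNodeIterator it = doc.CreateNavigator().Select(Query);
        var ids = new List<string>();
        while (it.MoveNext())
            ids.Add(it.Current.Value);
        return ids;
    }

    static void Main()
    {
        // Cut-down sample with one history that meets all three conditions.
        string xml =
            "<data><volume><id>000049</id><history><type>A</type>" +
            "<location type='old'><repository>vault</repository></location>" +
            "<location type='new'><repository>out</repository></location>" +
            "</history></volume></data>";

        Console.WriteLine(string.Join(",", WithXmlDocument(xml)));
        Console.WriteLine(string.Join(",", WithXPathDocument(xml)));
    }
}
```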