Best way to get the InnerXml of an XElement?

Posted on 2024-07-04 01:57:20

What's the best way to get the contents of the mixed body element in the code below? The element might contain either XHTML or text, but I just want its contents in string form. The XmlElement type has the InnerXml property which is exactly what I'm after.

The code as written almost does what I want, but includes the surrounding <body>...</body> element, which I don't want.

XDocument doc = XDocument.Load(new StreamReader(s));
var templates = from t in doc.Descendants("template")
                where t.Attribute("name").Value == templateName
                select new
                {
                   Subject = t.Element("subject").Value,
                   Body = t.Element("body").ToString()
                };

Comments (14)

濫情▎り 2024-07-11 01:57:20

// using Regex might be faster to simply trim the begin and end element tag

var content = element.ToString();
var matchBegin = Regex.Match(content, @"<.+?>");
content = content.Substring(matchBegin.Index + matchBegin.Length);          
var matchEnd = Regex.Match(content, @"</.+?>", RegexOptions.RightToLeft);
content = content.Substring(0, matchEnd.Index);
坚持沉默 2024-07-11 01:57:20

doc.ToString() or doc.ToString(SaveOptions) does the work.
See http://msdn.microsoft.com/en-us/library/system.xml.linq.xelement.tostring(v=vs.110).aspx

三人与歌 2024-07-11 01:57:20

Wondering if (notice I got rid of the b+= and just have b+)

t.Element( "body" ).Nodes()
 .Aggregate( "", ( b, node ) => b + node.ToString() );

might be slightly less efficient than

string.Join( "", t.Element( "body" ).Nodes()
                  .Select( n => n.ToString() ).ToArray() );

Not 100% sure...but glancing at Aggregate() and string.Join() in Reflector...I think I read it as Aggregate just appending a returning value, so essentially you get:

string = string + string

versus string.Join, which has some mention in there of FastStringAllocation or something, which makes me think the folks at Microsoft might have put some extra performance boost in there. Of course my .ToArray() call may negate that, but I just wanted to offer up another suggestion.

2024-07-11 01:57:20

Is it possible to use the System.Xml namespace objects to get the job done here instead of using LINQ? As you already mentioned, XmlNode.InnerXml is exactly what you need.
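For comparison, a minimal sketch of the System.Xml route might look like the following. The class name XmlInnerXmlSketch, the GetBody helper, and the assumption that the root element is a template with a body child are made up for illustration; only XmlDocument.LoadXml, SelectSingleNode, and InnerXml come from the framework.

```csharp
using System.Xml;

public static class XmlInnerXmlSketch
{
    // Hypothetical helper: returns the inner XML of the <body> element
    // under a root <template>, or null when no such element exists.
    public static string GetBody(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath lookup; assumes the document root is <template>.
        XmlNode body = doc.SelectSingleNode("/template/body");
        return body == null ? null : body.InnerXml;
    }
}
```

The trade-off is that the whole document has to be loaded as an XmlDocument rather than an XDocument, so this only fits if the rest of the code does not depend on LINQ to XML.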

拍不死你 2024-07-11 01:57:20

I wanted to see which of these suggested solutions performed best, so I ran some comparative tests. Out of interest, I also compared the LINQ methods to the plain old System.Xml method suggested by Greg. The variation was interesting and not what I expected, with the slowest methods being more than 3 times slower than the fastest.

The results ordered by fastest to slowest:

  1. CreateReader - Instance Hunter (0.113 seconds)
  2. Plain old System.Xml - Greg Hurlman (0.134 seconds)
  3. Aggregate with string concatenation - Mike Powell (0.324 seconds)
  4. StringBuilder - Vin (0.333 seconds)
  5. String.Join on array - Terry (0.360 seconds)
  6. String.Concat on array - Marcin Kosieradzki (0.364 seconds)

Method

I used a single XML document with 20 identical nodes (called 'hint'):

<hint>
  <strong>Thinking of using a fake address?</strong>
  <br />
  Please don't. If we can't verify your address we might just
  have to reject your application.
</hint>

The numbers shown as seconds above are the result of extracting the "inner XML" of the 20 nodes, 1000 times in a row, and taking the average (mean) of 5 runs. I didn't include the time it took to load and parse the XML into an XmlDocument (for the System.Xml method) or XDocument (for all the others).

The LINQ algorithms I used were: (C# - all take an XElement "parent" and return the inner XML string)

CreateReader:

var reader = parent.CreateReader();
reader.MoveToContent();

return reader.ReadInnerXml();

Aggregate with string concatenation:

return parent.Nodes().Aggregate("", (b, node) => b += node.ToString());

StringBuilder:

StringBuilder sb = new StringBuilder();

foreach(var node in parent.Nodes()) {
    sb.Append(node.ToString());
}

return sb.ToString();

String.Join on array:

return String.Join("", parent.Nodes().Select(x => x.ToString()).ToArray());

String.Concat on array:

return String.Concat(parent.Nodes().Select(x => x.ToString()).ToArray());

I haven't shown the "Plain old System.Xml" algorithm here as it's just calling .InnerXml on nodes.
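A rough sketch of how such a timing run could be set up is below. This is not the harness used for the numbers above, which was not posted; the class InnerXmlTimingSketch and its Time method are hypothetical, and only two of the six approaches are wired in.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Xml.Linq;

public static class InnerXmlTimingSketch
{
    // CreateReader approach from the list above.
    public static string ViaReader(XElement parent)
    {
        using (var reader = parent.CreateReader())
        {
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
    }

    // String.Concat on array approach from the list above.
    public static string ViaConcat(XElement parent)
    {
        return string.Concat(parent.Nodes().Select(x => x.ToString()).ToArray());
    }

    // Hypothetical harness: runs `extract` over every <hint> element,
    // `iterations` times, and returns the elapsed wall-clock time.
    // Loading and parsing the document is deliberately left outside the timer.
    public static TimeSpan Time(XDocument doc, Func<XElement, string> extract, int iterations)
    {
        var hints = doc.Descendants("hint").ToList();
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            foreach (var hint in hints)
            {
                extract(hint);
            }
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling Time(doc, InnerXmlTimingSketch.ViaReader, 1000) and Time(doc, InnerXmlTimingSketch.ViaConcat, 1000) several times and averaging would mirror the procedure described above; absolute numbers will vary by machine and runtime.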


Conclusion

If performance is important (e.g. lots of XML, parsed frequently), I'd use Daniel's CreateReader method every time. If you're just doing a few queries, you might want to use Mike's more concise Aggregate method.

If you're working with large elements that have lots of nodes (maybe hundreds), you'd probably start to see the benefit of using StringBuilder over the Aggregate method, but not over CreateReader. I don't think the Join and Concat methods would ever be more efficient in these conditions because of the penalty of converting a large list to a large array (noticeable even here with smaller lists).

眼眸印温柔 2024-07-11 01:57:20

With all due credit to those who discovered and proved the best approach (thanks!), here it is wrapped up in an extension method:

public static string InnerXml(this XNode node) {
    using (var reader = node.CreateReader()) {
        reader.MoveToContent();
        return reader.ReadInnerXml();
    }
}
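As a sketch of how this could plug back into the query from the question: the class name InnerXmlExtensionSketch and the FindBody helper are hypothetical, the extension is restated on XElement so the sample compiles on its own, and every matching template is assumed to have a body child.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class InnerXmlExtensionSketch
{
    // Same reader-based approach as above, restated for XElement.
    public static string InnerXml(this XElement element)
    {
        using (var reader = element.CreateReader())
        {
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
    }

    // Hypothetical lookup mirroring the query in the question.
    // Returns null when no template has the requested name.
    public static string FindBody(XDocument doc, string templateName)
    {
        return doc.Descendants("template")
                  // The string cast yields null instead of throwing
                  // when the name attribute is missing.
                  .Where(t => (string)t.Attribute("name") == templateName)
                  .Select(t => t.Element("body").InnerXml())
                  .FirstOrDefault();
    }
}
```

Inside the original anonymous type this would read Body = t.Element("body").InnerXml().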
残月升风 2024-07-11 01:57:20

Keep it simple and efficient:

String.Concat(node.Nodes().Select(x => x.ToString()).ToArray())
  • Aggregate is inefficient in both memory and performance when concatenating strings.
  • Join("", sth) uses a string array twice as big as the one Concat uses, and it looks quite strange in code.
  • Using += looks very odd, but is apparently not much worse than using '+'. It would probably be optimized to the same code, because the assignment result is unused and might be safely removed by the compiler.
  • StringBuilder is so imperative, and everybody knows that unnecessary "state" sucks.
再浓的妆也掩不了殇 2024-07-11 01:57:20

I think this is a much better method (in VB, shouldn't be hard to translate):

Given an XElement x:

Dim xReader = x.CreateReader
xReader.MoveToContent
xReader.ReadInnerXml
素衣风尘叹 2024-07-11 01:57:20

How about using this "extension" method on XElement? Worked for me!

public static string InnerXml(this XElement element)
{
    StringBuilder innerXml = new StringBuilder();

    foreach (XNode node in element.Nodes())
    {
        // append node's xml string to innerXml
        innerXml.Append(node.ToString());
    }

    return innerXml.ToString();
}

OR use a little bit of Linq

public static string InnerXml(this XElement element)
{
    StringBuilder innerXml = new StringBuilder();
    element.Nodes().ToList().ForEach( node => innerXml.Append(node.ToString()));

    return innerXml.ToString();
}

Note: The code above has to use element.Nodes() as opposed to element.Elements(). It is very important to remember the difference between the two: element.Nodes() gives you every child node (XText, XComment, XElement, and so on), whereas element.Elements() gives you only the child XElement nodes. Attributes are not nodes, so neither call returns them.
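To make the difference concrete, here is a small sketch; the class and method names are made up for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NodesVersusElementsSketch
{
    // Concatenates every child node: text, comments, and elements.
    public static string FromNodes(XElement element)
    {
        return string.Concat(element.Nodes().Select(n => n.ToString()));
    }

    // Concatenates child elements only; text and comment nodes are skipped.
    public static string FromElements(XElement element)
    {
        return string.Concat(element.Elements().Select(e => e.ToString()));
    }
}
```

For an element parsed from <body id="1">Hello <b>world</b>!</body>, FromNodes returns Hello <b>world</b>! while FromElements returns only <b>world</b>. The id attribute shows up in neither, since attributes are reached through Attributes().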

软甜啾 2024-07-11 01:57:20

@Greg: It appears you've edited your answer to be a completely different answer. To which my answer is yes, I could do this using System.Xml but was hoping to get my feet wet with LINQ to XML.

I'll leave my original reply below in case anyone else wonders why I can't just use the XElement's .Value property to get what I need:

@Greg: The Value property concatenates all the text contents of any child nodes. So if the body element contains only text it works, but if it contains XHTML I get all the text concatenated together but none of the tags.
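A short sketch of that difference, with hypothetical helper names and the reader-based approach from elsewhere in this thread standing in for "inner XML":

```csharp
using System.Xml.Linq;

public static class ValueVersusInnerXmlSketch
{
    // Value flattens the element to its concatenated text content.
    public static string TextOnly(XElement element)
    {
        return element.Value;
    }

    // The reader-based approach keeps the child markup intact.
    public static string Markup(XElement element)
    {
        using (var reader = element.CreateReader())
        {
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
    }
}
```

For <body>Hello <b>world</b>!</body>, TextOnly gives Hello world! and Markup gives Hello <b>world</b>!.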

审判长 2024-07-11 01:57:20

Personally, I ended up writing an InnerXml extension method using the Aggregate method:

public static string InnerXml(this XElement thiz)
{
   return thiz.Nodes().Aggregate( string.Empty, ( element, node ) => element += node.ToString() );
}

My client code is then just as terse as it would be with the old System.Xml namespace:

var innerXml = myXElement.InnerXml();
只是一片海 2024-07-11 01:57:20

I ended up using this:

Body = t.Element("body").Nodes().Aggregate("", (b, node) => b += node.ToString());
万水千山粽是情ミ 2024-07-11 01:57:20
public static string InnerXml(this XElement xElement)
{
    //remove start tag
    string innerXml = xElement.ToString().Trim().Replace(string.Format("<{0}>", xElement.Name), "");
    //remove end tag
    innerXml = innerXml.Trim().Replace(string.Format("</{0}>", xElement.Name), "");
    return innerXml.Trim();
}
誰認得朕 2024-07-11 01:57:20
var innerXmlAsText= XElement.Parse(xmlContent)
                    .Descendants()
                    .Where(n => n.Name.LocalName == "template")
                    .Elements()
                    .Single()
                    .ToString();

Will do the job for you
