Extracting a substring between a start sequence and an end sequence in C# using LINQ

Posted on 2025-01-21 14:21:00

I have an XML instance that contains processing instructions. I want a specific one (the schematron declaration):

<?xml-model href="../../a/b/c.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>

Other processing instructions may or may not be present, so I can't rely on its position in the DOM. On the other hand, it is guaranteed that there will be at most one such Schematron file reference. So I fetch it like this:

XProcessingInstruction p = d.Nodes().OfType<XProcessingInstruction>()
   .Where(x => x.Target.Equals("xml-model") && 
    x.Data.Contains("schematypens=\"http://purl.oclc.org/dsdl/schematron\""))
   .FirstOrDefault();

In the example given, the content of p.Data is the string

href="../../a/b/c.sch" schematypens="http://purl.oclc.org/dsdl/schematron"

I need to extract the path specified via @href, without the double quotes (i.e. in this example I want the string ../../a/b/c.sch). In other words, I need the substring after href=" and before the next ". I'm trying to achieve this with LINQ:

var a = p.Data.Split(' ').Where(s => s.StartsWith("href=\""))
       .Select(s => s.Substring("href=\"".Length))
       .Select(s => s.TakeWhile(c => c != '"'));

I would have thought this gave me an IEnumerable<char>, which I could then convert to a string in one of the ways described here, but that's not the case: according to LINQPad, I seem to be getting an IEnumerable<IEnumerable<char>>, which I can't manage to turn into a string.

How can this be done correctly using LINQ? Or would I be better off using Regex within LINQ?
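The nesting arises because Where returns a sequence of matching tokens (even when there is only one), and the final Select maps each token to its own IEnumerable<char>. A minimal sketch of one way to flatten this, assuming the href value contains no spaces (otherwise Split(' ') would cut it apart); the class and method names are hypothetical and not from the original post:

```csharp
using System;
using System.Linq;

public static class HrefFromTokens
{
    // Hypothetical helper: returns the href value found in the
    // processing-instruction data, or null if there is no href token.
    public static string Extract(string data)
    {
        const string prefix = "href=\"";
        return data.Split(' ')
            .Where(s => s.StartsWith(prefix, StringComparison.Ordinal))
            .Select(s => s.Substring(prefix.Length))
            // Collapse each IEnumerable<char> into a string immediately,
            // so the query yields IEnumerable<string> instead of nested sequences.
            .Select(s => string.Concat(s.TakeWhile(c => c != '"')))
            // Pick the single expected element (null when nothing matched).
            .FirstOrDefault();
    }
}
```

Calling HrefFromTokens.Extract(p.Data) would then return a plain string. Using FirstOrDefault here mirrors the way p itself is fetched in the question.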


Edit: After typing this up, I came up with a working solution, but it seems very inelegant:

string a = new string
   (
      p.Data.Substring(p.Data.IndexOf("href=\"") + "href=\"".Length)
      .TakeWhile(c => c != '"').ToArray()
   );

What would be a better way?


Comments (1)

っ〆星空下的拥抱 2025-01-28 14:21:00

Try this:

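// Requires: using System.Text.RegularExpressions;
var input = @"<?xml-model href=""../../a/b/c.sch"" schematypens=""http://purl.oclc.org/dsdl/schematron""?>";
// Lazily capture everything between href=" and the next double quote.
var match = Regex.Match(input, @"href=""(.*?)""");
var url = match.Groups[1].Value;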

That gives me ../../a/b/c.sch in url.

Please don't use Regex for general XML parsing, but for this situation it's fine.
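For readers who would rather avoid Regex altogether, here is a hedged alternative sketch, not part of the answer above: it wraps the processing instruction's data in a throwaway element so the XML parser itself splits and unquotes the pseudo-attributes. It assumes the data of every xml-model instruction is well-formed pseudo-attribute syntax (XElement.Parse throws an XmlException otherwise); the class name, method name and the dummy element name "pi" are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SchematronHref
{
    private const string SchematronNs = "http://purl.oclc.org/dsdl/schematron";

    // Hypothetical helper: returns the href of the Schematron xml-model
    // processing instruction, or null if the document has none.
    public static string Find(XDocument d)
    {
        return d.Nodes().OfType<XProcessingInstruction>()
            .Where(pi => pi.Target == "xml-model")
            // Turn the pseudo-attributes into real attributes of a dummy element.
            .Select(pi => XElement.Parse("<pi " + pi.Data + "/>"))
            .Where(e => (string)e.Attribute("schematypens") == SchematronNs)
            // The explicit cast yields null when the attribute is missing.
            .Select(e => (string)e.Attribute("href"))
            .FirstOrDefault();
    }
}
```

With the document from the question loaded into d, SchematronHref.Find(d) would return the path without quotes. The trade-off is one extra parse per xml-model instruction in exchange for not depending on attribute order or spacing.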
