Regex to parse query string values into named groups
I have an HTML document with the following content:
... some text ...
<a href="file.aspx?userId=123&section=2">link</a> ... some text ...
... some text ...
<a href="file.aspx?section=5&user=678">link</a> ... some text ...
... some text ...
I would like to parse that and get a match with named groups:
match 1
group["user"]=123
group["section"]=2
match 2
group["user"]=678
group["section"]=5
I can do it if parameters always go in order, first User and then Section, but I don't know how to do it if the order is different.
Thank you!
In my case I had to parse a URL by hand because the HttpUtility.ParseQueryString utility is not available on WP7. So I created an extension method like this:
Then it's just a matter of using it, for example:
NOTE: I'm returning an IEnumerable rather than a dictionary because I'm assuming parameter names may be duplicated. With duplicate names, the dictionary would throw an exception.
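The C# extension method itself did not survive on this page. As a rough illustration of the same idea in Python (not the author's code; the function name iter_query_params is made up for this sketch), a generator can yield name/value pairs in order so that repeated parameter names are preserved:

```python
from typing import Iterator, Tuple
from urllib.parse import unquote_plus


def iter_query_params(url: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, value) pairs from the query string of `url`, in order."""
    # Keep only what follows the first '?', and drop any '#fragment'.
    _, _, query = url.partition("?")
    query = query.partition("#")[0]
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        name, _, value = pair.partition("=")
        yield unquote_plus(name), unquote_plus(value)


# A repeated name ('user') is kept twice instead of being collapsed.
pairs = list(iter_query_params("file.aspx?section=5&user=678&user=9"))
print(pairs)  # [('section', '5'), ('user', '678'), ('user', '9')]
```

Unlike the .NET dictionary mentioned above, a Python dict built from these pairs would not raise on a duplicate name; it would silently keep the last value, which is another reason to return the pairs as a sequence.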
Why use regex to split it out?
You could first extract the query string, split it on &, and then build a map by splitting each piece on =.
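The question never names a language, so as a sketch only: in Python the steps described above are already covered by the standard library. urlsplit extracts the query string and parse_qsl does the splitting on & and =. The helper name query_map is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def query_map(url: str) -> dict:
    """Return the query parameters of `url` as a name -> value map."""
    query = urlsplit(url).query    # the part after '?', without any fragment
    return dict(parse_qsl(query))  # splits on '&', then on '='


params = query_map("file.aspx?section=5&user=678")
print(params["user"], params["section"])  # 678 5
```

Because the result is keyed by name, the order of the parameters in the URL no longer matters.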
You didn't specify what language you are working in, but this should do the trick in C#:
Using regex to first find the key value pairs and then doing splits... doesn't seem right.
I'm interested in a complete regex solution.
Anyone?
Check this out
You can get the pairs with something like Groups["key"].Captures[i] and Groups["value"].Captures[i].
Perhaps something like this (I am rusty on regex, and wasn't good at them in the first place anyway. Untested):
(By the way, the XHTML is malformed; & should be &amp; in the attributes.)
Another approach is to put the capturing groups inside lookaheads:
If there are only two parameters, there's no reason to prefer this way over the alternation-based approaches suggested by Mike and strager. But if you needed to match three parameters, the other regexes would grow to several times their current length, while this one would only need one more lookahead, just like the two existing ones.
By the way, contrary to your response to Claus, it matters quite a bit which language you're working in. There's a huge variation in capabilities, syntax, and API from one language to the next.
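The regex from this answer was lost from the page, so the pattern below is a reconstruction of the technique rather than the original. It is a minimal sketch in Python's re module, assuming numeric values and that the user parameter may be spelled either user or userId, as in the question's sample. Each lookahead scans the rest of the attribute for one parameter and captures its value, so the order in the URL does not matter:

```python
import re

LINK_RE = re.compile(
    r'href="[^"?]*\?'                         # up to the start of the query string
    r'(?=[^"]*?\buser(?:Id)?=(?P<user>\d+))'  # look ahead for user= or userId=
    r'(?=[^"]*?\bsection=(?P<section>\d+))'   # look ahead for section=
    r'[^"]*"'                                 # consume the rest of the attribute
)

html = (
    '... <a href="file.aspx?userId=123&section=2">link</a> ...\n'
    '... <a href="file.aspx?section=5&user=678">link</a> ...'
)

results = [(m.group("user"), m.group("section")) for m in LINK_RE.finditer(html)]
print(results)  # [('123', '2'), ('678', '5')]
```

Supporting a third parameter would mean adding one more lookahead of the same shape.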
You did not say which regex flavor you are using. Since your sample URL links to an .aspx file, I'll assume .NET. In .NET, a single regex can have multiple named capturing groups with the same name, and .NET will treat them as if they were one group. Thus you can use the regex
This simple regex with alternation will be far more efficient than any tricks with lookaround. You can easily expand it if your requirements include matching the parameters only if they're in a link.
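The .NET regex referred to above is missing from this page. For comparison only, here is a sketch of the alternation idea in Python. Python's re module rejects duplicate group names, so this version uses suffixed names (user_a, user_b, and so on, all hypothetical) and merges them afterwards; in .NET the two alternatives could share the names user and section directly. Numeric values and the user/userId spelling are assumptions taken from the question's sample.

```python
import re

PAIR_RE = re.compile(
    r'\?(?:'
    # Alternative 1: user first, then section.
    r'user(?:Id)?=(?P<user_a>\d+)&(?:amp;)?section=(?P<section_a>\d+)'
    r'|'
    # Alternative 2: section first, then user.
    r'section=(?P<section_b>\d+)&(?:amp;)?user(?:Id)?=(?P<user_b>\d+)'
    r')'
)


def user_and_section(match):
    """Merge the suffixed groups; the alternative that did not match yields None."""
    user = match.group("user_a") or match.group("user_b")
    section = match.group("section_a") or match.group("section_b")
    return user, section


html = (
    '<a href="file.aspx?userId=123&section=2">link</a> '
    '<a href="file.aspx?section=5&user=678">link</a>'
)
found = [user_and_section(m) for m in PAIR_RE.finditer(html)]
print(found)  # [('123', '2'), ('678', '5')]
```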
A simple Python implementation that overcomes the ordering problem
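The code that accompanied this answer is not preserved here. One way such an implementation could look, offered as a sketch rather than the poster's version (parse_links and both pattern names are invented for illustration): find each href, then collect every name=value pair from its query string into a dict, which makes the parameter order irrelevant.

```python
import re

HREF_RE = re.compile(r'href="[^"?]*\?([^"]*)"')  # group 1: the query string
PARAM_RE = re.compile(r'(\w+)=(\w+)')            # one name=value pair


def parse_links(html):
    """Return one {name: value} dict per link that has a query string."""
    results = []
    for href in HREF_RE.finditer(html):
        results.append(dict(PARAM_RE.findall(href.group(1))))
    return results


html = (
    '<a href="file.aspx?userId=123&section=2">link</a> '
    '<a href="file.aspx?section=5&user=678">link</a>'
)
for params in parse_links(html):
    # The sample spells the parameter both 'user' and 'userId'.
    print(params.get("user") or params.get("userId"), params["section"])
```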