Syntax error in a regular expression that matches link URLs
I have the following method in some Nemerle code:
private static getLinks(text : string) : array[string] {
    def linkrx = Regex(@"<a\shref=['|\"](.*?)['|\"].*?>");
    def m = linkrx.Matches(text);
    mutable txmatches : array[string];
    for (mutable i = 0; i < m.Count; ++i) {
        txmatches[i] = m[i].Value;
    }
    txmatches
}
The problem is that the compiler, for some reason, tries to parse the brackets inside the regex string, so the program does not compile. If I remove the @ (which I was told to put there), I get an invalid escape character error on the "\s".
Here's the compiler output:
NCrawler.n:23:21:23:22: error: when parsing this `(' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:22:57:22:58: error: when parsing this `{' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:8:1:8:2: error: when parsing this `{' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
(line 23 is the line with the regex code on it)
What should I do?
I don't know Nemerle, but it seems that using @ disables all escapes, including the escape for the ". Try one of these:
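A minimal sketch of two ways to write the literal, assuming Nemerle follows the C# rules for regular and verbatim strings. The module name `QuoteEscapes` and the variable names are hypothetical, chosen only for illustration:

```nemerle
using System;
using System.Text.RegularExpressions;

// Hypothetical demo module; not part of the original NCrawler.n.
module QuoteEscapes {
  Main() : void {
    // Option 1: verbatim string. Backslashes stay literal and a
    // double quote is written by doubling it ("").
    def verbatim = Regex(@"<a\shref=['|""](.*?)['|""].*?>");

    // Option 2: regular string. The backslash of \s is doubled and
    // the double quote is escaped as \".
    def regular = Regex("<a\\shref=['|\"](.*?)['|\"].*?>");

    // Regex.ToString() returns the pattern text, so this compares
    // the patterns the two literals produce.
    Console.WriteLine(verbatim.ToString() == regular.ToString());
  }
}
```

Either form leaves the regex itself unchanged; only the way the string literal is spelled in source differs.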
I'm not a Nemerle programmer, but I know that you should always use an XML parser for XML-based data rather than regexps.
I guess someone has created a DOM or XPath library for Nemerle, so you can access either
//a[@href] via XPath or something like a.href.value via DOM.
The current regexp doesn't like, for example
I didn't test this, but it should be more like it
The problem is with the quotation marks, not the brackets. In Nemerle, as in C#, you escape a quotation mark with another quotation mark, not a backslash.
EDIT: Note as well that you don't need the pipe inside the square brackets; the contents are treated as a set of characters (or ranges of characters), with the OR being implied.
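As a rough sketch of both points applied to the method from the question: the quotes inside the verbatim string are doubled, and the pipe is dropped from the character classes. The module name `LinkDemo` is hypothetical, and allocating the result with `array(m.Count)` is an assumption added here so that the loop has an array to write into; it is not something the answer above mentions.

```nemerle
using System.Text.RegularExpressions;

// Hypothetical wrapper module for illustration.
module LinkDemo {
  public GetLinks(text : string) : array[string] {
    // Verbatim string: "" stands for one double quote.
    // ['""] is a character class matching either ' or ", no pipe needed.
    def linkrx = Regex(@"<a\shref=['""](.*?)['""].*?>");
    def m = linkrx.Matches(text);

    // Assumption: size the result array from the match count.
    def txmatches = array(m.Count) : array[string];
    for (mutable i = 0; i < m.Count; ++i) {
      // Value is the whole matched tag, as in the question's code.
      txmatches[i] = m[i].Value;
    }
    txmatches
  }
}
```

If only the URL is wanted rather than the whole tag, `m[i].Groups[1].Value` returns the text captured by the first parenthesized group.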