Extracting information from a SPARQL query using regular expressions
I am having a hard time creating a regular expression that extracts the namespaces from this SPARQL query:
SELECT *
WHERE {
?Vehicle rdf:type umbel-sc:CompactCar ;
skos:subject <http://dbpedia.org/resource/Category:Vehicles_with_CVT_transmission>;
dbp-prop:assembly ?Place.
?Place geo-ont:parentFeature dbpedia:United_States .
}
I need to get:
"rdf", "umbel-sc", "skos", "dbp-prop", "geo-ont", "dbpedia"
I need an expression like this:
\\s+([^\\:]*):[^\\s]+
But the above one does not work, because it also eats spaces before reaching the colon. What am I doing wrong?
Comments (2)
The regular expression will eat those spaces, yes, but the group captured by your parentheses won’t contain them. Is that a problem? You can access this group by reading Groups[1].Value from the Match object returned by Regex.Match.
If you really need the regex to not match these spaces, you can use a so-called look-behind assertion:
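A minimal sketch of the difference, assuming the .NET Regex class the answer refers to. The look-behind pattern below is an illustrative adaptation of the question's pattern, not necessarily the exact one the answerer had in mind, and the one-token sample input is made up for the demonstration:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input: a single prefixed name with one leading space.
string input = " rdf:type";

// Question's pattern: the leading \s+ is consumed and becomes part of the match.
Match plain = Regex.Match(input, @"\s+([^\:]*):[^\s]+");

// Same pattern with the whitespace moved into a look-behind assertion:
// it must be present, but it is not part of the match.
Match lookBehind = Regex.Match(input, @"(?<=\s+)([^\:]*):[^\s]+");

Console.WriteLine("[" + plain.Value + "]");           // [ rdf:type]
Console.WriteLine("[" + plain.Groups[1].Value + "]"); // [rdf]
Console.WriteLine("[" + lookBehind.Value + "]");      // [rdf:type]
```

Either way, Groups[1].Value is the same; the look-behind only changes what the overall match contains.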
As an aside, you don’t need to double all your backslashes. Use a verbatim string instead, like this:
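As a small illustration, assuming C#: the two literals below produce the same pattern string, using the pattern from the question. The variable names are chosen here for the example.

```csharp
using System;

// Regular string literal: every backslash has to be escaped.
string escaped = "\\s+([^\\:]*):[^\\s]+";

// Verbatim string literal (prefixed with @): backslashes are taken literally.
string verbatim = @"\s+([^\:]*):[^\s]+";

Console.WriteLine(verbatim);            // \s+([^\:]*):[^\s]+
Console.WriteLine(escaped == verbatim); // True
```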
I don't know the details of SPARQL syntax, but I would imagine that it is not a regular language, so regular expressions won't be able to do this perfectly. However, you can get pretty close if you search for something that looks like a word and is surrounded by whitespace on the left and a colon on the right.
This method might be good enough for a quick solution, or if your input format is known and sufficiently restricted. For a more general solution, I suggest you look for or create a proper parser for the SPARQL language.
With that said, try this:
Result:
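A minimal sketch of the approach described above, assuming C# and the query from the question. The pattern \s([\w-]+): and the variable names are illustrative assumptions rather than the answerer's original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// The query from the question, joined with explicit newlines.
string query = string.Join("\n", new[]
{
    "SELECT *",
    "WHERE {",
    "?Vehicle rdf:type umbel-sc:CompactCar ;",
    "skos:subject <http://dbpedia.org/resource/Category:Vehicles_with_CVT_transmission>;",
    "dbp-prop:assembly ?Place.",
    "?Place geo-ont:parentFeature dbpedia:United_States .",
    "}"
});

// One whitespace character, then a word-like token (letters, digits,
// underscore, hyphen) captured in group 1, then a colon.
// "<http:" and "Category:" are skipped because they do not directly
// follow whitespace.
List<string> prefixes = Regex.Matches(query, @"\s([\w-]+):")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .Distinct()
    .ToList();

foreach (string prefix in prefixes)
{
    Console.WriteLine(prefix);
}
// Prints, one per line: rdf, umbel-sc, skos, dbp-prop, geo-ont, dbpedia
```

As the answer cautions, this is a heuristic: text inside a string literal or comment that happens to look like a prefix would also match, which is where a real SPARQL parser becomes the safer choice.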