Regular expression to simplify the titles of the Yahoo Answers feed
I am trying to parse the Yahoo Answers feed - http://answers.yahoo.com/rss/allq
The issue is that every title starts with
[ Category ] : Open Question :
which I do not want, so I would like to write a regexp to remove it.
Anything that removes all the characters from the starting [ through the first : should do it.
There is also a space after the : that needs to be removed.
Thanks in advance; I will also try to find a solution myself.
Comments (2)
Have you considered using Yahoo's YQL service to parse this feed (or other web pages)?
They already have sample queries for you to get at Yahoo Answers data:
answers.getbycategory:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getbycategory%20where%20category_id%3D2115500137%20and%20type%3D%22resolved%22
answers.getbyuser:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getbyuser%20where%20user_id%3D%22YbaMGtHFaa%22
answers.getquestion:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getquestion%20where%20question_id%3D%2220090526102023AAkRbch%22
answers.search:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.search%20where%20query%3D%22cars%22%20and%20category_id%3D2115500137%20and%20type%3D%22resolved%22
(Just an FYI in case you weren't aware of this convenient service. I use it instead of screen scraping with regexes.)
The following regex should do the job:
Usage sample in C#:
What it does is start at the opening [ bracket, take any characters until it matches the first : and then take the following space.
Hope this helps,
Tom.
Thanks to @cmptrgeekken for pointing out the non-greedy thing!
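For illustration, here is a minimal sketch of the approach this answer describes, written in Python rather than C#. It is not the answer's own pattern; it assumes titles look like "[ Category ] : Open Question : Actual title", and the sample title and function names are made up. The first pattern follows the description literally (leading bracket, non-greedy match up to the first colon, then trailing whitespace). The second is a variant that also drops the status label, assuming the label is always the second colon-delimited segment.

```python
import re

# Assumed title shape (hypothetical): "[ Category ] : Open Question : Actual title"
# ^\[    anchor at a leading "["
# .*?    non-greedy, so the match stops at the FIRST ":" rather than the last
# :\s*   the colon plus any whitespace right after it
FIRST_SEGMENT = re.compile(r"^\[.*?:\s*")

# Variant (assumption: the status label is the second colon-delimited segment):
# also removes "Open Question : " so only the actual title remains.
FULL_PREFIX = re.compile(r"^\[.*?\]\s*:\s*[^:]*:\s*")


def strip_first_segment(title: str) -> str:
    """Remove everything from the leading '[' through the first ':' and the space after it."""
    return FIRST_SEGMENT.sub("", title, count=1)


def strip_full_prefix(title: str) -> str:
    """Remove the '[ Category ] : Status : ' prefix, leaving only the question title."""
    return FULL_PREFIX.sub("", title, count=1)


if __name__ == "__main__":
    sample = "[ Cars ] : Open Question : Why is my car: slow?"
    print(strip_first_segment(sample))  # Open Question : Why is my car: slow?
    print(strip_full_prefix(sample))    # Why is my car: slow?
```

The non-greedy quantifier matters because titles can themselves contain colons; a greedy `.*` would run to the last colon and eat part of the question. Titles that do not start with a bracket are returned unchanged by both functions.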