How do I extract substrings that match a pattern?
I need to parse large HTML text files and extract the substrings that match a certain pattern. For example:
<span id='report-9429'>Report for May 2009</span>
A lot of code and text.
<span id='report-10522'>Report for Apr 2009</span>
A lot of code and text.
<span id='report-15212'>Report for Apr 2009</span>
Here 9429, 10522 and 15212 are the parts I need to get as an array of substrings. The file contains many of these, and I need all of them.
Is there some sort of regular expression support in Cocoa? And what would such a regex look like?
You could use NSRegularExpression (available from Mac OS X 10.7 and iOS 4.0, so not on Snow Leopard) or RegexKit.
Your regex might look like this:
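A pattern along these lines should fit the sample markup. It assumes the id attribute is always single-quoted and the number is plain decimal digits, as in the question; the constant name kReportPattern is only illustrative.

```objc
#import <Foundation/Foundation.h>

// Matches <span id='report-NNN'> and captures the digits NNN in group 1.
// The backslash is doubled because this is an Objective-C string literal.
static NSString *const kReportPattern = @"<span id='report-(\\d+)'>";
```

If the real files mix quote styles or put other attributes before the id, the pattern would need to be loosened accordingly.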
For NSRegularExpression, the code might look like this:
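Here is a minimal sketch, assuming the whole file is already loaded into an NSString and that the markup looks like the sample in the question. ReportIDsInHTML is a made-up helper name, not a Cocoa API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the numeric part of every
// <span id='report-NNN'> found in html, in document order.
static NSArray<NSString *> *ReportIDsInHTML(NSString *html) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<span id='report-(\\d+)'>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // Only reached if the pattern itself is malformed.
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *ids = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html
                       options:0
                         range:NSMakeRange(0, [html length])];
    for (NSTextCheckingResult *match in matches) {
        // Range 0 is the whole match; range 1 is the first capture group (the digits).
        [ids addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return ids;
}
```

Called on the sample from the question, this should give an array containing 9429, 10522 and 15212 as strings. For very large files, enumerateMatchesInString:options:range:usingBlock: does the same job without building the intermediate array of matches.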