How do I extract substrings that match a pattern?

Posted on 2024-11-30 09:57:10

I have to parse large HTML text files and extract substrings that match a certain pattern. For example:

<span id='report-9429'>Report for May 2009</span>
A lot of code and text.
<span id='report-10522'>Report for Apr 2009</span>
A lot of code and text.
<span id='report-15212'>Report for Apr 2009</span>

Here 9429, 10522 and 15212 are the parts I need to get as an array of substrings. The file contains many of these, and I need all of them.

Is there some sort of regular expression feature in Cocoa? And what would such a regex look like?

Comments (1)

鸵鸟症 2024-12-07 09:57:10

You might use NSRegularExpression (available from Mac OS X 10.7 Lion and iOS 4.0, so it does not work on Snow Leopard) or the third-party RegexKit framework.

Your regex might look like this:

<span id='report-(\d+)'>Report for \w+ \d+</span>

For NSRegularExpression, the code might look like this:

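// Backslashes are doubled so the regex engine, not the compiler, sees \d and \w.
NSString *pattern = @"<span id='report-(\\d+)'>Report for \\w+ \\d+</span>";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                       options:0
                                                                         error:&error];
// regex is nil (and error is set) if the pattern fails to compile.
[regex enumerateMatchesInString:string
                        options:0
                          range:NSMakeRange(0, [string length])
                     usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
    // Capture group 1 is the numeric part of the id attribute.
    NSString *reportId = [string substringWithRange:[result rangeAtIndex:1]];
    // Do something with reportId
}];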
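
Since the question asks for the IDs as an array, here is a minimal sketch of one way to collect them. The helper name ReportIDsInHTML is hypothetical, and the shortened pattern (matching only the opening span tag) is an assumption that the id attribute alone is enough to identify a report; use the full pattern above if the span text has to match as well.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every report ID found in `html`, in document order.
static NSArray<NSString *> *ReportIDsInHTML(NSString *html) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<span id='report-(\\d+)'>"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; report it and return nothing.
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *ids = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // Capture group 1 holds the digits that follow "report-".
        [ids addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return [ids copy];
}
```

matchesInString:options:range: builds the whole result array in memory, which is convenient for a result this small. For very large files the block-based enumeration shown above avoids holding every match object at once.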