Improving an algorithm for finding URLs in a body of text - obj-c

Posted 2024-10-09 23:02:29

I'm trying to come up with an algorithm to find URLs in a body of text. I currently have the following code (this is my sit-down-and-hack-it-out code, and I know there has to be a better way):

    statusText.text = @"http://google.com http://www.apple.com www.joshholat.com";

    NSMutableArray *urlLocations = [[NSMutableArray alloc] init];

    NSRange currentLocation = NSMakeRange(0, statusText.text.length);
    for (int x = 0; x < statusText.text.length; x++) {
        currentLocation = [[statusText.text substringFromIndex:(x + currentLocation.location)] rangeOfString:@"http://"];
        if (currentLocation.location > statusText.text.length) break;
        [urlLocations addObject:[NSNumber numberWithInt:(currentLocation.location + x)]];
    }
    currentLocation = NSMakeRange(0, statusText.text.length);
    for (int x = 0; x < statusText.text.length; x++) {
        currentLocation = [[statusText.text substringFromIndex:(x + currentLocation.location)] rangeOfString:@"http://www."];
        if (currentLocation.location > statusText.text.length) break;
        [urlLocations addObject:[NSNumber numberWithInt:(currentLocation.location + x)]];
    }
    currentLocation = NSMakeRange(0, statusText.text.length);
    for (int x = 0; x < statusText.text.length; x++) {
        currentLocation = [[statusText.text substringFromIndex:(x + currentLocation.location)] rangeOfString:@" www." options:NSLiteralSearch];
        if (currentLocation.location > statusText.text.length) break;
        [urlLocations addObject:[NSNumber numberWithInt:(currentLocation.location + 1 + x)]];
    }

    // Get rid of any duplicate locations
    NSSet *uniqueElements = [NSSet setWithArray:urlLocations];
    [urlLocations release];
    NSArray *finalURLLocations = [[NSArray alloc] init];
    finalURLLocations = [uniqueElements allObjects];

    // Parse out the URLs of each of the locations
    for (int x = 0; x < [finalURLLocations count]; x++) {
        NSRange temp = [[statusText.text substringFromIndex:[[finalURLLocations objectAtIndex:x] intValue]] rangeOfString:@" "];
        int length = temp.location + [[finalURLLocations objectAtIndex:x] intValue];
        if (temp.location > statusText.text.length) length = statusText.text.length;
        length = length - [[finalURLLocations objectAtIndex:x] intValue];
        NSLog(@"URL: %@", [statusText.text substringWithRange:NSMakeRange([[finalURLLocations objectAtIndex:x] intValue], length)]);
    }

I feel like it could be improved via the usage of regular expressions or something. Any help in improving this would be greatly appreciated.

Comments (2)

悲凉≈ 2024-10-16 23:02:29

If you target iOS 4.0+, you should let Apple do the work for you and use the built-in data detectors. Create an instance of NSDataDetector with the NSTextCheckingTypeLink option and run it over your string. The documentation for NSDataDetector has some good examples on the usage of the class.

If you can't or don't want to use data detectors, John Gruber posted a good regex pattern for detecting URLs a few months ago: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
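For the regex route, a minimal sketch with NSRegularExpression might look like the following. The pattern here is a deliberately simplified stand-in, not Gruber's pattern from the linked post, and the helper name `FindURLStrings` is made up for illustration. It only recognizes strings that start with `http://`, `https://`, or `www.`, and it will swallow trailing punctuation such as a closing period.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): returns the substrings of `text`
// that look like URLs according to a deliberately simplified pattern.
static NSArray *FindURLStrings(NSString *text) {
    // Simplified pattern, NOT Gruber's: an http(s):// or www. prefix followed
    // by a run of characters that are not whitespace, angle brackets, or quotes.
    NSString *pattern = @"\\b(?:https?://|www\\.)[^\\s<>\"]+";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }

    // Collect the matched text of every non-overlapping match, left to right.
    NSMutableArray *urls = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [urls addObject:[text substringWithRange:match.range]];
    }
    return urls;
}
```

Calling `FindURLStrings(statusText.text)` on the sample string from the question should return the three URLs in order. A bare domain such as `google.com` is not matched by this pattern, which is one reason to prefer the data detector or a fuller pattern.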

乖乖哒 2024-10-16 23:02:29

Just as a follow up, here's what my code has been changed to:

    statusText.text = @"http://google.com http://www.apple.com www.joshholat.com hey there google.com";

    NSError *error = nil;
    NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];

    NSArray *matches = [detector matchesInString:statusText.text
                                         options:0
                                           range:NSMakeRange(0, statusText.text.length)];

    for (NSTextCheckingResult *match in matches) {
        if ([match resultType] == NSTextCheckingTypeLink) {
            NSLog(@"URL: %@", [[match URL] absoluteURL]);
        }
    }
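
Since the original code tracked where each URL sits in the string, it may help to note that each match also carries its range. The sketch below is one possible way to wrap the detector in a function that checks for a failed detector and logs each link's range. The helper name `DetectLinks` is hypothetical and does not appear in the thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): returns the NSURLs that
// NSDataDetector finds in `text` and logs where each one sits in the string.
static NSArray *DetectLinks(NSString *text) {
    NSError *error = nil;
    NSDataDetector *detector =
        [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
    if (detector == nil) {
        NSLog(@"Could not create detector: %@", error);
        return [NSArray array];
    }

    NSMutableArray *urls = [NSMutableArray array];
    [detector enumerateMatchesInString:text
                               options:0
                                 range:NSMakeRange(0, text.length)
                            usingBlock:^(NSTextCheckingResult *match,
                                         NSMatchingFlags flags, BOOL *stop) {
        if (match.URL == nil) return;
        // match.range is the span of the link inside the original string.
        NSLog(@"URL: %@ at %@", match.URL, NSStringFromRange(match.range));
        [urls addObject:match.URL];
    }];
    return urls;
}
```

The range can be passed straight to `substringWithRange:` or used to style the link in an attributed string, which replaces the manual location bookkeeping in the question.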