Trying to split a very large string
Possible Duplicate:
Most memory efficient way to split an NSString in to substrings
I'm trying to split a 20 MB string. I've tried using componentsSeparatedByString: but it consumes too much RAM. I think this is because it splits the string but also leaves the original string intact, so the string is effectively stored in memory twice (even if I release the original string right after the split, it is still an issue).
I'm very new to Objective-C. I've tried to write some code that removes each substring from the original string as it adds it to the array of found strings. The idea is that as the mutable array of found strings gets larger, the original string gets smaller. The only problem is that it leaks memory and crashes. If someone could tell me what I'm doing wrong, that would be great!
NSRange range = [mainHtml rangeOfString:@"<p class=NumberedParagraph>"];
int counter = 1;
// location will == int max (NSNotFound) if it can't find any more occurrences
while (range.location < [mainHtml length]) {
    NSString *curStr;
    NSRange curStrRange;
    NSRange rangeToSearchIn = NSMakeRange(range.location+1, [mainHtml length] - range.location - 1);
    NSRange nextRange = [mainHtml rangeOfString:@"<p class=NumberedParagraph>" options:NSCaseInsensitiveSearch range:rangeToSearchIn];
    if (nextRange.location > [mainHtml length])
    {
        // This is the last string - get everything up to the end of the file
        curStrRange = NSMakeRange(0, [mainHtml length]);
        curStr = [mainHtml substringFromIndex:range.location];
    } else {
        curStrRange = NSMakeRange(range.location, nextRange.location - range.location);
        curStr = [mainHtml substringWithRange:curStrRange];
    }
    // Remove the substring just processed from the original string
    // * it crashes here, normally on the 3rd iteration
    mainHtml = [mainHtml substringFromIndex:curStrRange.location + curStrRange.length];
    range = [mainHtml rangeOfString:@"<p class=NumberedParagraph>"];
    [self.parts addObject:curStr];
}
Comments (2)
I think that @babbidi had the correct idea. mainHtml is large, and you have many autoreleased copies of it around (one copy for each iteration) that are not being released. Try adding an @autoreleasepool block to your code so that all the autoreleased objects are released at the end of each loop iteration. If you are not using Mac OS X 10.7, you can instead create an NSAutoreleasePool manually at the top of each iteration and drain it at the end of that iteration.
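The code this answer refers to is not shown, so the following is only a minimal sketch of how a per-iteration @autoreleasepool might be applied to the question's loop. The function name SplitAtMarker and its parameters are hypothetical, the search is case-sensitive throughout, and the sketch assumes ARC. Under manual reference counting, the new value of the remaining string would have to be retained before the pool drains, otherwise it would be deallocated along with the pool.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `html` at every occurrence of `marker`,
// draining autoreleased temporaries once per loop iteration.
// Assumes ARC, so the strong local `remaining` survives each pool drain.
static NSArray *SplitAtMarker(NSString *html, NSString *marker) {
    NSMutableArray *parts = [NSMutableArray array];
    NSString *remaining = html;
    NSRange range = [remaining rangeOfString:marker];

    while (range.location != NSNotFound) {
        @autoreleasepool {
            // Look for the next marker after the current one.
            NSUInteger searchStart = range.location + range.length;
            NSRange searchRange = NSMakeRange(searchStart, [remaining length] - searchStart);
            NSRange nextRange = [remaining rangeOfString:marker options:0 range:searchRange];

            if (nextRange.location == NSNotFound) {
                // Last piece: everything from the current marker to the end.
                [parts addObject:[remaining substringFromIndex:range.location]];
                remaining = @"";
            } else {
                NSRange pieceRange = NSMakeRange(range.location, nextRange.location - range.location);
                [parts addObject:[remaining substringWithRange:pieceRange]];
                // Drop the processed prefix; the previous copy can be freed
                // once nothing else references it and the pool drains.
                remaining = [remaining substringFromIndex:nextRange.location];
            }
            range = [remaining rangeOfString:marker];
        }
    }
    return parts;
}
```

Each iteration still copies the remainder of the string, so this keeps at most a couple of large copies alive at a time but does not remove the copying itself.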
I don't believe you have any leaks. substringFromIndex: returns an autoreleased string, so it might be kept in memory for more than one iteration. You could create your own substringFromIndex: method (e.g. createSubstringFromIndex:) which returns a retained string that you can release manually. In your code you'd have to replace this:
with this:
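The snippets this answer refers to are not included here. As a rough illustration only, a method that returns an owned string might look like the sketch below. The category name OwnedSubstring and the helper DropPrefix are hypothetical, and the code assumes manual reference counting (compiled without ARC). It copies the characters into a malloc'd buffer and hands that buffer to the new string, so no autoreleased intermediate is created.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical category; the method name follows the answer's suggestion.
// Assumes manual reference counting (no ARC).
@interface NSString (OwnedSubstring)
// Returns an owned (+1) string; the caller is responsible for releasing it.
- (NSString *)createSubstringFromIndex:(NSUInteger)index;
@end

@implementation NSString (OwnedSubstring)
- (NSString *)createSubstringFromIndex:(NSUInteger)index {
    NSParameterAssert(index <= [self length]);
    NSUInteger count = [self length] - index;
    if (count == 0) {
        return [[NSString alloc] init];
    }
    unichar *buffer = malloc(count * sizeof(unichar));
    if (buffer == NULL) {
        return nil;
    }
    [self getCharacters:buffer range:NSMakeRange(index, count)];
    // The new string takes ownership of `buffer` and frees it on dealloc.
    return [[NSString alloc] initWithCharactersNoCopy:buffer
                                               length:count
                                         freeWhenDone:YES];
}
@end

// Usage sketch: `owned` must be a string this code owns (+1).
// Releases it immediately and returns the owned remainder.
static NSString *DropPrefix(NSString *owned, NSUInteger cut) {
    NSString *rest = [owned createSubstringFromIndex:cut];
    [owned release];
    return rest;
}
```

In the question's loop, this would only be safe if mainHtml is a string the loop owns, since DropPrefix releases its argument.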