Split an NSString into words, then rejoin it into its original form

Posted 2024-12-22 04:26:50

I am splitting an NSString like this (filterString is an NSString):

NSMutableCharacterSet *separatorSet = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
[separatorSet formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
NSMutableArray *words = [[filterString componentsSeparatedByCharactersInSet:separatorSet] mutableCopy];

I want to put the words back together in the form of filterString, with the original punctuation and spacing. The reason is that I want to change some of the words and then reassemble the string as it was originally.

Comments (7)

┼── 2024-12-29 04:26:50

A more robust way to split by words is to use string enumeration. A space is not always the delimiter, and not all languages separate words with spaces anyway (e.g. Japanese).

NSString * string = @" \n word1!    word2,%$?'/word3.word4   ";

[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                           options:NSStringEnumerationByWords
                        usingBlock:
 ^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
     NSLog(@"Substring: '%@'", substring);
 }];

 // Logs:
 // Substring: 'word1'
 // Substring: 'word2'
 // Substring: 'word3'
 // Substring: 'word4' 
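
Building on the enumeration above, here is a minimal sketch of how the word ranges could be used to change some words while leaving every other character untouched. The helper name ReplaceWords and the replacements dictionary are hypothetical, introduced only for illustration; the block relies on the substringRange that the enumeration passes for each word.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns a copy of `input` in
// which every word that is a key in `replacements` is swapped for its value.
// Whitespace and punctuation between words are copied through unchanged.
static NSString *ReplaceWords(NSString *input,
                              NSDictionary<NSString *, NSString *> *replacements) {
    NSMutableString *result = [NSMutableString string];
    __block NSUInteger cursor = 0; // index just past the last word handled

    [input enumerateSubstringsInRange:NSMakeRange(0, input.length)
                              options:NSStringEnumerationByWords
                           usingBlock:^(NSString *word, NSRange wordRange,
                                        NSRange enclosingRange, BOOL *stop) {
        // Copy the separators between the previous word and this one verbatim.
        NSRange gap = NSMakeRange(cursor, wordRange.location - cursor);
        [result appendString:[input substringWithRange:gap]];

        // Append the replacement if there is one, otherwise the word itself.
        [result appendString:(replacements[word] ?: word)];
        cursor = NSMaxRange(wordRange);
    }];

    // Copy whatever follows the last word.
    [result appendString:[input substringFromIndex:cursor]];
    return result;
}
```

For example, ReplaceWords(@"Hello, world!  Nice day.", @{@"world": @"there"}) should give @"Hello, there!  Nice day.", with the comma, the exclamation mark and the double space preserved.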
陌上青苔 2024-12-29 04:26:50
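NSString *myString = @"Foo Bar Blah B..";
// Split on a single space only; punctuation stays attached to the words.
NSArray *myWords = [myString componentsSeparatedByCharactersInSet:
                    [NSCharacterSet characterSetWithCharactersInString:@" "]
                    ];
// Joining with the same single separator reproduces the original string.
NSString *string = [myWords componentsJoinedByString:@" "];
NSLog(@"%@", string);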
ζ澈沫 2024-12-29 04:26:50

Because componentsSeparatedByCharactersInSet: discards the separators, there is no way to restore the original punctuation automatically.

The only way is not to use componentsSeparatedByCharactersInSet:.

An alternative is to iterate through the string and, for each character, check whether it belongs to your character set.
If it does, add the character to a list of separators; otherwise, collect it into the current substring and add each completed substring to a second list (NSMutableArray works for both).
This way you know, for example, that the punctuation character between the first and second substrings is the first character in your list of separators.
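
The two-list idea can be sketched as follows. This is only one possible reading of it: the helper names SplitKeepingSeparators and Rejoin are hypothetical, and the sketch stores whole runs of separator characters (rather than single characters) so that the two arrays always have the same length and can be interleaved again. It works on UTF-16 code units via characterAtIndex:, which is an assumption that suits the whitespace and punctuation sets in the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `input` into alternating runs of separator
// characters and word characters. After the call, separators[k] is the run
// that precedes words[k]; either run may be the empty string.
static void SplitKeepingSeparators(NSString *input,
                                   NSCharacterSet *separatorSet,
                                   NSMutableArray<NSString *> *words,
                                   NSMutableArray<NSString *> *separators) {
    NSUInteger i = 0, length = input.length;
    while (i < length) {
        // Collect a (possibly empty) run of separator characters...
        NSUInteger start = i;
        while (i < length &&
               [separatorSet characterIsMember:[input characterAtIndex:i]]) {
            i++;
        }
        [separators addObject:[input substringWithRange:NSMakeRange(start, i - start)]];

        // ...followed by a (possibly empty) run of word characters.
        start = i;
        while (i < length &&
               ![separatorSet characterIsMember:[input characterAtIndex:i]]) {
            i++;
        }
        [words addObject:[input substringWithRange:NSMakeRange(start, i - start)]];
    }
}

// Hypothetical helper: interleaves the two arrays back into one string.
static NSString *Rejoin(NSArray<NSString *> *words,
                        NSArray<NSString *> *separators) {
    NSMutableString *result = [NSMutableString string];
    for (NSUInteger k = 0; k < words.count; k++) {
        [result appendString:separators[k]];
        [result appendString:words[k]];
    }
    return result;
}
```

With the question's separator set, splitting @"Foo, bar!  Baz" should yield the words Foo, bar, Baz; changing words[1] and calling Rejoin then produces the edited string with the original ", " and "!  " still in place.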

伤痕我心 2024-12-29 04:26:50

You can use the componentsJoinedByString: method of NSArray to rejoin the words:

NSString *orig = [words componentsJoinedByString:@" "];

Note that this puts a single space between the words; it does not restore the original punctuation or spacing.
扛起拖把扫天下 2024-12-29 04:26:50

How are you determining which words need to be replaced? Instead of breaking it apart in the first place, perhaps using -stringByReplacingOccurrencesOfString:withString:options:range: would be more suitable.
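
As a rough illustration of that suggestion, the sketch below wraps the method in a hypothetical helper named ReplaceLiteral. The case-insensitive option is an assumption chosen for the example, not something the question asks for.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every occurrence of `target` in `input`,
// ignoring case, without splitting the string first.
static NSString *ReplaceLiteral(NSString *input,
                                NSString *target,
                                NSString *replacement) {
    return [input stringByReplacingOccurrencesOfString:target
                                            withString:replacement
                                               options:NSCaseInsensitiveSearch
                                                 range:NSMakeRange(0, input.length)];
}
```

One thing to keep in mind is that this is a plain substring search, so a target such as "cat" would also match inside "concatenate"; it fits best when the text to replace cannot occur inside a longer word.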

多孤肩上扛 2024-12-29 04:26:50

My guess is you may not be using the best API. If you're really worried about words, you should be using a word-based API. I'm a bit hazy on whether that would be NSDataDetector or something else. (I believe NSRegularExpression can deal with word boundaries in a smarter way.)

仲春光 2024-12-29 04:26:50

If you are targeting Mac OS X 10.7+ or iOS 4+, you can use NSRegularExpression. The pattern to match a whole word is "\bword\b" (with no spaces around the word), where \b matches a word boundary. Look at the methods replaceMatchesInString:options:range:withTemplate: and stringByReplacingMatchesInString:options:range:withTemplate:.

On 10.6 or earlier, if you wish to use regular expressions, you can wrap the C-based regcomp/regexec functions, which support word boundaries as well. However, for this simple case you may prefer one of the other Cocoa options mentioned in the other answers.
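
A minimal sketch of the NSRegularExpression route might look like this. The helper name ReplaceWholeWord is hypothetical; the word and the replacement are escaped so that any regex or template metacharacters in them are treated literally, which is an extra precaution not mentioned in the answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces `word` only where it appears as a whole word,
// leaving all surrounding punctuation and spacing as they were.
static NSString *ReplaceWholeWord(NSString *input,
                                  NSString *word,
                                  NSString *replacement) {
    // Build \bword\b; backslashes are doubled inside the string literal.
    NSString *pattern =
        [NSString stringWithFormat:@"\\b%@\\b",
         [NSRegularExpression escapedPatternForString:word]];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return input; // pattern failed to compile: return the input unchanged
    }

    NSString *templ = [NSRegularExpression escapedTemplateForString:replacement];
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:templ];
}
```

For instance, ReplaceWholeWord(@"cat, concatenate; cat!", @"cat", @"dog") should return @"dog, concatenate; dog!", because \b keeps the match from firing inside "concatenate".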
