How do I convert an NSString with umlauts to a const char* in Objective-C?
I have a problem converting an NSString that contains umlauts to a const char *.
This method parses a text file of words line by line and stores the words as strings in NSArray *results. Each word is then converted to a const char *tmpConstChars. That C string stores, for example, an 'ä' as '√§'. How do I convert from NSString to const char * correctly? I thought this approach was right.
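- (void)inputWordsByFile:(NSString *)path
{
    // Start with nil; the read method fills this in only on failure.
    NSError *error = nil;
    NSString *content = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:&error];
    if (content == nil) {
        NSLog(@"Could not read %@: %@", path, error);
        return;
    }
    NSArray *results = [content componentsSeparatedByString:@"\n"];
    NSMutableArray *words = [[NSMutableArray alloc] initWithArray:results];
    // Drop the last component (empty when the file ends with a newline).
    [words removeLastObject];
    for (NSUInteger i = 0; i < [words count]; i++) {
        // UTF8String returns UTF-8 bytes; a non-ASCII character such as 'ä' spans more than one byte.
        const char *tmpConstChars = [[words objectAtIndex:i] UTF8String];
        [self addWordToTree:tmpConstChars];
    }
}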
Unless I am mistaken, the UTF8String method returns the UTF-8 encoded bytes of the string. For zählen, these are: … where <195, 164> is the UTF-8 encoding sequence for ä. Thus, when you poke into tmpChars+2, you get the byte with value 164 back, which lies outside the ASCII range and is only the second half of the ä sequence. That is probably not what you want. Aren't you more after unichars? There's a characterAtIndex: method that returns those, albeit one at a time.
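The following is a minimal sketch of the difference described above, not the answerer's original code. It assumes a Foundation command-line program and uses the sample word zählen, written with a \u escape so the result does not depend on the source file's encoding. It prints the UTF-8 bytes from UTF8String, then the UTF-16 code units from characterAtIndex:. The last step is an assumption about the question's '√§': it decodes the two bytes of ä as Mac OS Roman, which is one plausible way a debugger or console could end up displaying them that way.

```objective-c
#import <Foundation/Foundation.h>
#include <stdio.h>

int main(void) {
    @autoreleasepool {
        // "zählen" with a precomposed ä (U+00E4).
        NSString *word = @"z\u00e4hlen";

        // UTF8String: one byte per ASCII letter, two bytes (195, 164) for ä.
        const unsigned char *bytes = (const unsigned char *)[word UTF8String];
        for (const unsigned char *p = bytes; *p != '\0'; p++) {
            printf("%u ", (unsigned)*p);
        }
        printf("\n"); // 122 195 164 104 108 101 110

        // characterAtIndex: one UTF-16 code unit per index; ä is the single unit U+00E4.
        for (NSUInteger i = 0; i < [word length]; i++) {
            unichar c = [word characterAtIndex:i];
            printf("U+%04X ", (unsigned)c);
        }
        printf("\n"); // U+007A U+00E4 U+0068 U+006C U+0065 U+006E

        // Decoding the UTF-8 bytes as UTF-8 again gives back the original string.
        NSString *roundTrip = [NSString stringWithUTF8String:(const char *)bytes];
        NSLog(@"round trip equal: %d", [roundTrip isEqualToString:word]);

        // Assumption: interpreting the two ä bytes as Mac OS Roman yields "√§".
        const unsigned char umlaut[] = { 0xC3, 0xA4 };
        NSString *misread = [[NSString alloc] initWithBytes:umlaut
                                                     length:sizeof(umlaut)
                                                   encoding:NSMacOSRomanStringEncoding];
        NSLog(@"%@", misread);
    }
    return 0;
}
```

If this reading is right, the const char * in the question already holds correct UTF-8, and the tree code only has to treat the bytes 195 and 164 as one character rather than two. Working with unichar values is the alternative when each index should correspond to one character of a word like this.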