How do I convert an NSString with umlauts to a const char* in Objective-C?

Posted on 2024-11-15 22:19:45

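I have a problem converting an NSString that contains umlauts to a const char*.

This method parses a text file of words line by line and stores the words as strings in the NSArray *results. Each word is then converted to a const char * (tmpConstChars). That C string shows, for example, an 'ä' as '√§'. How do I convert from NSString to const char * correctly? I thought this was the right way.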

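- (void)inputWordsByFile:(NSString *)path
{
    // Start with nil; the error is only filled in when reading fails.
    NSError *error = nil;
    NSString *content = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:&error];
    if (content == nil) {
        NSLog(@"Could not read %@: %@", path, error);
        return;
    }

    NSArray *results = [content componentsSeparatedByString:@"\n"];

    // Drop the last element, assumed to be the empty string after the file's trailing newline.
    NSMutableArray *words = [[NSMutableArray alloc] initWithArray:results];
    [words removeLastObject];
    for (NSUInteger i = 0; i < [words count]; i++) {
        // UTF8String returns NUL-terminated UTF-8 bytes; an "ä" occupies two of them.
        const char *tmpConstChars = [[words objectAtIndex:i] UTF8String];
        [self addWordToTree:tmpConstChars];
    }
}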

情绪 2024-11-22 22:19:45

Unless I am mistaken, the UTF8String method returns the UTF-8 encoded bytes of the string. For zählen, these are:

$ perl -MEncode -Mutf8 -E 'say join ", ", map ord, split //, encode("utf8", "zählen")'
122, 195, 164, 104, 108, 101, 110

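…where <195, 164> is the UTF-8 encoding sequence for ä. So when you look at tmpConstChars+2, you get the byte with value 164, which is only the second half of that sequence and not a character on its own (164 lies outside the ASCII range). That is probably not what you want. Are you perhaps after unichars instead? There is a characterAtIndex: method that returns them, albeit one at a time: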

NSString *test = @"zählen";
unichar c = [test characterAtIndex:1];
NSLog(@"---> %C", c); // ---> ä
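If the tree has to keep working on the const char * from UTF8String, the two-byte sequence for ä must be treated as one unit. The '√§' from the question is consistent with the bytes 0xC3 0xA4 being displayed in a single-byte encoding such as Mac OS Roman, which suggests the conversion itself is fine and only the byte-by-byte interpretation is off. The following is a minimal sketch in plain C (which compiles unchanged in an Objective-C file) of walking such a string code point by code point. The helper names utf8_sequence_length, utf8_decode and utf8_count_code_points are hypothetical and not part of any Apple API, and the decoder is deliberately simplified: it does not reject overlong encodings or surrogate values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of the UTF-8 sequence that starts with
 * lead byte `b`, or 0 if `b` cannot start a sequence. */
static size_t utf8_sequence_length(unsigned char b) {
    if (b < 0x80) return 1;            /* 0xxxxxxx: plain ASCII  */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx: e.g. "ä"     */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx               */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx               */
    return 0;                          /* continuation or invalid */
}

/* Hypothetical helper: decodes one code point starting at `s` into *out.
 * Returns the number of bytes consumed, or 0 on malformed input.
 * Simplified: overlong forms and surrogates are not rejected. */
static size_t utf8_decode(const char *s, uint32_t *out) {
    const unsigned char *p = (const unsigned char *)s;
    size_t len = utf8_sequence_length(p[0]);
    if (len == 0) return 0;
    if (len == 1) { *out = p[0]; return 1; }

    /* Keep only the payload bits of the lead byte. */
    uint32_t cp = p[0] & (0xFFu >> (len + 1));
    for (size_t i = 1; i < len; i++) {
        /* Every following byte must look like 10xxxxxx; this also
         * stops at a premature NUL terminator. */
        if ((p[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }
    *out = cp;
    return len;
}

/* Hypothetical helper: number of code points in a NUL-terminated
 * UTF-8 string, or -1 if the string is malformed. */
static long utf8_count_code_points(const char *s) {
    long count = 0;
    while (*s != '\0') {
        uint32_t cp;
        size_t used = utf8_decode(s, &cp);
        if (used == 0) return -1;
        s += used;
        count++;
    }
    return count;
}
```

For the bytes of zählen listed above, strlen reports 7 while utf8_count_code_points reports 6, and utf8_decode at offset 1 yields U+00E4 (ä) from two bytes. Calling utf8_decode at offset 2 returns 0, because the byte 164 is a continuation byte rather than the start of a character.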

About the author

一个人的夜不怕黑
