What does Cocoa Touch's canonical NSUnicodeStringEncoding mean?
Is it UTF-16? UTF-32? Something else?
I want to look into this for performance reasons, since I'm converting a lot of strings from UTF-8 to "native" NSString, and the performance penalty seems to land on __CFFromUTF8, which is a built-in conversion function. By the way, I'm only guessing that NSUnicodeStringEncoding is what is used internally, since NSString's fastestEncoding returns that value (that is, for international strings; for plain ANSI strings, MacRoman is returned).
Comments (1)
Testing with dataUsingEncoding: indicates that NSUnicodeStringEncoding is little-endian UTF-16 preceded by a byte order mark (on both the simulator and a real device). Apple's String Programming Guide for Cocoa says "NSString objects are conceptually UTF-16 with platform endianness", so I think it's reasonable to assume UTF-16 is used internally. (The same guide goes on to say "That doesn’t necessarily imply anything about their internal storage mechanism", so Apple is fully reserving the right to change this in the future.)
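
For readers who want to see what "little-endian UTF-16 preceded by a byte order mark" looks like at the byte level, here is a minimal sketch in Python. It does not call any Apple API; it only builds a byte sequence with the layout described above and inspects it. The helper name `describe_utf16` and the sample string are made up for illustration, and the check is deliberately naive (a UTF-32LE BOM also begins with the same two bytes).

```python
import codecs


def describe_utf16(data: bytes) -> str:
    """Report the byte order implied by a leading UTF-16 BOM, if any.

    Hypothetical helper for illustration only; not part of any Apple API.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE with BOM"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16BE with BOM"
    return "no BOM"


# Model the layout reported above: a BOM followed by little-endian code units.
sample = codecs.BOM_UTF16_LE + "hé".encode("utf-16-le")

print(sample.hex(" "))         # ff fe 68 00 e9 00
print(describe_utf16(sample))  # UTF-16LE with BOM
```

The first two bytes are the byte order mark, and each following pair is one 16-bit code unit with the low byte first. Running the same kind of check on the NSData returned by dataUsingEncoding: is how the observation above would be confirmed on a given platform.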