Why use short* instead of char* for strings? What is the difference between char* and unsigned char*?

Posted on 2025-01-05 16:19:40, 471 words, 0 views, 0 comments

As the title says, I have two questions.

Edit: To clarify, they don't actually use char and short directly; they guarantee 8-bit and 16-bit widths through specific typedefs. The actual types are called UInt8 and UInt16.

1. Question

The iTunes SDK uses unsigned short* where a string is needed. What are the advantages of using it instead of char*/unsigned char*? How do I convert it to char*, and what is different when working with this type?

2. Question

So far I have only seen char* used when a string must be stored. When should I use unsigned char*, or does it make no difference?

Comments (3)

梦里南柯 2025-01-12 16:19:40

unsigned short arrays can hold wide character strings, for instance UTF-16 encoded text, although I'd expect to see wchar_t in those cases. But they may have their reasons, such as compatibility between macOS and Windows. (If my sources are right, wchar_t is 32 bits on macOS and 16 bits on Windows.)

You convert between the two kinds of string by calling an appropriate library function; which one is appropriate depends on the situation. Doesn't the SDK come with one?
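If no library routine is at hand, the conversion can be written by hand. Below is a minimal sketch, assuming the 16-bit string is NUL-terminated UTF-16 in native byte order and the target is UTF-8. The function name `utf16_to_utf8` is made up for illustration and is not part of the iTunes SDK, and `uint16_t` stands in for the SDK's UInt16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an SDK function): converts a NUL-terminated
   UTF-16 string in native byte order to UTF-8.
   Returns the number of bytes written, excluding the terminator,
   or (size_t)-1 if dst is too small.
   Unpaired surrogates are replaced with U+FFFD. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);

    while (*src != 0) {
        uint32_t cp = *src++;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (*src >= 0xDC00 && *src <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src - 0xDC00);
                src++;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* stray low surrogate */
        }

        /* Number of UTF-8 bytes needed for this code point. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need + 1 > dst_size)
            return (size_t)-1; /* no room for the bytes plus terminator */

        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    if (out + 1 > dst_size)
        return (size_t)-1; /* only reachable for an empty input */
    dst[out] = '\0';
    return out;
}
```

A UTF-8 result needs at most three bytes per 16-bit unit plus one for the terminator, so a buffer of `3 * length + 1` bytes is always large enough. If the target is a legacy 8-bit code page rather than UTF-8, a platform conversion routine is the safer choice.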

As for char versus unsigned char: strings have historically been declared with char, so switching to unsigned char would introduce incompatibilities.
(Switching to signed char would also cause incompatibilities, but somehow not as many...)

Edit: The question was edited while I was typing my answer, so I did not see the clarification. But yes, for the reason above, UInt16 is a better representation of a 16-bit entity than wchar_t.

挖鼻大婶 2025-01-12 16:19:40

1. Question - Answer

I would suppose that they use unsigned short* because they store Unicode text as UTF-16, which can represent characters both inside and outside the BMP (the latter as surrogate pairs). The rest of your question depends on the Unicode encodings of the source and the destination (UTF-8, UTF-16, or UTF-32).
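To make the "inside and outside the BMP" point concrete, here is a small sketch of how one code point maps to 16-bit units. `utf16_encode` is a hypothetical name used only for illustration, and `uint16_t` stands in for UInt16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the UTF-16 encoding of code point cp into
   out[0..1] and returns the number of 16-bit units used (1 or 2),
   or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values beyond U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    if (cp < 0x10000) {
        /* Inside the BMP: one unit holds the code point directly. */
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Outside the BMP: split the remaining 20 bits across a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}
```

One practical consequence is that the number of 16-bit units in such a string is not necessarily the number of characters.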

2. Question - Answer

Again, it depends on the encoding and on which strings you are talking about. You should never use signed or unsigned chars if you plan to deal with strings containing characters outside the extended ASCII table (that is, any language other than English).

甜警司 2025-01-12 16:19:40

  1. Probably a harebrained attempt to use UTF-16 strings. C has a wide character type, wchar_t, which can be 16 bits wide. I'm not familiar enough with the SDK to say exactly why they went this route, but it's probably to work around compiler issues. C99 has the more suitable [u]int[least/fast]16_t types; see <stdint.h>.

    Note that C makes very few guarantees about data types and their sizes. Signed and unsigned shorts aren't guaranteed to be exactly 16 bits (only at least 16), nor are chars restricted to 8 bits or wide chars to 16 or 32.

    To convert between char and short strings, you'd use the conversion functions provided by the SDK. You could also write your own or use a third-party library, provided you know exactly what is stored in the short strings and what you want in the char strings.

  2. It doesn't really make a difference. You'd normally convert to unsigned char if you wanted to do (unsigned) arithmetic or bit manipulation on a character.
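As an illustration of the bit-manipulation case in point 2, here is a minimal sketch of a hex encoder. `hex_encode` is a made-up name. The cast to unsigned char matters on platforms where plain char is signed, which the language leaves implementation-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes two lowercase hex digits per input byte
   plus a terminating NUL. out must hold at least 2 * strlen(s) + 1 bytes. */
void hex_encode(const char *s, char *out)
{
    static const char digits[] = "0123456789abcdef";
    assert(s != NULL && out != NULL);

    for (; *s != '\0'; s++) {
        /* Without the cast, a byte such as 0xE9 may be negative where
           plain char is signed, and the shifted value would index
           outside the digits table. */
        unsigned char b = (unsigned char)*s;
        *out++ = digits[b >> 4];
        *out++ = digits[b & 0x0F];
    }
    *out = '\0';
}
```

The same cast is the usual idiom before passing a char to the `<ctype.h>` functions, which expect a value representable as unsigned char (or EOF).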

Edit: I wrote (or at least started writing) this answer before you told us they use UInt16 rather than unsigned short. In that case nothing harebrained is involved; the proprietary type is probably there to store UTF-16 data while staying compatible with older (or noncompliant) compilers that lack the stdint types, which is perfectly reasonable.
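For what it's worth, this is roughly how such typedefs can be pinned down without `<stdint.h>`. It is only a sketch: `MyUInt8`, `MyUInt16` and `my_u16len` are stand-in names, and the real SDK headers define their own UInt8 and UInt16 in whatever way suits each supported compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for SDK-style typedefs (assumed, not taken
   from the SDK headers). */
typedef unsigned char  MyUInt8;
typedef unsigned short MyUInt16;

/* C89-compatible compile-time checks: the array size becomes -1, and
   the build fails, if a type does not have the expected width.
   This assumes the types have no padding bits. */
typedef char my_uint8_is_8_bits[(sizeof(MyUInt8) * CHAR_BIT == 8) ? 1 : -1];
typedef char my_uint16_is_16_bits[(sizeof(MyUInt16) * CHAR_BIT == 16) ? 1 : -1];

/* Length of a NUL-terminated string of 16-bit units, counted in units
   (the 16-bit counterpart of strlen). */
size_t my_u16len(const MyUInt16 *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}
```

On a platform where short is not 16 bits, the second check stops the build instead of letting the code silently misread string data.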
