What is the default character set/encoding for SMS on Android devices?

Posted on 2024-10-09 10:58:28

To keep it simple if necessary, I am primarily concerned with English-language handsets in North America.

Specifically, when sending/receiving SMS and MMS messages, how are the characters encoded? Is there a difference between the two?

My initial research suggested that UTF-8 was the default, but I have also seen references to US-ASCII for US devices and other charsets for other locales.

Comments (2)

谁对谁错谁最难过 2024-10-16 10:58:28

Quote:
The platform's default charset is UTF-8. (This is in contrast to some older implementations, where the default charset depended on the user's locale.)

More information can be found here:
Charset | Android Developers

一生独一 2024-10-16 10:58:28

The short answer for the US: GSM 03.38, or UTF-16BE if the message contains emoji or other text that GSM 03.38 cannot encode directly.

When sending/receiving SMS, the encoding is definitely not UTF-8, since UTF-8 is not supported by the PDU format or by the SMPP protocol. Search for the SMPP spec for clarification on what is supported. Of all the supported encodings, the only Unicode-compatible option is UCS-2BE. My observation is that most phones (including all Android devices and iPhones) simply treat this as UTF-16BE, because that allows the complete Unicode character set (including things like emoji).
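To make the UCS-2 versus UTF-16BE point concrete, here is a minimal Python sketch. The helper names `ucs2_user_data` and `code_units` are hypothetical, and the sketch only shows the byte layout of the text itself, not a full PDU. It assumes the usual 140 bytes of user data per single message, which gives 70 16-bit code units.

```python
def ucs2_user_data(text: str) -> bytes:
    """Encode text as big-endian UTF-16, the layout phones place in a 'UCS-2' SMS."""
    return text.encode("utf-16-be")


def code_units(text: str) -> int:
    """Count 16-bit code units; characters outside the BMP take two (a surrogate pair)."""
    return len(ucs2_user_data(text)) // 2


# Assumption: 140 bytes of user data per single SMS, 2 bytes per code unit.
MAX_UNITS_SINGLE_SMS = 140 // 2

for sample in ("Hi", "\u00e9", "\U0001F600"):  # "Hi", e-acute, grinning-face emoji
    data = ucs2_user_data(sample)
    print(ascii(sample), data.hex(), code_units(sample), "code unit(s)")
```

Strict UCS-2 has no way to represent U+1F600, while UTF-16BE encodes it as the surrogate pair D83D DE00. An emoji therefore consumes two of the 70 available code units rather than one.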

SMS also has a mandatory default encoding under the GSM 03.38 specification, which is based on septets (7-bit code units). It allows up to 160 characters per PDU, although, as with many encodings, not every character takes a single code unit.
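The 160-character figure follows from the arithmetic: 160 septets × 7 bits = 1120 bits = 140 bytes. Below is a rough Python sketch of the GSM 03.38 default alphabet, its extension table (characters that cost two septets), and septet packing. The names `gsm7_septets` and `pack_septets` are hypothetical, and national language shift tables are deliberately left out.

```python
# GSM 03.38 default alphabet: the string index is the 7-bit code.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Extension table: each character is sent as ESC (0x1B) plus the code below.
GSM7_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}


def gsm7_septets(text):
    """Return the list of septets for text, or None if GSM 03.38 cannot encode it."""
    septets = []
    for ch in text:
        if ch in GSM7_EXT:
            septets += [0x1B, GSM7_EXT[ch]]  # escape pair: two septets
        else:
            code = GSM7_BASIC.find(ch)
            if code < 0:
                return None  # e.g. emoji: the sender must fall back to UCS-2/UTF-16BE
            septets.append(code)
    return septets


def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    out, bits, nbits = bytearray(), 0, 0
    for s in septets:
        bits |= s << nbits
        nbits += 7
        while nbits >= 8:
            out.append(bits & 0xFF)
            bits >>= 8
            nbits -= 8
    if nbits:
        out.append(bits & 0xFF)  # final partial octet, zero-padded
    return bytes(out)


print(pack_septets(gsm7_septets("hello")).hex())  # e8329bfd06
print(len(gsm7_septets("€")))                     # 2 septets for one character
print(gsm7_septets("\U0001F600"))                 # None: not encodable
```

A message of 160 basic characters packs into exactly 140 bytes. Every extension-table character such as `€` or `[` reduces the remaining budget by two septets instead of one.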

MMS is another animal entirely, and it is not well supported outside North America. For MMS, the following encodings are available (big-endian, i.e. network byte order, is assumed when not specified):

  • US-ASCII
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • SHIFT-JIS
  • UTF-8
  • BIG5
  • UCS2
  • UTF-16

MMS, however, is not typically used unless you send a very long message (longer than 4 PDUs, which is 560 bytes on Android) or the message includes an image or other content that cannot be encoded as a plain SMS.
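The 560-byte figure is simply 4 PDUs × 140 bytes of user data. As a rough illustration, the Python sketch below estimates how many PDUs a message needs. It assumes the common concatenation header, which leaves 153 septets or 67 UCS-2 code units per part. The `segment_count` helper is hypothetical, and the 4-PDU limit is taken from the answer above rather than from any Android API.

```python
import math

# Capacity per PDU in code units (septets for GSM 03.38, 16-bit units for UCS-2).
SINGLE = {"gsm7": 160, "ucs2": 70}
# Assumption: each part of a concatenated message carries a 6-byte user data header.
MULTI = {"gsm7": 153, "ucs2": 67}

MAX_SMS_PARTS = 4  # threshold quoted in the answer; treated here as an assumption


def segment_count(units: int, encoding: str) -> int:
    """Estimate the number of PDUs needed for a message of `units` code units."""
    if units <= SINGLE[encoding]:
        return 1
    return math.ceil(units / MULTI[encoding])


def would_fall_back_to_mms(units: int, encoding: str) -> bool:
    """True when the message would need more parts than the assumed limit."""
    return segment_count(units, encoding) > MAX_SMS_PARTS


print(segment_count(160, "gsm7"), segment_count(161, "gsm7"))
print(would_fall_back_to_mms(612, "gsm7"), would_fall_back_to_mms(613, "gsm7"))
```

This is only an estimate: a real splitter must not separate an escape pair or a surrogate pair across two parts, which can push a borderline message into one more segment.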

It is also worth noting that MMS is much slower than SMS because it uses the SMTP protocol with special addressing (not based on DNS) and special multipart content types (see MM4 for details).
