Do 7bit and 8bit encoded messages have to be decoded before output?

Posted 2024-11-26 20:07:50

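What is the possible relationship between the 7bit transfer encoding and UTF-7, and between 8bit and UTF-8?

Does it make sense to manually convert the message body to the expected encoding (assume 'utf-8'), as in the code below?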

    function decodeBody($body, $transferEncoding, $bodyEncoding) {
        switch ($transferEncoding) {
            case '7BIT':
            case '8BIT':
                // any additional decoding here ?
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;

            case 'BASE64':
                $body = base64_decode($body);
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;

            case 'QUOTED_PRINTABLE':
                $body = quoted_printable_decode($body);
                $body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);
                break;
        }

        return $body;
    }

不打扰别人 2024-12-03 20:07:50

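Quoting RFC 1341:

The values "8bit", "7bit", and "binary" all imply that NO encoding has been performed...
"8bit" means that the lines are short, but there may be non-ASCII characters (octets with the high-order bit set).

This means that 7bit is plain ASCII, which is already valid UTF-8, so you do not need to convert it at all (no need for mb_convert_encoding() in this case). "8bit" means there may be non-ASCII characters, but as far as I understand, the charset is not necessarily UTF-8; it may be ISO-8859-1 or something else. So, as far as I know, "8bit" does not automatically mean UTF-8.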
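
To make the distinction concrete, here is a minimal sketch that treats the two layers separately: the Content-Transfer-Encoding is undone first, and the charset conversion is a second, independent step. The function names `undoTransferEncoding` and `toUtf8` are hypothetical, the transfer-encoding labels follow the question's spelling, and the charset is assumed to come from the part's Content-Type header.

```php
<?php
// Step 1: undo the Content-Transfer-Encoding. 7bit, 8bit and binary mean
// "no encoding was applied", so those bytes are returned unchanged.
// (Hypothetical helper, not part of any library.)
function undoTransferEncoding(string $body, string $transferEncoding): string
{
    switch (strtoupper($transferEncoding)) {
        case 'BASE64':
            return base64_decode($body);
        case 'QUOTED_PRINTABLE':
        case 'QUOTED-PRINTABLE':
            return quoted_printable_decode($body);
        default: // 7BIT, 8BIT, BINARY
            return $body;
    }
}

// Step 2: convert from the declared charset to UTF-8. This depends only on
// the charset, never on the transfer encoding.
// (Hypothetical helper; $charset is assumed to be taken from Content-Type.)
function toUtf8(string $bytes, string $charset): string
{
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}

// An "8bit" body whose charset is ISO-8859-1: byte 0xE9 is "é" there.
echo toUtf8(undoTransferEncoding("caf\xE9", '8BIT'), 'ISO-8859-1'), "\n";

// The same text sent as quoted-printable yields the same UTF-8 string.
echo toUtf8(undoTransferEncoding('caf=E9', 'QUOTED_PRINTABLE'), 'ISO-8859-1'), "\n";
```

Both calls print `café`: the transfer encoding only changes how the bytes travel, while the charset decides what those bytes mean.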


夕嗳→ 2024-12-03 20:07:50

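No, no decoding is needed.

The original SMTP protocol (RFC 821, with the message format defined in RFC 822) was designed to transfer messages in 7-bit format (plain US-ASCII).

So any message with the 8th bit set to 1 has to be encoded somehow.

7bit means no encoding is needed at all.

8bit means the message is 8-bit clean and the SMTP servers did not alter it. It does not need decoding on the recipient's end.
But if you want to send the message over SMTP, you have to encode it as '7bit' or 'quoted-printable' (so that the 8th bit is not used).

So you don't need this line at all:

$body = mb_convert_encoding($body, 'utf-8', $bodyEncoding);

However, after undoing the transfer encoding, some strings in the message body or even the headers may still be MIME-encoded according to RFC 2047, so if the data is text rather than a blob (not an attachment file), pass the string to this function: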

imap_utf8()
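
As an illustration of that last step, the sketch below decodes an RFC 2047 encoded-word in a header value. It deliberately uses `iconv_mime_decode()` from PHP's iconv extension rather than `imap_utf8()`, on the assumption that the IMAP extension may not be installed; the sample Subject value is made up for the example.

```php
<?php
// A header value containing an RFC 2047 encoded-word, which has the form
// =?charset?encoding?encoded-text?=  (B = base64, Q = quoted-printable-like).
// The sample value is invented for this example.
$rawSubject = 'Subject: =?UTF-8?B?UHLDvGZ1bmc=?=';

// Decode every encoded-word in the string and return the result as UTF-8.
$subject = iconv_mime_decode($rawSubject, 0, 'UTF-8');

echo $subject, "\n";
```

This prints `Subject: Prüfung`. Where the IMAP extension is available, `imap_utf8($rawSubject)` is the call the answer refers to.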
