Converting to UTF-8 from email

Posted on 2024-12-23 11:50:23

I'm using Zend_Mail_Storage_Pop3 to retrieve mail messages.

The subject of one mail is
Foo/æøå

$message->getHeader('content-type')
gives me text/plain; charset=ISO-8859-1; format=flowed

Before any conversion, my $message->subject looks like this:

Foo/µ°Õ - 2h - comment

Then I try running iconv on the subject:

$message->subject = iconv('ISO-8859-1','UTF-8', $message->subject);

Now my subject looks like this

Foo/├ª├©├Ñ - 2h - comment

Which is not UTF-8 :)

So what should I do?
I also tried utf8_encode and mb_convert_encoding,
but these give the same result.
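
A terminal or viewer that is not set to UTF-8 can make correctly converted text look broken, so it may help to inspect the bytes rather than the rendered characters. A minimal sketch, assuming the subject arrives as raw ISO-8859-1 bytes (the `$latin1` literal is a stand-in for `$message->subject`, not something read from a real mailbox):

```php
<?php
// Stand-in for $message->subject: "Foo/" followed by æ ø å as ISO-8859-1 bytes.
$latin1 = "Foo/\xE6\xF8\xE5";

$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// In UTF-8, each of æ, ø, å becomes a two-byte sequence starting with 0xC3.
$hex = bin2hex($utf8);
$isValidUtf8 = mb_check_encoding($utf8, 'UTF-8');

echo $hex, "\n";
var_dump($isValidUtf8);
```

If the hex dump shows `c3a6c3b8c3a5` after the bytes for `Foo/`, the string is valid UTF-8, and odd glyphs would then come from the display side, for example a console using a DOS code page such as CP850, where the byte pair C3 A6 renders as ├ª.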

Well, I got it. It's a bit messy, but it works:

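$this->mails = new Zend_Mail_Storage_Pop3(...);
$currentMessageId = $this->mails->getNumberByUniqueId($this->mails->getUniqueId($messageId));
$raw = $this->mails->getRawHeader($currentMessageId);

// Find the raw (still MIME-encoded) Subject line in the header block.
$subject = '';
foreach (explode("\n", $raw) as $line) {
    if (strpos($line, 'Subject: ') === 0) {
        // Strip only the leading field name, not later occurrences of "Subject: ".
        $subject = trim(substr($line, strlen('Subject: ')));
        break;
    }
}

// Decode the encoded-word; "_" stands for a space in Q-encoding and may be left undecoded.
$subject = str_replace("_", " ", mb_decode_mimeheader($subject));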
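
The loop above only sees the first physical line of the Subject header, so a long subject folded across several lines would be cut off. A possible variant, sketched under assumptions: `decodeSubject()` is a hypothetical helper (not part of Zend_Mail), the sample header block is made up for illustration, and `iconv_mime_decode()` is swapped in for `mb_decode_mimeheader()` because it takes the target charset as an argument.

```php
<?php
/**
 * Hypothetical helper: pull the Subject out of a raw header block and
 * decode any MIME encoded-words to UTF-8. Returns null if nothing usable is found.
 */
function decodeSubject(string $rawHeaders): ?string
{
    // Unfold continuation lines: a line break followed by a space or tab
    // continues the previous header field.
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);

    // Header field names are case-insensitive; match at the start of a line.
    if (!preg_match('/^Subject:[ \t]*(.*)$/mi', $unfolded, $m)) {
        return null;
    }

    $decoded = iconv_mime_decode(trim($m[1]), ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    return $decoded === false ? null : $decoded;
}

// Made-up header block with a folded Subject line.
$raw = "From: someone@example.com\r\n"
     . "Subject: =?ISO-8859-1?Q?Foo/=E6=F8=E5?= - 2h -\r\n"
     . " comment\r\n"
     . "Content-Type: text/plain; charset=ISO-8859-1; format=flowed\r\n";

$subject = decodeSubject($raw);
echo $subject, "\n";
```

With Zend_Mail this would be called as `decodeSubject($this->mails->getRawHeader($currentMessageId))`, assuming the raw header block is returned unmodified.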

Comments (1)

盗心人 2024-12-30 11:50:23

The Content-Type field usually holds the encoding for the message body, not for the headers. Can you have a look at the message in its raw format? A header field in ISO 8859-1 should look like this:

=?ISO-8859-1?Q?Graphgr=F6=DFen?=

while a UTF-8 encoded header should look like this:

=?UTF-8?B?w5xtbMOkdXRlIGluIFVURjg=?=
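
Both forms are MIME encoded-words: the charset comes first, then `Q` (quoted-printable style) or `B` (Base64), then the encoded text. A short sketch of decoding the two samples above to UTF-8 with PHP's `iconv_mime_decode()`; the variable names are illustrative only.

```php
<?php
// The two sample headers quoted above.
$latin1Header = '=?ISO-8859-1?Q?Graphgr=F6=DFen?=';
$utf8Header   = '=?UTF-8?B?w5xtbMOkdXRlIGluIFVURjg=?=';

// The third argument is the charset the decoded result should be returned in.
$fromLatin1 = iconv_mime_decode($latin1Header, 0, 'UTF-8');
$fromUtf8   = iconv_mime_decode($utf8Header, 0, 'UTF-8');

echo $fromLatin1, "\n";
echo $fromUtf8, "\n";
```

Because the charset is declared inside each encoded-word, the decoder does not need the message's Content-Type to produce the right characters.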