Accented words in email subject swallow the following space - how to stop this?

Posted on 2024-08-02 08:49:32

We have a custom PHP email marketing app, and an interesting problem:
if the subject line of a message contains a word with accents, the space between that word and the following one is 'swallowed'.
An example: the phrase

Ángel Ríos escucha y sorprende

is shown (by at least Gmail and Lotus Notes) as

ÁngelRíos escucha y sorprende

The particular line in the message source shows:

Subject: =?ISO-8859-1?Q?=C1ngel?= =?ISO-8859-1?Q?R=EDos?= escucha y sorprende

(semi-full headers):

Delivered-To: [email protected]
Received: {elided}
Return-Path: <return@path>
Received: {elided}
Received: (qmail 23734 invoked by uid 48); 18 Aug 2009 13:51:14 -0000
Date: 18 Aug 2009 13:51:14 -0000
To: "Adriano" <[email protected]>
Subject: =?ISO-8859-1?Q?=C1ngel?= =?ISO-8859-1?Q?R=EDos?= escucha y sorprende
MIME-Version: 1.0
From: {elided}
X-Mailer: PHP
X-Lista: 1290
X-ID: 48163
Content-Type: text/html; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Message-ID: <[email protected]>

EDIT:

The app uses an old version of Html Mime Mail to prepare messages; I'll try to upgrade to a newer version. Anyway, this is the function that encodes the subject:

/**
 * Function to encode a header if necessary
 * according to RFC2047
 */
function _encodeHeader($input, $charset = 'ISO-8859-1')
{
    preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
    foreach ($matches[1] as $value) {
        $replacement = preg_replace('/([\x80-\xFF])/e', '"=" . strtoupper(dechex(ord("\1")))', $value);
        $input = str_replace($value, '=?' . $charset . '?Q?' . $replacement . '?=', $input);
    }

    return $input;
}

And here is the code where the subject is encoded:

if (!empty($this->headers['Subject'])) {
    $subject = $this->_encodeHeader($this->headers['Subject'],
                                    $this->build_params['head_charset']);
    unset($this->headers['Subject']);
}

Wrap-up

The problem was that, indeed, the program wasn't encoding the space in the case mentioned. The accepted answer solved my problem, after a slight modification (mentioned in the comments to that answer) because the installed version of PHP didn't support a particular implementation detail.

Final answer

Although the accepted answer did solve the problem, we found that, when sending many thousands of emails, it was chewing up all the available memory on the server. I checked the website of the original developer of this email framework and found that the function had been updated to the following:

function _encodeHeader($input, $charset = 'ISO-8859-1')
{
    preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
    foreach ($matches[1] as $value) {
        $replacement = preg_replace('/([\x80-\xFF])/e', '"=" . strtoupper(dechex(ord("\1")))', $value);
        $input = str_replace($value, $replacement, $input);
    }
    if (!empty($matches[1])) {
        $input = str_replace(' ', '=20', $input);
        $input = '=?' . $charset . '?Q?' . $input . '?=';
    }

    return $input;
}

which neatly solved the problem and stayed under the memory limit.
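
For readers on newer PHP: the `/e` modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0. The following is only a minimal sketch of the same "encode the whole header as one encoded-word" idea using `preg_replace_callback`. The name `encodeHeaderQ` is hypothetical (it is not part of Html Mime Mail), and the conservative safe-character set is an assumption chosen so that spaces, `=`, `?` and `_` are all escaped in a single pass.

```php
<?php
/**
 * Sketch: Q-encode an entire header value as a single RFC 2047 encoded-word.
 * encodeHeaderQ is a hypothetical name, not part of Html Mime Mail.
 */
function encodeHeaderQ(string $input, string $charset = 'ISO-8859-1'): string
{
    // Pure 7-bit input needs no encoding at all.
    if (!preg_match('/[\x80-\xFF]/', $input)) {
        return $input;
    }

    // Escape every byte outside a conservative safe set as =XX.
    // This covers spaces, "=", "?" and "_" as well as the 8-bit bytes.
    $encoded = preg_replace_callback(
        '/[^A-Za-z0-9!*+\/-]/',
        static function (array $m): string {
            return sprintf('=%02X', ord($m[0]));
        },
        $input
    );

    return '=?' . $charset . '?Q?' . $encoded . '?=';
}

$subject = "\xC1ngel R\xEDos escucha y sorprende"; // ISO-8859-1 bytes
echo encodeHeaderQ($subject), "\n";
// prints =?ISO-8859-1?Q?=C1ngel=20R=EDos=20escucha=20y=20sorprende?=
```

Because literal `=` is escaped in the same pass as everything else, there is no ordering problem between escaping steps. The sketch does not split long subjects: RFC 2047 limits a single encoded-word to 75 characters, so longer headers would need to be broken into several encoded-words.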

Comments (4)

蒗幽 2024-08-09 08:49:32

You need to encode the space in between as well (see RFC 2047):

(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)     (ab)

White space between adjacent 'encoded-word's is not displayed.

[…]

(=?ISO-8859-1?Q?a_b?=)                      (a b)

In order to cause a SPACE to be displayed within a portion of encoded text, the SPACE MUST be encoded as part of the 'encoded-word'.

(=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=)    (a b)

In order to cause a SPACE to be displayed between two strings of encoded text, the SPACE MAY be encoded as part of one of the 'encoded-word's.

So this should do it:

Subject: =?ISO-8859-1?Q?=C1ngel=20R=EDos?= escucha y sorprende
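
To see the rule quoted above in action, here is a rough sketch of what a mail client does when it displays such a header. `decodeQHeader` is a hypothetical helper written for illustration only: it handles Q-encoded words, ignores B encoding, and does no charset conversion, so the result stays in the bytes of the declared charset.

```php
<?php
/**
 * Sketch of the RFC 2047 display rule for Q-encoded words only.
 * decodeQHeader is a hypothetical helper; B encoding and charset
 * conversion are deliberately left out.
 */
function decodeQHeader(string $header): string
{
    $word = '=\?[^?\s]+\?[Qq]\?[^?\s]*\?=';

    // Whitespace between two adjacent encoded-words is dropped.
    $header = preg_replace('/(' . $word . ')\s+(?=' . $word . ')/', '$1', $header);

    // Decode each encoded-word: "_" means space, "=XX" is a hex byte.
    return preg_replace_callback(
        '/=\?[^?\s]+\?[Qq]\?([^?\s]*)\?=/',
        static function (array $m): string {
            return quoted_printable_decode(str_replace('_', ' ', $m[1]));
        },
        $header
    );
}

echo decodeQHeader('=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?='), "\n";   // ab
echo decodeQHeader('=?ISO-8859-1?Q?a_b?='), "\n";                    // a b
echo decodeQHeader('=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?='), "\n";  // a b
```

Fed the original subject line, this sketch drops the space between `=C1ngel` and `R=EDos` but keeps the one before `escucha`, because only whitespace that sits between two encoded-words is discarded.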

Edit: Try this function:

function _encodeHeader($str, $charset='ISO-8859-1')
{
    $words = preg_split('/(\s+)/', $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
    // Closure instead of create_function(), which was removed in PHP 8.0.
    $func = function ($match) {
        return $match[0] === ' ' ? '_' : sprintf('=%02X', ord($match[0]));
    };
    $encoded = false;
    foreach ($words as $key => &$word) {
        if (!ctype_space($word)) {
            $tmp = preg_replace_callback('/[^\x21-\x3C\x3E-\x5E\x60-\x7E]/', $func, $word);
            if ($tmp !== $word) {
                if (!$encoded) {
                    $word = '=?'.$charset.'?Q?'.$tmp;
                } else {
                    $word = $tmp;
                    if ($key > 0) {
                        $words[$key-1] = preg_replace_callback('/[^\x21-\x3C\x3E-\x5E\x60-\x7E]/', $func, $words[$key-1]);
                    }
                }
                $encoded = true;
            } else {
                if ($encoded) {
                    $words[$key-2] .= '?=';
                }
                $encoded = false;
            }
        }
    }
    if ($encoded) {
        $words[$key] .= '?=';
    }
    return implode('', $words);
}

旧伤还要旧人安 2024-08-09 08:49:32

Add

$input = str_replace('?', '=3F', $input);

in this fragment:

if (!empty($matches[1])) {
    $input = str_replace('?', '=3F', $input);
    $input = str_replace(' ', '=20', $input);
    $input = '=?' . $charset . '?Q?' . $input . '?=';
}

镜花水月 2024-08-09 08:49:32

Look up mbstring and UTF-8 conversion. Many of the special characters in non-English languages are covered by the UTF-8 character set.

Converting your subject string to UTF-8 and ensuring that the email is sent as such should render the subject lines correctly.

At least it did for us when we had a similar issue sending email.
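
As an illustration of this suggestion, a minimal sketch might look like the following. It assumes the mbstring extension is loaded and that the stored subject is ISO-8859-1; `encodeSubjectUtf8` is a hypothetical name, and Base64 ("B") encoding is one possible choice rather than something this answer prescribes.

```php
<?php
/**
 * Sketch: convert a Latin-1 subject to UTF-8 and wrap it as one
 * Base64 ("B") encoded-word. encodeSubjectUtf8 is a hypothetical name.
 * Assumes the mbstring extension is available.
 */
function encodeSubjectUtf8(string $subject, string $fromCharset = 'ISO-8859-1'): string
{
    // Re-encode the raw bytes as UTF-8.
    $utf8 = mb_convert_encoding($subject, 'UTF-8', $fromCharset);

    // The whole subject, spaces included, goes inside a single encoded-word.
    return '=?UTF-8?B?' . base64_encode($utf8) . '?=';
}

$header = 'Subject: ' . encodeSubjectUtf8("\xC1ngel R\xEDos escucha y sorprende");
echo $header, "\n";
```

Since the spaces travel inside the Base64 payload, there is no whitespace between encoded-words for a client to drop. An encoded-word names its own charset, so the header can be UTF-8 even if the body keeps a different `Content-Type` charset. As written, the sketch does not split subjects that exceed the 75-character encoded-word limit of RFC 2047.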

神爱温柔 2024-08-09 08:49:32

It would appear you'd be better off sending Subject: =?ISO-8859-1?Q?=C1ngel R=EDos escucha y sorprende?=, as the problem appears near the ?= that ends the encoding.
