Can email addresses contain international (non-English) characters?

Posted 2024-07-17 10:00:39

If it is possible, should I accept such addresses from users, and what problems should I expect when sending mail to them?

Comments (7)

抚笙 2024-07-24 10:00:39

Officially, per RFC 6532: yes.

For a quick explanation, see the Wikipedia article on the subject.

相权↑美人 2024-07-24 10:00:39

Update 2015: Use RFC 6532

The experimental RFC 5335 has been obsoleted by RFC 6532, and the latter is marked "Category: Standards Track", making it the standard.

Section 3.2 (Syntax Extensions to RFC 5322) updates most text fields to allow (proper) UTF-8.

The following rules extend the ABNF syntax defined in [RFC5322] and
[RFC5234] in order to allow UTF-8 content.

VCHAR   =/  UTF8-non-ascii
ctext   =/  UTF8-non-ascii
atext   =/  UTF8-non-ascii
qtext   =/  UTF8-non-ascii
text    =/  UTF8-non-ascii
             ; note that this upgrades the body to UTF-8
dtext   =/  UTF8-non-ascii

The preceding changes mean that the following constructs now
allow UTF-8:
   1.  Unstructured text, used in header fields like
       "Subject:" or "Content-description:".
   2.  Any construct that uses atoms, including but not limited
       to the local parts of addresses and Message-IDs. This
       includes addresses in the "for" clauses of "Received:"
       header fields.
   3.  Quoted strings.
   4.  Domains.

Note that header field names are not on this list; these are still
restricted to ASCII.

Note the explicit inclusion of domains and the explicit exclusion of header field names.
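
To make the effect of `atext =/ UTF8-non-ascii` concrete, here is a minimal sketch of a local-part check. It assumes the unquoted dot-atom form only (quoted strings and comments are not handled), and the function names `is_atext` and `is_dot_atom` are made up for this illustration.

```python
import string

# ASCII atext characters as listed in RFC 5322, section 3.2.3.
ASCII_ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_atext(ch: str) -> bool:
    """True for ASCII atext or, following RFC 6532, any non-ASCII character."""
    return ch in ASCII_ATEXT or ord(ch) > 0x7F


def is_dot_atom(local_part: str) -> bool:
    """Check the unquoted form: non-empty atoms separated by single dots."""
    atoms = local_part.split(".")
    return all(atom and all(is_atext(ch) for ch in atom) for atom in atoms)


print(is_dot_atom("zoë"))          # non-ASCII letters are accepted
print(is_dot_atom("müller.hans"))  # dots may separate atoms
print(is_dot_atom("a..b"))         # an empty atom is rejected
```

This only checks syntax. Whether a given mail server actually accepts such an address is a separate question.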

Also note the remark about NFKC:

The UTF-8 NFKC normalization form SHOULD NOT be used because
it may lose information that is needed to correctly spell
some names in some unusual circumstances.
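
A small illustration of why NFKC can lose information, using Python's standard `unicodedata` module. The helper name `normalize_for_storage` is hypothetical, and using NFC as the stored form is an assumption based on the RFC's recommendation, not something a particular mail system is known to do.

```python
import unicodedata


def normalize_for_storage(address: str) -> str:
    """Apply NFC, which composes characters without folding compatibility forms."""
    return unicodedata.normalize("NFC", address)


# NFC only recombines "e" + combining diaeresis into the single character "ë".
decomposed = "zoe\u0308@example.com"
print(normalize_for_storage(decomposed) == "zo\u00eb@example.com")

# NFKC additionally rewrites compatibility characters, e.g. the "fi" ligature
# U+FB01 becomes the two letters "f" and "i", so the original spelling is lost.
ligature = "\ufb01"
print(unicodedata.normalize("NFKC", ligature) == "fi")
print(unicodedata.normalize("NFC", ligature) == ligature)
```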

And from the start of Section 3:

Also note that messages in this format require the use of the
SMTPUTF8 extension [RFC6531] to be transferred via SMTP.
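
In practice this means checking that the receiving server advertises SMTPUTF8 before sending to a non-ASCII address. The sketch below uses Python's standard `smtplib`. The function names `needs_smtputf8` and `send_international` are made up for this example, and `server` is assumed to be an already connected `smtplib.SMTP` object.

```python
import smtplib


def needs_smtputf8(*addresses: str) -> bool:
    """True if any envelope address contains a non-ASCII character."""
    return any(not address.isascii() for address in addresses)


def send_international(server, sender: str, recipient: str, message: bytes):
    """Send a message, requesting SMTPUTF8 only when the addresses need it."""
    options = []
    if needs_smtputf8(sender, recipient):
        server.ehlo_or_helo_if_needed()
        if not server.has_extn("smtputf8"):
            # Fail early instead of letting the server reject the address.
            raise smtplib.SMTPNotSupportedError(
                "server does not advertise SMTPUTF8"
            )
        options = ["SMTPUTF8", "BODY=8BITMIME"]
    return server.sendmail(sender, [recipient], message, mail_options=options)
```

Raising early is one possible design. An application could instead fall back to an alternative ASCII address if the user has supplied one.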

深居我梦 2024-07-24 10:00:39

The problem is that some mail software (server tools and/or desktop clients) does not support it and throws an "invalid email" exception when you try to send mail to an address that contains, for example, umlauts.

For the widest compatibility, you can convert the domain part of the address to Punycode. Users can still type their addresses the usual way, while you store the form that older software accepts. Note that Punycode applies only to the domain; a non-ASCII local part (the text before the @) cannot be converted this way.

Example: müller.com » xn--mller-kva.com

Both point to the same domain.
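
A minimal sketch of that conversion using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The function name `to_ascii_domain` is made up for this example, and the local part is deliberately left untouched.

```python
def to_ascii_domain(address: str) -> str:
    """Return the address with its domain converted to Punycode (A-label) form."""
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # The codec encodes each label separately and adds the "xn--" prefix
    # only to labels that contain non-ASCII characters.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(to_ascii_domain("info@müller.com"))  # info@xn--mller-kva.com
```

If the local part itself contains non-ASCII characters, the result is still not a pure ASCII address, and delivery depends on SMTPUTF8 support along the way.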

世界和平 2024-07-24 10:00:39

I would assume yes: a number of top-level domains already allow non-ASCII
characters in domain names, and since the domain is part of an email address,
it is perfectly possible. An example of such a domain is www.öko.de

花辞树 2024-07-24 10:00:39

Short answer: yes.

They are allowed not only in the username but also in the domain name.

抱着落日 2024-07-24 10:00:39

The answer is yes, but they need to be encoded specially.

Look at this. Read the part that refers to email headers and RFC 2047.
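
For context, RFC 2047 "encoded words" cover header text such as a display name or subject, not the address itself. A short sketch with Python's standard `email` package follows; the helper names are hypothetical, and the address here is plain ASCII on purpose.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def encode_display_name(name: str, address: str) -> str:
    """Build a From/To value whose display name is RFC 2047 encoded if needed."""
    return formataddr((name, address), charset="utf-8")


def decode_display_name(header_value: str) -> tuple:
    """Recover the readable display name and the address from a header value."""
    raw_name, address = parseaddr(header_value)
    return str(make_header(decode_header(raw_name))), address


encoded = encode_display_name("Zoë", "zoe@example.com")
print(encoded)                       # pure ASCII, name wrapped in =?utf-8?...?=
print(decode_display_name(encoded))  # ('Zoë', 'zoe@example.com')
```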

轮廓§ 2024-07-24 10:00:39

Not yet. The IETF plans to do this:
H-Online article: IETF planning internationalised email addresses, here is the RFC: SMTP Extension for Internationalized Email Addresses

Quote from H-Online (since the article went down):

The Internet Engineering Task Force (IETF) has published three crucial documents for the standardisation of email address headers
that include symbols outside the ASCII character set. This means that
soon you'll be able to use Chinese characters, French accents, and
German umlauts in email addresses as well as just in the body of the
message. So if your name is Zoë and you work for a company that makes
façades, you might be interested in a new email address. But
representatives of providers are already moaning. They say there would
need to be an "upgrade mania" if the Unicode standard UTF-8 is to
replace the American Standard Code for Information Interchange (ASCII)
currently used as the general email language.

RFC 5335 specifies the use of UTF-8 in practically all email headers.
Changes would have to be made to SMTP clients, SMTP servers, mail user
agents (MUAs), software for mailing lists, gateways to other media,
and everywhere else where email is processed or passed along. RFC 5336
expands the SMTP email transport protocol. At the level of the
protocol, the expansion is labelled UTF8SMTP.

A new header field will be added as a sort of "emergency parachute" to
ensure that UTF-8 emails have a soft landing if they are thrown out
before reaching the recipient by systems that have not been upgraded.
The "OldAddress" is a purely ASCII address. But OldAddress is not to
be used as a channel for a second transfer attempt, but rather to make
sure that feedback is sent home.

Finally, RFC5337 ensures that correct messages are sent pertaining to
the delivery status of non-ASCII emails. The correct address of an
unreachable addressee must be sent back, even if further transport has
been refused. The email Address Internationalization (EAI) working
group is also working on a number of "downgrade mechanisms" for
various header fields and the envelope. If possible, original header
information is to be "packaged" and preserved.

Germany's DeNIC, the registrar for the ".de" domain, is nonetheless
taking this in its stride. "There is really not much we can do",
explained DeNIC spokesperson Klaus Herzig. DeNIC is instead paying
more attention to the update that the IETF is working on for the
standard of international domains – RFC3490, or IDNA2003 as it's
sometimes known. "We are not that happy about it because there is no
backwards compatibility," Herzig explained. When the update comes,
DeNIC says it will be throwing its weight behind the symbol "ß" - also
known as Eszett - which has been overlooked up to now. The German
registrar also says that it may wait a bit before switching in light
of the lack of backward compatibility. Once the new standard is
running stably and registrars and providers have adopted it, the ß
will be added.

In contrast, experts believe that Chinese registrars in China and
Taiwan will quickly implement the change for internationalised email.
Representatives of CNNIC and TWNIC are authors of the standards.
Chinese users currently have to write emails in ASCII to the left of
the @ and in Chinese characters to the right of it for Chinese
domains, which have already been internationalized.

(Monika Ermert)
