Python email header parsing with get_all()

Posted on 2024-11-26 12:55:48

I'm parsing mailbox files with Python and stumbled upon a strange behavior when trying to get all "To:" headers with get_all():

from email.utils import getaddresses

tos = message.get_all('to', [])
if tos:
    tos = getaddresses(tos)
    for to in tos:  # each entry is a (name, address) tuple
        receiver = EmailInformant()
        receiver_email = to[1]

As far as I know, get_all() returns all "To:" values, in which the recipients are separated by commas. getaddresses then splits each recipient into a name and an email value.
For the following "To:" header, it does not work as I would expect:

To: [email protected] <[email protected]>

Here, the email address is provided as both the name and the email value, but the parser treats this as two separate "To:" entries, running the for-loop twice. Is this a bug?
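
For reference, here is a minimal sketch of how the two calls behave on well-formed input. The message text and the example.com addresses are invented for illustration and do not come from the mailbox in question. get_all() returns one string per "To:" header field, and getaddresses() then yields one (name, address) tuple per recipient.

```python
import email
from email.utils import getaddresses

# Hypothetical message with two "To:" header fields (addresses are invented).
raw = (
    "From: sender@example.com\n"
    "To: Alice <alice@example.com>, bob@example.com\n"
    "To: carol@example.com\n"
    "Subject: demo\n"
    "\n"
    "body\n"
)
message = email.message_from_string(raw)

# One list entry per "To:" header field, not per recipient.
to_headers = message.get_all('to', [])

# One (name, address) tuple per recipient across all header fields.
recipients = getaddresses(to_headers)
for name, address in recipients:
    print(repr(name), address)
```

With input like this, the loop runs once per recipient, so a loop that runs twice for a single-recipient header suggests the parser split that header value into two addresses.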

Comments (1)

冬天旳寂寞 2024-12-03 12:55:48

Parsing emails is very hard, as there are several different specifications, many behaviors that are or were poorly defined, and implementations that don't follow the specifications. Many of them conflict in some ways.

I know the email module in the standard library is currently being rewritten for Python 3.3, see http://www.bitdance.com/blog/. The rewrite should solve problems like this; it is currently available on PyPI for Python 3.2 if you have that option: http://pypi.python.org/pypi/email.

Meanwhile, try tos = set(getaddresses(tos)) to eliminate duplicates.
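
Note that set() discards the original order and only merges tuples that are identical in both name and address. Below is a sketch of an order-preserving alternative that deduplicates on the address alone. The helper name unique_addresses and the sample data are assumptions for illustration and are not part of the email module.

```python
from email.utils import getaddresses

def unique_addresses(pairs):
    """Keep the first (name, address) tuple for each address (hypothetical helper)."""
    seen = set()
    result = []
    for name, address in pairs:
        # Comparing in lower case is a simplification: strictly speaking,
        # the local part of an address may be case-sensitive.
        key = address.lower()
        # Skip entries with an empty address and addresses already seen.
        if key and key not in seen:
            seen.add(key)
            result.append((name, address))
    return result

# Invented sample headers; the second recipient repeats the first address.
pairs = getaddresses(['Alice <alice@example.com>, ALICE@example.com', 'bob@example.com'])
tos = unique_addresses(pairs)
```

Keeping the first occurrence means the display name, when one is present on that entry, survives deduplication.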
