Finding "bad email addresses" in a mailbox
I have programmatic access to a POP3 mailbox plus access to archived emails stored in a database. My objective is to find bad email addresses, that is, the addresses for which emails were returned (bounced) with a status or message such as:
- Undeliverable mail
- Delivery Status Notification (Failure)
- Undelivered mail returned to sender
- Emails from senders such as mailer-daemon or postmaster
Is there a way to filter out such emails without using "heuristics"? It's easy to scan the subject for words like "undeliverable" or for senders such as "mailer-daemon", but I want a better solution, if there is one.
Note that I have access to mail headers for all POP3/database archived emails. Is there some header that I can use?
Comments (1)
Some mail servers implement RFC 3464. Those that do will typically generate Delivery Status Notifications with a message header Content-Type of multipart/report and three component parts (text/plain, message/delivery-status and message/rfc822). So you could detect those characteristics of the message and process accordingly. The message will generally look like this:
From: "Mail Delivery System" <[email protected]>
Subject: Delivery Status Notification (Failure)
Content-Type: multipart/report; report-type=delivery-status

Content-Type: text/plain
A human readable explanation of the Delivery Status Notification.

Content-Type: message/delivery-status
A structured machine readable reason for the Delivery Status Notification.

Content-Type: message/rfc822
The original message.
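As a rough illustration of detecting those characteristics, here is a minimal Python sketch using the standard library `email` package. The helper names `is_dsn` and `failed_recipients`, the sample message and all addresses in it are hypothetical and only for illustration. It assumes each message has already been fetched from POP3 or the database as text, and that the server actually fills in the `Final-Recipient` and `Action` fields that RFC 3464 defines.

```python
import email
from email.message import Message
from typing import List

# Hypothetical bounce used only to exercise the functions below.
SAMPLE_DSN = """\
From: "Mail Delivery System" <mailer-daemon@example.com>
Subject: Delivery Status Notification (Failure)
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"

--BOUND
Content-Type: text/plain

Delivery to the following recipient failed.

--BOUND
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.example.com

Final-Recipient: rfc822; nobody@example.org
Action: failed
Status: 5.1.1

--BOUND
Content-Type: message/rfc822

From: sender@example.com
To: nobody@example.org
Subject: Hello

Body.

--BOUND--
"""


def is_dsn(msg: Message) -> bool:
    """True if the top-level Content-Type marks an RFC 3464 style report."""
    if msg.get_content_type() != "multipart/report":
        return False
    report_type = msg.get_param("report-type", "")
    return str(report_type).lower() == "delivery-status"


def failed_recipients(msg: Message) -> List[str]:
    """Collect addresses whose per-recipient block says 'Action: failed'."""
    failed: List[str] = []
    if not is_dsn(msg):
        return failed
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        blocks = part.get_payload()
        # The stdlib parser exposes each header block of this part as a
        # nested Message object; anything else is skipped defensively.
        if not isinstance(blocks, list):
            continue
        for block in blocks:
            recipient = block.get("Final-Recipient")
            action = (block.get("Action") or "").strip().lower()
            if recipient and action == "failed":
                # The field has the form "address-type; address".
                failed.append(str(recipient).split(";", 1)[-1].strip())
    return failed


if __name__ == "__main__":
    print(failed_recipients(email.message_from_string(SAMPLE_DSN)))
```

Filtering on `Action: failed` is a design choice of this sketch: RFC 3464 also allows values such as `delayed` and `delivered`, which do not indicate a bad address.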
For those mail servers that generate Delivery Status Notifications in an unstructured format, it is probably still necessary to detect their notifications by analysing the text of the From: and Subject: message headers.
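For that unstructured case, a fallback check on the From: and Subject: headers might look like the sketch below. The function name `looks_like_bounce`, the sender list and the subject pattern are assumptions drawn from the examples in the question, not an exhaustive or standard list, and the addresses in the usage lines are hypothetical. Subjects encoded per RFC 2047 are not decoded here.

```python
import email
import re
from email.message import Message
from email.utils import parseaddr

# Assumed local parts of typical bounce senders (from the question's examples).
BOUNCE_SENDERS = ("mailer-daemon", "postmaster")

# Assumed subject phrases, also taken from the question's examples.
BOUNCE_SUBJECT = re.compile(
    r"undeliver(able|ed)|delivery status notification|returned to sender",
    re.IGNORECASE,
)


def looks_like_bounce(msg: Message) -> bool:
    """Heuristic check based only on the From: and Subject: headers."""
    _, address = parseaddr(str(msg.get("From", "")))
    local_part = address.split("@", 1)[0].lower()
    if local_part in BOUNCE_SENDERS:
        return True
    subject = str(msg.get("Subject", ""))
    return bool(BOUNCE_SUBJECT.search(subject))


if __name__ == "__main__":
    raw = "From: MAILER-DAEMON@example.com\nSubject: Undeliverable mail\n\n"
    print(looks_like_bounce(email.message_from_string(raw)))
```

Because this is a heuristic, it can misfire in both directions, so it is probably best applied only to messages that the structured multipart/report check does not recognise.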