Bounced email parsing

Posted on 2024-08-11 18:19:57

I'm currently wrestling with catching, parsing and sorting bounced emails. I have the basics set up nicely and they do what I want. The problem is that there seems to be no standard for the messages returned in a bounced email.

For example, some servers return the error code as specified by RFC 1893, and nine times out of ten I can pick that up with a simple regex. But sometimes servers just respond saying that the email has bounced, with either no reason given or a reason worded entirely differently from any standard.
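For readers who want something concrete, here is a minimal sketch of that "simple regex" step, assuming Python and its standard `email` package. RFC 1893 (since obsoleted by RFC 3463) defines enhanced status codes of the form class.subject.detail, such as 5.1.1. The sketch first looks for a `Status` field in a `message/delivery-status` part and only then falls back to scanning the text. The function name `extract_status` and the exact pattern are illustrative assumptions, not anything taken from the question.

```python
import re
from email import message_from_string
from email.message import Message
from typing import Optional

# Enhanced status code: class.subject.detail, where class is 2 (success),
# 4 (transient failure) or 5 (permanent failure). The lookarounds are an
# assumption of this sketch, meant to avoid matching pieces of IP addresses.
STATUS_RE = re.compile(r"(?<![\d.])[245]\.\d{1,3}\.\d{1,3}(?!\.?\d)")


def extract_status(raw: str) -> Optional[str]:
    """Return an enhanced status code such as '5.1.1', or None if none is found."""
    msg = message_from_string(raw)

    # Preferred source: the Status field inside a message/delivery-status part.
    # The stdlib parser exposes that part's payload as a list of header blocks.
    for part in msg.walk():
        if part.get_content_type() == "message/delivery-status":
            for block in part.get_payload():
                if isinstance(block, Message) and block.get("Status"):
                    match = STATUS_RE.search(block["Status"])
                    if match:
                        return match.group(0)

    # Fallback: scan text parts for anything shaped like a status code.
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            match = STATUS_RE.search(payload.decode("utf-8", errors="replace"))
            if match:
                return match.group(0)

    # No code anywhere: the caller has to handle the non-standard case.
    return None
```

A `None` result is exactly the case the question is about: the bounce carries no recognizable code, so some other strategy has to take over.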

So I guess my question is: has anyone got a solution to this? To be honest, I don't want to search the returned email for a billion and one possible strings. Yet it would be nice not to have to fall back on 'reason unknown' or something similar.

Comments (3)

凉月流沐 2024-08-18 18:19:57

You could set up a system that lets an operator review messages, select strings, and then categorize from there. Eventually, you could hope to get that 1 in 10 down to 1 in 100 or 1 in 1,000. There are always going to be more corner cases, however.
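One way this operator loop might look, as a rough sketch in Python: a growing table of phrases that operators have picked out, plus a queue of bodies that matched nothing and still need a human. The class name `BounceRules`, the category labels, and the sample phrases are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BounceRules:
    # (compiled pattern, category) pairs, checked in the order they were added.
    rules: List[Tuple[re.Pattern, str]] = field(default_factory=list)
    # Bounce bodies that matched no rule and are waiting for an operator.
    review_queue: List[str] = field(default_factory=list)

    def add_rule(self, phrase: str, category: str) -> None:
        """Store a phrase an operator selected; matching is case-insensitive."""
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        self.rules.append((pattern, category))

    def classify(self, body: str) -> Optional[str]:
        """Return the first matching category, or queue the body for review."""
        for pattern, category in self.rules:
            if pattern.search(body):
                return category
        self.review_queue.append(body)
        return None


# Hypothetical usage: each reviewed message adds a rule, so fewer land in the queue.
rules = BounceRules()
rules.add_rule("mailbox full", "mailbox_full")
rules.add_rule("user unknown", "invalid_recipient")
```

Every message an operator resolves from `review_queue` becomes a new rule, which is how the miss rate would shrink over time while the corner cases keep trickling in.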

挽容 2024-08-18 18:19:57

Also not a definitive answer, but in a similar spirit to Kyle's response, you could use a Bayes/token-based spam filter to "learn" about bounce messages and then automatically route them to whatever you want to handle the bounced mail.

In other words, you have an account where you train spamassassin or spamprobe or whatever that a bunch of different bounce messages (and only bounce messages) are "junk", then let that spam system be a second line of filtering after whatever you've developed.

So, let's say your solution, the first filter, finds 90% of bounced messages. You have your system do whatever it normally does with bounces, then save them to a bounce-messages mailbox, which is periodically scanned by spamassassin/spamprobe to learn those messages as "junk".

You then also have spamassassin or spamprobe or whatever act as a second filter (run on anything yours doesn't flag as a bounce) and do its own estimation of bounced-ness. Whatever it considers "junk" (because you've trained it to think bounce = junk), you also route to your program, etc.

Still requires a little bit of manual review, but in theory it should get better and better over time as you rely on the spam system's learning to account for the edge cases.
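To make the two-stage idea concrete without depending on any particular spam filter, here is a toy sketch in Python. It swaps in a small hand-written naive Bayes token classifier for spamassassin/spamprobe, so it is not how either tool works internally. It also assumes you feed it some non-bounce mail as the "other" class, which the description above does not spell out. The names `BounceBayes` and `route` and the default threshold are illustrative assumptions.

```python
import math
import re
from collections import Counter
from typing import Callable, List

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> List[str]:
    return TOKEN_RE.findall(text.lower())


class BounceBayes:
    """Toy two-class naive Bayes: 'bounce' versus 'other'."""

    def __init__(self) -> None:
        self.token_counts = {"bounce": Counter(), "other": Counter()}
        self.doc_counts = {"bounce": 0, "other": 0}

    def learn(self, text: str, label: str) -> None:
        self.token_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens: List[str], label: str) -> float:
        counts = self.token_counts[label]
        vocab = set(self.token_counts["bounce"]) | set(self.token_counts["other"])
        total = sum(counts.values())
        # Add-one smoothing on both the class prior and the token likelihoods.
        docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (docs + 2))
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + len(vocab) + 1))
        return score

    def bounce_probability(self, text: str) -> float:
        tokens = tokenize(text)
        diff = self._log_score(tokens, "other") - self._log_score(tokens, "bounce")
        diff = max(min(diff, 700.0), -700.0)  # keep math.exp from overflowing
        return 1.0 / (1.0 + math.exp(diff))


def route(text: str, first_filter: Callable[[str], bool],
          model: BounceBayes, threshold: float = 0.9) -> str:
    """First filter decides when it can; the learned model catches the rest."""
    if first_filter(text):
        model.learn(text, "bounce")  # confirmed bounces keep training the model
        return "bounce"
    if model.bounce_probability(text) >= threshold:
        return "probable_bounce"  # still worth an occasional manual review
    return "other"
```

The point of the design is that the first filter's hits double as training data, so the second stage should slowly pick up wording the rules never anticipated.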

后eg是否自 2024-08-18 18:19:57

We are facing the same problem and have not found a "perfect" solution either. I think you could either

  • use some service provider (with a proper mail API), which would let you "outsource" the problem and give you a high detection rate, or
  • use some simple filter to catch at least (say) 80% of the bounces. In our setup, this was enough to keep our database in a reasonable state.