Parsing and displaying a MIME multipart email on a website
I have a raw email (MIME multipart) and I want to display it on a website (e.g. in an iframe, with tabs for the HTML part and the plain text part). Are there any CPAN modules or Template::Toolkit plugins that could help me achieve this?
At the moment, it looks like I'll have to parse the message with Email::MIME, then iterate over all the parts and write a handler for each of the different MIME types.
It's a long shot, but has anyone done all this already? Writing the handlers myself is going to be a long and error-prone process.
Thanks for any help.
I dealt with this exact problem just a few months ago. I added an email feature, both sending and receiving, to the product I work on. The first part was sending reminders to users, but we didn't want to manage the bounces for our customers' admins. So we decided to provide a message inbox where the admins could see bounces and replies without involving us, and adjust email addresses themselves if they needed to.
Because of this, we accept all email that is sent to an inbox we watch. We use VERP to associate an email with a user, and store the entire email as-is in the database. Then, when the admin requests to see the email, we have to parse it.
My first attempt was very similar to an earlier answer. If one of the parts is HTML, show it. If it's text, show it. Otherwise, show the original, raw email. This broke down real fast with a few emails not generated by sendmail. Outlook, Exchange, and a few other email systems don't send messages that simple; they use multiparts to send the email. After a lot of digging and cussing, I discovered that the problem doesn't appear to be well documented. With the help of looking through MHonArc and reading the RFCs (RFC 2045 and RFC 2046), I settled on the solution below. I decided against using MHonArc itself, since I couldn't easily reuse its parsing and display functionality. I wouldn't say this is perfect, but it's been good enough that we used it.
First, take the message and use Email::MIME to parse it. Then call a function named get_parts with the array of parts that Email::MIME gives you via ->parts().
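As a minimal sketch of that first step, the snippet below parses a raw message with Email::MIME and lists the content type of each top-level part. The sample message, addresses, and boundary string are made up for illustration; in the setup described above, `$raw` would be the email as stored in the database.

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical sample message; in practice $raw would come from storage.
my $raw = <<'EOM';
From: sender@example.com
To: admin@example.com
Subject: Weekly report
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: text/plain; charset="us-ascii"

See the attached figures.
--outer
Content-Type: text/csv; charset="us-ascii"
Content-Disposition: attachment; filename="figures.csv"

week,total
1,42
--outer--
EOM

my $email = Email::MIME->new($raw);
my @parts = $email->parts;    # top-level parts of the message

print "Subject: ", $email->header('Subject'), "\n";
print "  part: ", $_->content_type, "\n" for @parts;
```

Note that `parts` only goes one level deep; any part that is itself a multipart container has to be walked separately, which is what the rest of this answer is about.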
For each part it is passed, get_parts decodes the content type and looks it up in a hash; if an entry exists, it calls the function associated with that content type. If that decoder was able to give us something, the result is pushed onto a result array.
The last piece of the puzzle is this decoder hash. Basically, it defines the content types I can deal with.
The non-multipart sections I return as-is. For mixed, related and alternative, I merely call get_parts on that MIME node and return the results. Because alternative is special, it has some extra code after calling get_parts: it returns only the HTML part if there is one, otherwise only the text part if there is one. If it has neither, it won't return anything valid.
The advantage of the hash of valid content types is that I can easily add logic for more part types as needed. And by the time get_parts is done, you should have an array of all the content you care about.
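The approach described above could be sketched roughly as follows. This is a reconstruction from the description, not the original code: the helper names (`media_type`, `leaf`, `container`, `alternative`), the shape of the returned hashrefs, and the sample message are all assumptions made for illustration. The sketch also passes the message object itself to `get_parts` rather than the result of `->parts()`, so that a top-level multipart/alternative goes through its handler too.

```perl
use strict;
use warnings;
use Email::MIME;

my %decoder;    # content type => handler; filled in below

# Bare media type, e.g. "text/html", without parameters such as charset.
sub media_type {
    my ($part) = @_;
    my $ct = $part->content_type // 'text/plain';    # RFC 2045 default
    my ($type) = $ct =~ m{^\s*([^;\s]+)};
    return lc( $type // 'text/plain' );
}

# Dispatch each part to its handler; unknown content types are skipped.
sub get_parts {
    my (@parts) = @_;
    my @results;
    for my $part (@parts) {
        my $handler = $decoder{ media_type($part) } or next;
        push @results, $handler->($part);
    }
    return @results;
}

# Non-multipart parts are returned as-is (type plus decoded text).
sub leaf {
    my ($part) = @_;
    return { type => media_type($part), body => $part->body_str };
}

# mixed / related: just recurse into the children.
sub container {
    my ($part) = @_;
    return get_parts( $part->subparts );
}

# alternative: recurse, then keep only HTML, or failing that only text.
sub alternative {
    my ($part) = @_;
    my @found = get_parts( $part->subparts );
    for my $wanted ( 'text/html', 'text/plain' ) {
        my @match = grep { $_->{type} eq $wanted } @found;
        return @match if @match;
    }
    return;    # neither HTML nor text: nothing usable
}

%decoder = (
    'text/plain'            => \&leaf,
    'text/html'             => \&leaf,
    'multipart/mixed'       => \&container,
    'multipart/related'     => \&container,
    'multipart/alternative' => \&alternative,
);

# Hypothetical sample message with a plain text and an HTML alternative.
my $raw = <<'EOM';
From: sender@example.com
To: admin@example.com
Subject: Test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="b1"

--b1
Content-Type: text/plain; charset="us-ascii"

Hello in plain text.
--b1
Content-Type: text/html; charset="us-ascii"

<p>Hello in HTML.</p>
--b1--
EOM

my @content = get_parts( Email::MIME->new($raw) );
print "$_->{type}\n" for @content;
```

Supporting another content type, for example images, would then only mean adding one more entry to `%decoder`. Anything returned from the HTML handler is still untrusted markup and needs to be served accordingly.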
One more item I should mention: as a part of this, we created a separate domain that actually serves these messages. The main domain that an admin works on will refuse to serve the message and will redirect the browser to our user content domain. This second domain serves only user content. This helps the browser properly sandbox the content away from our main domain. See the same-origin policy (http://en.wikipedia.org/wiki/Same_origin_policy).
It doesn't sound like a difficult job to me:
Reuse existing complete software. The MHonArc mail-to-HTML converter has excellent MIME support.