A client came to me with an interesting request the other day: automatically creating a formatted PDF from an HTML email. Basically, they send out a nightly newsletter and would like to add a "bot" email address to the list that takes the email, converts it to a formatted PDF and uploads that PDF to a folder on Box.net. The process needs to be done via PHP.
Once I get the HTML from the email, I don't think the steps after that will be too much trouble (I'll probably just use dompdf to convert the HTML to PDF, since the formatting isn't anything complicated). My specific question is about retrieving an email and extracting the HTML from it. Is there a way to set up a mail account so that a PHP script runs every time an email is received? If so, how do I access the contents of a mailbox and/or an email via PHP? Or would it be easier, or make more sense, to periodically check the mailbox for new emails?
Any thoughts on this would be most appreciated.
Comments (2)
I have written a script called email2pdf which does something very similar: it converts HTML (and plain-text) email to PDF. It uses Python rather than PHP, and is designed to work in tandem with getmail, which actually fetches the mail.
More information in the README.
(sorry for the slight self-promotion, but I think this is relevant).
It's difficult to know what to advise without knowing what platform you're using.
In Unix and Linux environments, Fetchmail is an old favourite for grabbing mail from a POP or IMAP server. Once Fetchmail fetches your mail, you can save it to a file, pipe it through a program, etc., and figure out your automation with various conversion tools from there.
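As a rough illustration of the "pipe it through a program" step, here is a minimal sketch of such a program. It is written in Python rather than the PHP the question asks for, and it assumes the fetching tool hands over one complete raw message at a time. The function names `extract_html` and `main` are made up for this example.

```python
import sys
from email import message_from_bytes, policy


def extract_html(raw_message: bytes):
    """Return the HTML body of a raw RFC 822 message, or None if it has none."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # get_body() walks multipart containers and picks the preferred body part.
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return None
    # get_content() undoes the transfer encoding and charset, returning a str.
    return part.get_content()


def main(stdin=None, stdout=None) -> int:
    """Read one raw message from stdin and write its HTML part to stdout."""
    stdin = stdin or sys.stdin.buffer
    stdout = stdout or sys.stdout
    html = extract_html(stdin.read())
    if html is None:
        return 1  # nothing to convert
    stdout.write(html)
    return 0
```

Calling `main()` from a small wrapper script would let this sit at the end of the pipe, with its output going to whatever HTML-to-PDF converter you settle on.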
If you don't want to "poll" your mailbox using fetchmail in a cron job, you may be able to trigger conversions on the mail server itself. A Unix or Linux mail server running Sendmail or Postfix (or other similar software) as the MTA may use Procmail as its "local delivery agent". Procmail includes a flexible language that can be used to recognize patterns in email and act on them. If Procmail is delivering your mail, you could easily write a "recipe" for it that recognizes incoming HTML mail matching your criteria and pipes the HTML part through a conversion program. The same recipe could also page somebody, run some other program, or whatever else you need.
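The kind of matching such a recipe performs can be sketched in Python as well. This is not Procmail syntax, only an illustration of the decision "is this the newsletter, and does it carry an HTML part?". The sender address and subject pattern are hypothetical placeholders, not values from the question.

```python
import re
from email import message_from_bytes, policy

# Hypothetical criteria: replace with the real newsletter's sender and subject.
NEWSLETTER_SENDER = "newsletter@example.com"
SUBJECT_PATTERN = re.compile(r"nightly newsletter", re.IGNORECASE)


def is_newsletter(raw_message: bytes) -> bool:
    """Return True if the message looks like the newsletter and has an HTML part."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    sender = str(msg.get("From", "")).lower()
    subject = str(msg.get("Subject", ""))
    has_html = msg.get_body(preferencelist=("html",)) is not None
    return (
        NEWSLETTER_SENDER in sender
        and SUBJECT_PATTERN.search(subject) is not None
        and has_html
    )
```

Messages that pass the check would be handed to the conversion step; everything else can be delivered or discarded as usual.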
The actual process for converting HTML to PS/PDF also depends on your platform, which you haven't specified. Bear in mind that conversion from PS to PDF is trivial in Unix environments, so if you find a tool that converts to PS, you can easily turn its output into PDF. Have a look at the list of recommendations from w3.org, or ask for support for your platform. I use FreeBSD, where html2ps-letter exists.
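For the PS-to-PDF step, one common route is Ghostscript's `ps2pdf` command. The answer above does not name a tool, so treat this as an assumption. The helper names in this sketch are invented for illustration, and it only wraps the command line `ps2pdf input.ps output.pdf`.

```python
import shutil
import subprocess
from pathlib import Path


def ps_to_pdf_command(ps_path: str) -> list:
    """Build the ps2pdf command line, writing the PDF next to the PS file."""
    pdf_path = str(Path(ps_path).with_suffix(".pdf"))
    return ["ps2pdf", ps_path, pdf_path]


def convert_ps_to_pdf(ps_path: str) -> str:
    """Run ps2pdf on ps_path and return the path of the resulting PDF."""
    # Assumption: Ghostscript is installed and ps2pdf is on PATH.
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf (Ghostscript) was not found on PATH")
    cmd = ps_to_pdf_command(ps_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[2]
```

Keeping the command construction separate from the call makes it easy to check the paths without Ghostscript installed.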