Smart quotes in MimeMessage not displaying correctly in Outlook
Our application takes text from a web form and sends it via email to an appropriate user. However, when someone copy/pastes in the infamous "smart quotes" or other special characters from Word, things get hairy.
The user types in
he said “hello” to me—isn’t that nice?
But when the message appears in Outlook 2003, it comes out like this:
he said hello to meisnt that nice?
The code for this was:
Session session = Session.getInstance(props, new MailAuthenticator());
Message msg = new MimeMessage(session);
//removed setting to/from addresses to simplify
msg.setSubject(subject);
msg.setText(text);
msg.setHeader("X-Mailer", MailSender.class.getName());
msg.setSentDate(new Date());
Transport.send(msg);
After a little research, I figured this was probably a character encoding issue and attempted to move things to UTF-8. So, I updated the code thusly:
Session session = Session.getInstance(props, new MailAuthenticator());
MimeMessage msg = new MimeMessage(session);
//removed setting to/from addresses to simplify
msg.setHeader("X-Mailer", MailSender.class.getName());
msg.addHeader("Content-Type", "text/plain");
msg.addHeader("charset", "UTF-8");
msg.setSentDate(new Date());
Transport.send(msg);
This got me closer, but no cigar:
he said “hello” to me—isn’t that nice?
I can't imagine this is an uncommon problem--what have I missed?
Comments (4)
Is the page with your form also using UTF-8, or a different charset? If you don't specify the webpage charset, the format of data coming to your script is anyone's guess.
Edit: the charset in the message should be set as a parameter of the Content-Type header value, for example text/plain; charset=UTF-8, since charset is not a separate header but an option of Content-Type.
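To see why the declared charset matters, here is a minimal sketch using only the JDK (no JavaMail). It assumes the body is sent as UTF-8 bytes and that the mail client, not being told the charset, falls back to windows-1252. That fallback is an assumption about the client, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatch {
    private CharsetMismatch() {}

    /** Encodes text as UTF-8, then decodes the bytes as windows-1252 (the assumed client fallback). */
    static String misread(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // Em dash followed by a right single quotation mark, as in the question's sample text.
        String sample = "me\u2014isn\u2019t";
        System.out.println(misread(sample));
    }
}
```

Each smart character becomes three unrelated characters, because its three UTF-8 bytes are read as three separate windows-1252 characters. Declaring charset=UTF-8 in the Content-Type value is what tells the client to decode them together.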
Why don't you replace the nice quotes with regular prime quotes?
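If replacing is acceptable, a rough sketch of that approach could look like the following. The list of characters is an assumption about what Word typically inserts and is not exhaustive, and the class name is hypothetical. Note that this is lossy: it changes the user's text rather than fixing the encoding.

```java
final class SmartQuotes {
    private SmartQuotes() {}

    /** Replaces common Word "smart" punctuation with plain ASCII equivalents. */
    static String toAscii(String text) {
        return text
                .replace('\u2018', '\'')    // left single quotation mark
                .replace('\u2019', '\'')    // right single quotation mark (apostrophe)
                .replace('\u201C', '"')     // left double quotation mark
                .replace('\u201D', '"')     // right double quotation mark
                .replace("\u2013", "-")     // en dash
                .replace("\u2014", "--")    // em dash
                .replace("\u2026", "...");  // horizontal ellipsis
    }

    public static void main(String[] args) {
        String sample = "he said \u201Chello\u201D to me\u2014isn\u2019t that nice?";
        System.out.println(toAscii(sample));
        // prints: he said "hello" to me--isn't that nice?
    }
}
```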
I would check that the data being received from the browser is correct - dump the Unicode code points and check them against the Unicode code charts.
For example, the symbol LEFT DOUBLE QUOTATION MARK (“) is character U+201C.
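A small helper for dumping the code points might look like this. It is a sketch using only the JDK, and the class and method names are made up for illustration.

```java
import java.util.stream.Collectors;

final class CodePointDump {
    private CodePointDump() {}

    /** Formats each code point of the text as U+XXXX, separated by spaces. */
    static String dump(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The word "hi" wrapped in left and right double quotation marks.
        System.out.println(dump("\u201Chi\u201D"));
        // prints: U+201C U+0068 U+0069 U+201D
    }
}
```

If the form text logs as U+201C and U+201D at the point where it reaches the mail code, the browser-to-server step is fine and the problem lies in how the message is encoded or labelled.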
It has been a long time since I used the mail API, but the MimeMessage.setText(text, charset) method might be worth a look. The documentation on setText(String) says it uses the default character set (probably windows-1252 if you're using English/Latin-1 Windows).
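To check which charsets can actually carry the smart quotes, something like the following JDK-only sketch may help. It does not call JavaMail, and the class and method names are hypothetical. It prints the platform default charset and shows how the left double quotation mark is encoded, or lost, under a few candidates.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    private EncodingCheck() {}

    /** Returns true if every character of the text can be encoded in the charset. */
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    /** Returns the encoded bytes as space-separated uppercase hex pairs. */
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String leftQuote = "\u201C";
        System.out.println("default charset: " + Charset.defaultCharset());
        Charset[] candidates = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            Charset.forName("windows-1252"),
            StandardCharsets.UTF_8
        };
        for (Charset cs : candidates) {
            // Unencodable characters are replaced by '?' (3F) when calling getBytes.
            System.out.println(cs + ": canEncode=" + canEncode(leftQuote, cs)
                    + " bytes=" + hex(leftQuote, cs));
        }
    }
}
```

Passing an explicit charset such as "UTF-8" to setText(text, charset) avoids depending on whatever the default happens to be on the server.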
IIRC, MS Office quotes are found in the character set "windows-1252" rather than "iso-8859-1", which has no code points for them.