About character decoding and MIME decoding
I have developed a Java program that fetches the subject, sender, from and date/time of email messages from an email account. I did this using an HTML parser and HttpClient. I have two problems.
When I parse the subject string of an email, I sometimes get weird characters. For example, if the subject is "Hi Mr. müller", I receive the subject string as "Hi Mr. müller". As you can see, the ü character does not come through properly. Any idea which encoding this is? Is it UTF-8? How do I decode it to get the original string?
I have also retrieved email information such as subject, sender, receiver and date/time from a Yahoo account over POP3. There I noticed that when the sender's email ID contains ü or ue (e.g. reva.mü[email protected]), it is encoded like this: ('=?iso-8859-1?Q?=22Reva_M=FCller=22?= '). Any idea which encoding this is? Is it MIME encoding? How do I decode it in Java to get the correct sender string?
I would really appreciate any help.
Comments (1)
You need to read the RFC: http://www.ietf.org/rfc/rfc2045.txt. It will tell you how to interpret those = signs.
See "6.7. Quoted-Printable Content-Transfer-Encoding".
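To make the `=XX` rule concrete for the sender string in the question, here is a minimal sketch in plain Java. Note that the `=?charset?Q?...?=` wrapper used in headers is the "encoded-word" form defined in RFC 2047, which reuses the quoted-printable idea with `_` standing for a space. The class and method names (`EncodedWordDecoder`, `decode`, `decodeQ`) are made up for illustration. It deliberately skips some details of the RFC, such as dropping whitespace between adjacent encoded-words. If JavaMail is on the classpath, `javax.mail.internet.MimeUtility.decodeText(String)` is the more robust choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: decodes RFC 2047
// encoded-words of the form =?charset?Q?text?= or =?charset?B?text?=.
public class EncodedWordDecoder {

    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([bBqQ])\\?([^?]*)\\?=");

    public static String decode(String header) {
        Matcher m = ENCODED_WORD.matcher(header);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy any plain text that precedes this encoded-word.
            out.append(header, last, m.start());
            Charset charset = Charset.forName(m.group(1));
            byte[] bytes = m.group(2).equalsIgnoreCase("Q")
                    ? decodeQ(m.group(3))
                    : Base64.getDecoder().decode(m.group(3));
            // Interpret the raw bytes using the charset named in the header.
            out.append(new String(bytes, charset));
            last = m.end();
        }
        out.append(header.substring(last));
        return out.toString();
    }

    // "Q" encoding: '_' is a space, "=XX" is the byte with hex value XX.
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                buf.write(' ');
            } else if (c == '=' && i + 2 < text.length() + 0 && isHex(text, i + 1)) {
                buf.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                buf.write(c);
            }
        }
        return buf.toByteArray();
    }

    private static boolean isHex(String s, int pos) {
        return Character.digit(s.charAt(pos), 16) >= 0
                && Character.digit(s.charAt(pos + 1), 16) >= 0;
    }

    public static void main(String[] args) {
        // The sender string from the question.
        System.out.println(decode("=?iso-8859-1?Q?=22Reva_M=FCller=22?="));
    }
}
```

Here `=22` becomes a double quote, `_` becomes a space, and `=FC` is the ISO-8859-1 byte for ü.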
Also look for a Content-Type header to clue you in on the encoding.
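For the garbled subject, the question does not show the exact bytes received, so the following is only a sketch under one assumption: that the text was sent as UTF-8 but decoded as ISO-8859-1, a common cause of ü turning into two odd characters. The class and method names are hypothetical. The cleaner fix is to decode the raw bytes with the charset declared in the Content-Type header in the first place; this repair is a fallback for strings that have already been decoded wrongly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undoes the "UTF-8 bytes read as ISO-8859-1" mix-up.
public class MojibakeRepair {

    public static String repairLatin1AsUtf8(String garbled) {
        // ISO-8859-1 maps each char 0..255 back to the same byte value,
        // so this recovers the original UTF-8 byte sequence.
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes with the charset they were actually written in.
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00c3\u00bc" is what the UTF-8 bytes of ü (C3 BC) look like
        // when they are misread as ISO-8859-1.
        String garbled = "Hi Mr. m\u00c3\u00bcller";
        System.out.println(repairLatin1AsUtf8(garbled));
    }
}
```

If the string was mangled by a different charset pair, this conversion will not help and may make things worse, so it is worth checking the declared charset before applying it.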