HTML special characters in email
I have written a script that reads email from a mailbox.
In some emails, part of the data arrives converted into weird characters, which breaks my further processing.
The characters look something like the ones listed here: http://brucejohnson.ca/HTMLCharacters13.html
Any idea how to convert them back into the original content?
Comments (3)
If the script is giving you those characters, you have two options: look at each character as is, or look at its numeric equivalent (in various bases: octal, hex, etc.).
Are you sure your script isn't trying to read an encrypted mail, and that the script itself works correctly?
Try putting some dummy test data through the functions/script you've written to see whether it produces the output you expect.
Hope this helps.
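To look at the numeric equivalents mentioned above, a small helper can list every non-ASCII character together with its code point. This is only a sketch in Python (the question doesn't say which language the script uses); describe_chars and the sample string are made-up names and dummy test data, not anything from the original script.

```python
import unicodedata


def describe_chars(text):
    """Return one line per non-ASCII character: the character, its code
    point in hex and octal, and its Unicode name (if it has one)."""
    lines = []
    for ch in text:
        if ord(ch) > 127:
            name = unicodedata.name(ch, "UNKNOWN")
            lines.append(f"{ch!r} U+{ord(ch):04X} octal={ord(ch):o} {name}")
    return lines


# Dummy test data: one accented letter and one non-breaking space.
sample = "caf\u00e9\u00a0menu"
for line in describe_chars(sample):
    print(line)
```

Running the same helper on a string that broke the processing should show which characters are actually arriving, which narrows down whether the problem is the charset or the script.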
You need to check the charset encoding in the email headers first.
Once you have done that, choose one of two methods: change the charset declared in the HTML to match the email, or convert the content (where possible) to the charset you are already using (probably UTF-8).
If you dynamically change the HTML charset in the header, your biggest problem is that users will need to set the correct charset in their browser settings. For example, my browser is set to UTF-8 but my emails are in ISO-8859-1, so with this method I would have to change my browser charset every time I look at the site, whereas a friend of mine has ISO-8859-1 as his normal charset and would have no problems.
If you encode the characters to UTF-8 (e.g. utf8_encode in PHP), you need to ensure the content isn't already in UTF-8; otherwise the encode function may create other invalid characters.
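A minimal sketch of that check, written in Python rather than PHP: it tries a strict UTF-8 decode first and only converts when that fails. to_utf8 and its declared_charset parameter are hypothetical names, and the ISO-8859-1 default is an assumption; in practice the charset should come from the email headers. The check is a heuristic, since a few non-UTF-8 byte sequences also happen to be valid UTF-8.

```python
def to_utf8(raw: bytes, declared_charset: str = "iso-8859-1") -> bytes:
    """Return raw as UTF-8 bytes, leaving input that is already UTF-8 alone."""
    try:
        raw.decode("utf-8")  # strict decode: raises on invalid UTF-8 sequences
        return raw
    except UnicodeDecodeError:
        # Not UTF-8, so interpret it in the declared charset and re-encode.
        return raw.decode(declared_charset).encode("utf-8")


latin1 = "café".encode("iso-8859-1")
utf8 = "café".encode("utf-8")

print(to_utf8(latin1) == utf8)  # True: converted
print(to_utf8(utf8) == utf8)    # True: left untouched

# What happens without the check: UTF-8 bytes treated as ISO-8859-1.
double_encoded = utf8.decode("iso-8859-1").encode("utf-8")
print(double_encoded.decode("utf-8"))  # cafÃ©
```

The last line reproduces the kind of invalid characters that appear when content already in UTF-8 is encoded a second time.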
The way I handle this is basically to decode the MIME header of the email, then use preg_match in PHP to detect the charset being used; from there I either run the conversion to UTF-8 or skip it.
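The same flow might look like this in Python, where the standard email package parses the headers, so no regular expression is needed. This is an illustrative sketch, not the PHP code described above; body_as_text, the fallback charset and the sample message are all assumptions.

```python
from email import message_from_bytes, policy


def body_as_text(raw_message: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode the main text part of a raw email into a Python str."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # Prefer the plain-text body, fall back to the HTML body.
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    # Charset from the part's Content-Type header, if it declares one.
    charset = part.get_content_charset() or fallback
    # decode=True undoes the base64 / quoted-printable transfer encoding.
    payload = part.get_payload(decode=True)
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:  # charset name that Python does not know
        return payload.decode(fallback, errors="replace")


# Hypothetical sample: ISO-8859-1 text sent as quoted-printable.
raw = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"caf=E9\r\n"
)
text = body_as_text(raw)
print(text.strip())                # café
utf8_bytes = text.encode("utf-8")  # store or pass on as UTF-8
```

Decoding to a str first and encoding to UTF-8 only at the end means the conversion happens exactly once, whatever charset the sender used.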
Dealing with mail and its various charsets is a very complicated activity at times. The charset depends on the sender, so you don't know in advance which one will be used. You need to really understand the various charsets, how best to store them (if you are storing them) and how best to display them, and then translate that to your app and target market.
Good luck with your app.
Have you checked the character encoding? It must be UTF-8. If it is Western European, then change it to UTF-8.