Converting UTF-8 HTML entities to SHIFT_JIS
I am working on a website that needs to target old Japanese mobile phones that are not Unicode-enabled. The problem is that the text for the site is saved in the database as HTML entities (e.g., &#1234;). This database absolutely cannot be changed, as it is used by several hundred websites.
What I need to do is convert these entities to actual characters, and then convert the string encoding before sending it out, as the phones render the entities without converting them first.
I've tried both mb_convert_encoding and iconv, but all they do is convert the encoding of the entities themselves; they do not turn the entities into text.
Thanks in advance
EDIT:
I have also tried html_entity_decode. It produces the same result: an unconverted string.
Here is the sample data I am working with.
The desired result: シェラトン・ヌーサリゾート&スパ
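The HTML Codes: &#12471;&#12455;&#12521;&#12488;&#12531;&#12539;&#12492;&#12540;&#12469;&#12522;&#12478;&#12540;&#12488;&amp;&#12473;&#12497;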
The output of html_entity_decode([the string above],ENT_COMPAT,'SHIFT_JIS'); is identical to the input string.
Comments (4)
Just take care you're creating the right codepoints out of the entities. If the original encoding is UTF-8 for example:
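The snippet that originally followed here is not preserved, so this is only a minimal sketch of the idea, assuming PHP 7.4+ with the mbstring extension. It decodes the entities into UTF-8 (which can represent every code point) and lists the resulting code points so they can be compared against the numbers in the entities. The sample string is an assumption: the first word of the question's data written as decimal entities.

```php
<?php
// Assumed sample: the first word of the question's data ("シェラトン") as decimal entities.
$entities = '&#12471;&#12455;&#12521;&#12488;&#12531;';

// Decode into UTF-8, a charset that can represent every code point.
$utf8 = html_entity_decode($entities, ENT_COMPAT, 'UTF-8');

// Collect the code point of each decoded character so it can be checked
// against the numbers inside the original entities.
$codepoints = array_map(
    static fn (string $ch): int => mb_ord($ch, 'UTF-8'),
    mb_str_split($utf8, 1, 'UTF-8')
);

echo implode(' ', $codepoints), "\n";
```

If the printed numbers match the entity values, the decoding step produced the intended characters and the string is ready for a charset conversion.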
I found this function on php.net; it works for me with your example:
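The function itself did not survive in this copy of the thread, and the following is not that php.net function. It is a hypothetical stand-in (the name `numeric_entities_to_utf8` is made up for illustration) showing one way a hand-rolled decoder for numeric entities could look, assuming PHP 7.4+ with mbstring.

```php
<?php
// Hypothetical stand-in, NOT the php.net function the answer refers to.
// Replaces decimal (&#12473;) and hexadecimal (&#x30D1;) references with UTF-8 characters.
function numeric_entities_to_utf8(string $text): string
{
    $result = preg_replace_callback(
        // Digit counts are capped so the parsed value always fits in an int.
        '/&#([xX][0-9a-fA-F]{1,6}|[0-9]{1,7});/',
        static function (array $m): string {
            $ref = $m[1];
            $codepoint = strtolower($ref[0]) === 'x'
                ? (int) hexdec(substr($ref, 1))
                : (int) $ref;
            $char = mb_chr($codepoint, 'UTF-8');
            // Leave the reference untouched if mb_chr rejects the code point.
            return $char === false ? $m[0] : $char;
        },
        $text
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo numeric_entities_to_utf8('&#12473;&#x30D1;'), "\n";
```

Named entities such as `&amp;` are deliberately left alone here; `html_entity_decode` would still be needed for those.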
I think you just need html_entity_decode.
Edit: Based on your edit:
Note that this is just your first step, to convert your entities to the actual characters.
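For the second step, here is a rough sketch of how the two stages could be chained, assuming the mbstring extension is available. The helper name `entities_to_sjis` is hypothetical, and whether `'SJIS'` or a vendor variant of the charset suits a given handset is something to verify on the devices themselves.

```php
<?php
// Hypothetical helper: decode entities to UTF-8 first, then re-encode as Shift_JIS.
function entities_to_sjis(string $html): string
{
    // Step 1: turn entities into real characters, using UTF-8 as the intermediate charset.
    $utf8 = html_entity_decode($html, ENT_COMPAT, 'UTF-8');

    // Step 2: convert the UTF-8 bytes to Shift_JIS for the handset.
    return mb_convert_encoding($utf8, 'SJIS', 'UTF-8');
}

$sjis = entities_to_sjis('&#12473;&#12497;');

// When serving a page, the declared charset should match the bytes being sent:
// header('Content-Type: text/html; charset=Shift_JIS');
echo bin2hex($sjis), "\n";
```

Decoding straight into UTF-8 and converting afterwards keeps the entity step independent of the target charset, so the same code could serve other legacy encodings by changing only the second call.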
Just to chime in, since I ran into an encoding bug of my own while coding, I would suggest this snippet:
It may not be the best choice for a large amount of data, but it works.