PHP encoding with Unicode
How can I get an HTML entity out of something like \u00e4, which stands for &auml; (ä)?
I have backslashes in the string for escaping reasons. When I strip the slashes I get something like u00e4. I have to strip the slashes to be able to store the string in the session and restore it from there.
Comments (4)
With htmlentities():
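A minimal sketch of such a call, assuming the string is UTF-8 encoded (the variable names and the ENT_QUOTES flag are illustrative choices, not taken from the question):

```php
<?php
// Assumption: $text holds UTF-8 bytes; "\xc3\xa4" is the UTF-8 encoding of ä.
$text = "\xc3\xa4";

// Passing the encoding explicitly avoids relying on the default_charset setting.
$entity = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $entity, PHP_EOL; // prints &auml;
```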
However, it's worth noting that: [...] (< and >). So this won't fix your problem, it will just hide it ;-)
Update
I had overlooked the reference to \u00e4 in the original question. The ä character corresponds to the U+00E4 Unicode code point. However, PHP does not support Unicode code points (PHP 7.0 later added the "\u{...}" escape for double-quoted strings). If you need to type it in your PHP code and your keyboard does not have such a symbol, you can save the document as UTF-8 and then provide the UTF-8 bytes (c3 a4) with the double quote syntax, i.e. "\xc3\xa4".
Still, this has no relation to sessions or HTML. I can't understand what your exact problem is.
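The double quote syntax mentioned above can be sketched as follows. The "\u{...}" form is only available since PHP 7.0, so this sketch assumes a PHP 7+ interpreter; the variable names are illustrative.

```php
<?php
// "\xc3\xa4" spells out the two UTF-8 bytes of ä (U+00E4) in a double-quoted string.
$fromBytes = "\xc3\xa4";

// Since PHP 7.0 a code point escape is available as well (note the braces).
$fromCodePoint = "\u{00e4}";

var_dump($fromBytes === $fromCodePoint); // bool(true)
echo bin2hex($fromBytes), PHP_EOL;       // c3a4
echo strlen($fromBytes), PHP_EOL;        // 2 (bytes, not characters)
```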
Second update
So serialize() cannot handle associative arrays and json_decode() cannot be fed with json_encode()'s output...
...
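For what it's worth, both round trips are easy to check in isolation. A small sketch, assuming a hypothetical payload (the keys and values are made up, since no code from the question is available):

```php
<?php
// Hypothetical session payload; the keys and values are made up for illustration.
$data = ['name' => "\xc3\xa4", 'count' => 3];

// serialize()/unserialize() round-trips an associative array unchanged.
$viaSerialize = unserialize(serialize($data));

// json_encode() escapes ä as \u00e4 by default; json_decode() turns it back into UTF-8.
$json = json_encode($data);
$viaJson = json_decode($json, true); // true: return arrays instead of stdClass objects

echo $json, PHP_EOL; // {"name":"\u00e4","count":3}
var_dump($viaSerialize === $data, $viaJson === $data); // bool(true) bool(true)
```

This is also one place where a literal \u00e4 can come from: it is how json_encode() writes non-ASCII characters unless JSON_UNESCAPED_UNICODE is passed.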
It appears to me that you are adding several layers of complexity to a simple script because you are making assumptions about how some PHP functions work instead of checking the manual or testing yourself. At this point, the information provided hardly resembles the original question and we still haven't seen a single line of code.
My advice so far is that you stop debugging your app as a whole, divide it into smaller pieces and use var_dump() to find out what each of these parts actually generates. Don't assume things: test stuff yourself. Also, take into account that PHP doesn't support Unicode natively as other languages do. Every single task that involves multi-byte string handling must be carefully implemented with the appropriate multi-byte functions, which often require hard-coding the character encoding.
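As an illustration of that last point, here is a short sketch comparing byte-oriented and multi-byte functions with var_dump(). It assumes the mbstring extension is loaded and that the data is UTF-8; the sample string is made up.

```php
<?php
$text = "\xc3\xa4bc"; // "äbc" as UTF-8: 4 bytes, 3 characters

var_dump(strlen($text));             // int(4): counts bytes
var_dump(mb_strlen($text, 'UTF-8')); // int(3): counts characters

// mb_substr() cuts on character boundaries, substr() on byte boundaries.
var_dump(mb_substr($text, 0, 1, 'UTF-8') === "\xc3\xa4"); // bool(true)
var_dump(substr($text, 0, 1) === "\xc3");                 // bool(true): half a character
```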
What do you mean by having problems reloading it?
Do you output it to an HTML page? In that case, you might have set the wrong charset.
As for using entities, check this out:
htmlentities
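Regarding the charset, a rough sketch of declaring it before any output, assuming the whole application works in UTF-8 (render_paragraph is a hypothetical helper name, not something from the question):

```php
<?php
// Assumption: the page and all strings are UTF-8; adjust if the app uses another charset.
function render_paragraph(string $text): string
{
    // Once the charset is declared, only HTML-special characters need escaping.
    return '<p>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</p>';
}

// The header must be sent before any output reaches the browser.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo render_paragraph("\xc3\xa4 < b"), PHP_EOL; // <p>ä &lt; b</p>
```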
I'm not sure whether this one helps you, but take a look at WordPress's sanitize_title function, where you can find some huge character tables.
As you can see in the discussions and answers, this is a problem that PHP can't handle natively (or at least nobody here knows how so far).
I suggest using this very heavy function ... I mean, this is my solution so far, which I do not like very much.
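For comparison, a much smaller sketch of the same idea. This is a hypothetical helper, not the function referred to above. It assumes four-digit escapes only (no surrogate pairs) and accepts both \u00e4 and the stripped form u00e4, which means it can also hit ordinary text that happens to contain "u" followed by four hex digits.

```php
<?php
// Hypothetical helper: turns \u00e4 (or u00e4 after stripslashes) into &#x00e4;.
function unicode_escapes_to_entities(string $text): string
{
    return preg_replace_callback(
        // Optional literal backslash, then "u", then exactly four hex digits.
        '/\\\\?u([0-9a-fA-F]{4})/',
        function (array $m): string {
            return '&#x' . strtolower($m[1]) . ';';
        },
        $text
    );
}

echo unicode_escapes_to_entities('M\u00e4dchen'), PHP_EOL; // M&#x00e4;dchen
echo unicode_escapes_to_entities('Mu00e4dchen'), PHP_EOL;  // M&#x00e4;dchen
```

Running the conversion before the backslashes are stripped is the safer order, because the backslash is what distinguishes an escape from plain text.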