如何处理 Unicode 代码点?
假设用户提交了一条评论,我想获取其值的 Unicode 代码点数组,选择哪些代码点无效并丢弃它们,然后保存评论。 我怎样才能做到这一点?
例如,
用户提交“hello”,我想获取一个具有以下值的数组$codepoints
:
$codepoints[0] = 0068
$codepoints[1] = 0065
$codepoints[2] = 006C
$codepoints[3] = 006C
$codepoints[4] = 006F
而且,出于某种奇怪的原因,我不想允许字母“l”,所以我想丢弃代码点 U+006C 的字符。所以保存的评论将是“heo”。 这可能吗?
提前致谢!
Let's say a user submits a comment and I want to obtain the array of Unicode code points of its value, select what code points are invalid and discard them, and save the comment.
How can I do that?
e.g.
The user submits "hello", and I want to obtain an array $codepoints
with the following values:
$codepoints[0] = 0068
$codepoints[1] = 0065
$codepoints[2] = 006C
$codepoints[3] = 006C
$codepoints[4] = 006F
And, for some strange reason, I don't want to allow the letter "l", so I want to discard the characters with the code point U+006C. So the saved comment would be "heo".
Is this even possible?
Thanks in advance!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这是一个带有 unicode 文字的示例。
这将输出字符串
Test
。如果您想以十六进制编写代码点,这个答案 可能有用。
Here's an example with unicode literals.
This will output the string
Test
.If you'd rather write the code points in hex, this answer may be useful.