The simplest way to get a complete list of all UTF-8 whitespace characters in PHP
In PHP, what's the most elegant way to get the complete list (as an array of strings) of all the Unicode whitespace characters, encoded in UTF-8?
I need that to generate test data.
Comments (4)
Years later, this question is still a top result on Google when searching for Unicode whitespace characters. devio's answer is great, but incomplete. As of this writing (October 2017), Wikipedia has a list of whitespace characters here: https://en.wikipedia.org/wiki/Whitespace_character
This list specifies 25 code points, whereas the currently accepted answer lists 18. Including the seven other code points, the list is:
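A minimal sketch of such a list as a PHP array, assuming the 25 code points in question are the ones that carry the Unicode White_Space property (Unicode 6.3 and later). The function name `unicode_whitespace_chars` is hypothetical, and the `\u{...}` escapes need PHP 7.0 or newer.

```php
<?php
// Hypothetical helper: returns each Unicode White_Space character
// as a UTF-8 encoded string (assumes Unicode 6.3+ property data).
function unicode_whitespace_chars(): array
{
    return [
        "\u{0009}", // CHARACTER TABULATION
        "\u{000A}", // LINE FEED
        "\u{000B}", // LINE TABULATION
        "\u{000C}", // FORM FEED
        "\u{000D}", // CARRIAGE RETURN
        "\u{0020}", // SPACE
        "\u{0085}", // NEXT LINE
        "\u{00A0}", // NO-BREAK SPACE
        "\u{1680}", // OGHAM SPACE MARK
        "\u{2000}", // EN QUAD
        "\u{2001}", // EM QUAD
        "\u{2002}", // EN SPACE
        "\u{2003}", // EM SPACE
        "\u{2004}", // THREE-PER-EM SPACE
        "\u{2005}", // FOUR-PER-EM SPACE
        "\u{2006}", // SIX-PER-EM SPACE
        "\u{2007}", // FIGURE SPACE
        "\u{2008}", // PUNCTUATION SPACE
        "\u{2009}", // THIN SPACE
        "\u{200A}", // HAIR SPACE
        "\u{2028}", // LINE SEPARATOR
        "\u{2029}", // PARAGRAPH SEPARATOR
        "\u{202F}", // NARROW NO-BREAK SPACE
        "\u{205F}", // MEDIUM MATHEMATICAL SPACE
        "\u{3000}", // IDEOGRAPHIC SPACE
    ];
}
```

For test data, the array can be looped over directly, for example `foreach (unicode_whitespace_chars() as $ws) { $input = "foo{$ws}bar"; }`. Characters such as U+180E or U+200B are sometimes treated as whitespace too, but they do not have the White_Space property, so they are left out here.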
This email (archived here) contains a list of all Unicode whitespace characters encoded in UTF-8, UTF-16, and HTML.
In the archived link, look for the 'utf8_whitespace_table' function.
http://en.wikipedia.org/wiki/Space_%28punctuation%29#Spaces_in_Unicode
Unfortunately, it doesn't give the UTF-8 encoding, but the page does contain the characters themselves, so you can copy and paste them into your editor (if it saves in UTF-8). Alternatively, http://www.fileformat.info/info/unicode/char/180E/index.htm gives the UTF-8 bytes (replace "180E" with the hexadecimal code point you are looking up).
This also gives a couple of extra characters that @devio's excellent answer misses.
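The UTF-8 bytes for a given code point can also be checked in PHP itself rather than looked up on a web page. A small sketch, assuming PHP 7.0 or newer for the `\u{...}` escape; the helper name `utf8_hex` is hypothetical:

```php
<?php
// Hypothetical helper: formats the raw bytes of a string as
// space-separated uppercase hex pairs.
function utf8_hex(string $char): string
{
    return strtoupper(implode(' ', str_split(bin2hex($char), 2)));
}

// "\u{180E}" is compiled by PHP into the UTF-8 encoding of U+180E.
echo utf8_hex("\u{180E}"), "\n"; // E1 A0 8E
```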