Detecting CJK characters in PHP
I've got an input box that allows UTF-8 characters. Can I programmatically detect whether the characters are Chinese, Japanese, or Korean (as part of some Unicode range, perhaps)? I would switch search methods depending on whether MySQL's fulltext search would work (it won't work for CJK characters).
Thanks!
Comments (3)
CJK characters are confined to certain Unicode blocks. You need to check whether each character falls inside one of those blocks, and you should also account for characters outside the Basic Multilingual Plane (code points above U+FFFF, which take four bytes in UTF-8 and a surrogate pair in UTF-16).
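A block check like the one described can be sketched in PHP with a single regular expression. This is only an illustration: the function name containsCjk is made up for this example, and the block list covers a handful of common CJK blocks rather than all of them. It assumes the input is valid UTF-8 and relies on the /u modifier so that the ranges are read as code points, including the one above U+FFFF.

```php
<?php
// Hypothetical helper (not from the original thread): returns true if $text
// contains at least one character from a few common CJK blocks.
// The block list is illustrative, not exhaustive.
function containsCjk(string $text): bool
{
    $pattern = '/['
        . '\x{3040}-\x{309F}'   // Hiragana
        . '\x{30A0}-\x{30FF}'   // Katakana
        . '\x{3400}-\x{4DBF}'   // CJK Unified Ideographs Extension A
        . '\x{4E00}-\x{9FFF}'   // CJK Unified Ideographs
        . '\x{AC00}-\x{D7AF}'   // Hangul Syllables
        . '\x{F900}-\x{FAFF}'   // CJK Compatibility Ideographs
        . '\x{20000}-\x{2A6DF}' // CJK Unified Ideographs Extension B (beyond U+FFFF)
        . ']/u';

    // preg_match returns 1 on a match, 0 on no match, and false on error
    // (for example, when $text is not valid UTF-8).
    return preg_match($pattern, $text) === 1;
}
```

A caller could then pick the search method with something like `containsCjk($query) ? likeSearch($query) : fulltextSearch($query)`, where both search functions are placeholders for whatever the application already uses.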
Do you want to detect whether a character is a (Chinese or Japanese or Korean) character? Or do you want to tell Chinese characters apart from Japanese characters? The former is easy; the latter is in many cases impossible, due to Han unification.
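To make the distinction concrete, here is a rough sketch that uses PCRE's Unicode script properties. The function name guessCjkScript and its return labels are invented for this example. It is a heuristic, not a language detector: kana suggests Japanese and Hangul suggests Korean, but text made up only of Han characters could be Chinese or Japanese, which is exactly the Han unification problem.

```php
<?php
// Hypothetical helper (not from the original thread): guesses a script family
// from the characters present in a UTF-8 string.
function guessCjkScript(string $text): string
{
    // Kana is used only for Japanese, so its presence is a strong hint.
    if (preg_match('/[\p{Hiragana}\p{Katakana}]/u', $text) === 1) {
        return 'japanese';
    }
    // Hangul is used only for Korean.
    if (preg_match('/\p{Hangul}/u', $text) === 1) {
        return 'korean';
    }
    // Han characters alone are ambiguous: Chinese and Japanese (and older
    // Korean text) share the same unified code points.
    if (preg_match('/\p{Han}/u', $text) === 1) {
        return 'han';
    }
    return 'none';
}
```

For the original goal of choosing a search method, any result other than 'none' would be enough to route the query away from fulltext search.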