How can strtr() be used to translate multibyte/accented/umlaut characters?
Does anyone have a multibyte variant of the strtr() function?
Example of desired usage:
    $from = 'ľľščťžýáíŕďňäô'; // these chars are in UTF-8
    $to   = 'llsctzyairdnao';
    // input - in UTF-8
    $str = 'Kŕdeľ ďatľov učí koňa žrať kôru.';
    $str = mb_strtr( $str, $from, $to );
    // output - str without diacritic
    // $str = 'Krdel datlov uci kona zrat koru.';
Comments (4)
Probably using str_replace is a good solution. An alternative:
Prints on my machine using PHP 5.2:
strtr() has two valid signatures for receiving its parameters. The way that you have implemented strtr() performs byte-by-byte translations; this is obviously inappropriate for your multibyte characters. The correct implementation is to feed the function an associative array of characters to translate; this is the multibyte-safe way.
It should also be noted that there are libraries and native functions developed to handle such a task.
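A minimal sketch of the associative-array form described in this answer, using the sample sentence from the question. The variable names `$map` and `$plain` are illustrative only, and the source file is assumed to be saved as UTF-8 with precomposed characters.

```php
<?php
// Sample input from the question (assumed to be UTF-8).
$str = 'Kŕdeľ ďatľov učí koňa žrať kôru.';

// Array form of strtr(): every key is a complete, possibly multibyte, string,
// so matching happens on whole keys rather than on individual bytes.
$map = [
    'ľ' => 'l', 'š' => 's', 'č' => 'c', 'ť' => 't', 'ž' => 'z',
    'ý' => 'y', 'á' => 'a', 'í' => 'i', 'ŕ' => 'r', 'ď' => 'd',
    'ň' => 'n', 'ä' => 'a', 'ô' => 'o',
];

$plain = strtr($str, $map);
echo $plain, PHP_EOL; // Krdel datlov uci kona zrat koru.
```

With the three-argument form, `$from` and `$to` are paired byte by byte, so a two-byte UTF-8 character in `$from` lines up with two separate ASCII characters in `$to`; the array form avoids that mismatch.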
I believe strtr is multi-byte safe; either way, since str_replace is multi-byte safe you could wrap it:
Since there is no mb_str_split function you also need to write your own (using mb_substr and mb_strlen), or you could just use the PHP UTF-8 implementation (changed slightly):
However, if you're looking for a function to remove all (Latin?) accentuations from a string, you might find the following function useful:
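As a rough sketch of the str_replace wrapper idea from this answer (not the answerer's original code): the name `mb_strtr_sketch` is hypothetical, and the UTF-8 split here uses `preg_split` with the `/u` modifier instead of a hand-written `mb_substr`/`mb_strlen` loop. On PHP 7.4 and later the built-in `mb_str_split` could be used for the same step. The type declarations assume PHP 7 or newer.

```php
<?php
// Hypothetical helper (not a PHP built-in): character-wise translation of a
// UTF-8 string, built on str_replace().
function mb_strtr_sketch(string $str, string $from, string $to): string
{
    // With the /u modifier PCRE treats the subject as UTF-8, so the empty
    // pattern splits between code points instead of between bytes.
    $fromChars = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
    $toChars   = preg_split('//u', $to, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false when the input is not valid UTF-8.
    if ($fromChars === false || $toChars === false) {
        throw new InvalidArgumentException('$from and $to must be valid UTF-8');
    }

    // Like strtr(), ignore the surplus characters of the longer list.
    $n = min(count($fromChars), count($toChars));

    return str_replace(
        array_slice($fromChars, 0, $n),
        array_slice($toChars, 0, $n),
        $str
    );
}

$from = 'ľščťžýáíŕďňäô';
$to   = 'lsctzyairdnao';
echo mb_strtr_sketch('Kŕdeľ ďatľov učí koňa žrať kôru.', $from, $to), PHP_EOL;
// Krdel datlov uci kona zrat koru.
```

One design caveat: str_replace applies the pairs in order, so text produced by an earlier pair can be rewritten by a later one. That is harmless when accented letters map to plain ASCII, as here, but it differs from strtr with an array, which does not rescan replaced text.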