PHP iconv_strlen question
What does it mean when iconv_strlen fails on bad character sequences? Specifically, what is a "character sequence"? Thanks.
Comments (1)
A character sequence is a series of bytes. When using UTF-8, not all combinations of bytes are valid.

The byte sequence \xc2\xbc forms the Unicode character U+00BC, which is the VULGAR FRACTION ONE QUARTER symbol (¼), when using UTF-8 encoding.

The byte sequence \xe2\x88\x9c forms the Unicode character U+221C, which is the FOURTH ROOT symbol (∜), when using UTF-8 encoding.

A bad character sequence for UTF-8 encoding is any byte combination that doesn't fit the required schema for UTF-8 byte streams. For example, the byte sequence \xbc\xbc is illegal because a two-byte character must have 110xxxxx in the first byte, but \xbc is 10111100 written as bits. The 10xxxxxx pattern marks a continuation byte, which cannot start a character.
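
To see the difference in practice, here is a minimal sketch that runs the three byte sequences above through iconv_strlen. The helpers utf8_length and has_two_byte_lead_pattern are hypothetical names made up for this illustration, not part of PHP. The sketch assumes the iconv extension is loaded; the exact notice text, and how strictly input is validated, can depend on the underlying iconv implementation.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the number of
// characters in $bytes when read as UTF-8, or null if iconv_strlen
// rejects the input as an illegal sequence.
function utf8_length(string $bytes): ?int
{
    // iconv_strlen returns false and raises an E_NOTICE on bad input;
    // the @ operator suppresses that notice for this demo.
    $length = @iconv_strlen($bytes, 'UTF-8');
    return $length === false ? null : $length;
}

// Hypothetical helper: checks only the bit pattern 110xxxxx that a
// two-byte lead byte must have. It does not fully validate the byte.
function has_two_byte_lead_pattern(int $byte): bool
{
    return ($byte & 0xE0) === 0xC0;
}

$samples = [
    'one quarter' => "\xc2\xbc",     // U+00BC
    'fourth root' => "\xe2\x88\x9c", // U+221C
    'bad bytes'   => "\xbc\xbc",     // starts with a continuation byte
];

foreach ($samples as $label => $bytes) {
    $length = utf8_length($bytes);
    printf(
        "%-12s bytes=%d first=%08b chars=%s\n",
        $label,
        strlen($bytes),          // strlen counts bytes, not characters
        ord($bytes[0]),          // first byte written as bits
        $length === null ? 'invalid' : $length
    );
}
```

Each of the two valid sequences counts as one character even though strlen reports two and three bytes for them. For \xbc\xbc, utf8_length is expected to return null, which is the "fails on bad character sequences" case from the question.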