PHP - Can I use mb_detect_encoding() + iconv() to convert a string to UTF-8?
All,
I am building a small web app that will accept user-generated content. This content will be uploaded via an enctype="multipart/form-data" form and a browse button.
As I understand mb_detect_encoding(), it cannot distinguish between character sets that are subsets of each other if the string doesn't include characters outside the overlap. That makes sense.
My question is: does this matter? If I use mb_detect_encoding() to get PHP's best guess, and then use that best guess as the input encoding when converting to UTF-8 with iconv(), am I going to run into trouble, and if so, why?
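To make the approach concrete, here is a minimal sketch of the detect-then-convert step. The helper name to_utf8() and the candidate list ('UTF-8', 'ISO-8859-1') are assumptions chosen for illustration only; strict mode is passed to mb_detect_encoding() so that invalid byte sequences rule a candidate out.

```php
<?php
// Hypothetical helper (not part of PHP): guess the input encoding,
// then convert the bytes to UTF-8 using that guess.
function to_utf8(string $raw, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    // Third argument true = strict detection.
    $guess = mb_detect_encoding($raw, $candidates, true);
    if ($guess === false) {
        throw new RuntimeException('Could not detect an encoding');
    }

    // iconv() returns false on failure, so check before returning.
    $converted = iconv($guess, 'UTF-8', $raw);
    if ($converted === false) {
        throw new RuntimeException("iconv() failed converting from $guess");
    }
    return $converted;
}
```

For an uploaded file, this would be called on the file's contents, e.g. to_utf8(file_get_contents($_FILES['upload']['tmp_name'])), where 'upload' is an assumed form field name.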
In other words, if mb_detect_encoding() comes up with the wrong encoding on a small string that is entirely inside the overlap of 2 charsets, will I get a different result from iconv() than I would if I had used the proper input encoding in the function?
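The overlap case can be checked directly. The sketch below converts the same bytes under two assumed input encodings, ISO-8859-1 and CP1252 (picked here only as an example pair with a large shared range; convert_as() is a hypothetical helper). For bytes inside the overlap the two results should match; for a byte outside it, such as 0x80, they should differ.

```php
<?php
// Hypothetical helper: convert $bytes to UTF-8, treating them as $assumed.
function convert_as(string $bytes, string $assumed): string
{
    $out = iconv($assumed, 'UTF-8', $bytes);
    if ($out === false) {
        throw new RuntimeException("iconv() failed for $assumed");
    }
    return $out;
}

// Plain ASCII lies inside the overlap of the two charsets,
// so the assumed input encoding makes no difference.
$overlap = "Hello, world";
$sameInOverlap = convert_as($overlap, 'ISO-8859-1') === convert_as($overlap, 'CP1252');

// Byte 0x80 lies outside the overlap: it is a C1 control character (U+0080)
// in ISO-8859-1 but the euro sign (U+20AC) in CP1252.
$outside = "\x80";
$asLatin1 = bin2hex(convert_as($outside, 'ISO-8859-1'));
$asCp1252 = bin2hex(convert_as($outside, 'CP1252'));
$differsOutside = $asLatin1 !== $asCp1252;
```

If both flags come out true, a wrong guess is harmless only as long as every byte of the input stays inside the shared range.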
EDIT: I rewrote the question to specifically address text files uploaded via browse buttons.