PHP readdir problem with Japanese filenames
I have the following code
<?php
if ($handle = opendir('C:/xampp/htdocs/movies')) {
    while (false !== ($file = readdir($handle))) {
        if ($file != "." && $file != "..") {
            echo $file."<br />\n";
        }
    }
    closedir($handle);
}
?>
When a filename contains multibyte characters such as Japanese, it doesn't display properly: it shows up as kyuukyoku Choujin R ?????~? rather than kyuukyoku Choujin R 究極超人あ~る
Is there any way to make it display the correct name, or at least keep the file downloadable by others?
Thanks for helping me :)
Comments (6)
I can't speak definitively for PHP, but I suspect it's the same basic problem that Python 2 had (before it later added special support for Unicode string filenames).
My belief is that PHP is dealing with filenames using the standard C library ‘open’-et-al functions, which are byte-based. On Windows (NT) these try to encode the real Unicode filename using the system codepage. That might be cp1252 (similar to ISO-8859-1) for Western machines, or cp932 (similar to Shift-JIS) on Japanese machines. For any characters that don't exist in the system codepage you will get a ‘?’ character, and you'll be unable to refer to that file.
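The '?' substitution described above can be reproduced without touching the filesystem. This is only a rough sketch: it assumes the mbstring extension is loaded and that the script is saved as UTF-8, and `to_code_page` is a hypothetical helper name, not something PHP provides. It mimics the lossy conversion with mbstring rather than calling the Windows APIs themselves.

```php
<?php
// Sketch: squeeze a Unicode name into a single-byte code page and watch
// the unrepresentable characters turn into '?'.
// Assumptions: mbstring is loaded and this file is saved as UTF-8.

// Hypothetical helper, not part of PHP.
function to_code_page(string $utf8Name, string $codePage = 'Windows-1252'): string
{
    // Use '?' (0x3F) for characters the target encoding cannot represent.
    mb_substitute_character(0x3F);
    return mb_convert_encoding($utf8Name, $codePage, 'UTF-8');
}

// The Japanese characters have no Windows-1252 equivalent, so each one is
// replaced; the ASCII '~' survives.
echo to_code_page('kyuukyoku Choujin R 究極超人あ~る'), "\n";
```

Once the name has been reduced like this, the original characters are gone, which is why no later conversion of the string can bring them back.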
To get around this problem PHP would have to do the same as Python 3.0 and start using Unicode strings for filenames (and everything else), using the ‘_wopen’-et-al functions to get native-Unicode access to the filenames under Windows. I expect this will happen in PHP6, but for the moment you're probably pretty much stuffed. You could change the system codepage to cp932 to get access to the filenames, but you'd still get ‘?’ characters for any other Unicode characters not in Shift-JIS, and in any case you really don't want to make your application's internal strings all Shift-JIS as it's quite a horrible encoding.
If it's your own scripts choosing how to store files, I'd strongly suggest using simple primary-key-based filenames like ‘4356’ locally, putting the real filename in a database, and serving the files up using rewrites/trailing path parts in the URL. Keeping user-supplied filenames in your own local filenames is difficult and a recipe for security disasters even without having to worry about Unicode.
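A minimal sketch of that suggestion, under stated assumptions: the `$files` array stands in for a database table, the ID and the `.avi` name are made up, and `content_disposition` and `local_path` are hypothetical helpers. The display name is sent in the `filename*=UTF-8''...` form of the Content-Disposition header, so the file on disk only ever has an ASCII name.

```php
<?php
// Sketch of the "ID on disk, real name in a database" idea.
// $files stands in for a database table; the ID and name are made up.
$files = [
    4356 => 'kyuukyoku Choujin R 究極超人あ~る.avi',
];

// Hypothetical helper: builds a Content-Disposition value that carries the
// UTF-8 display name in percent-encoded form.
function content_disposition(string $displayName): string
{
    return "attachment; filename*=UTF-8''" . rawurlencode($displayName);
}

// Hypothetical helper: maps an ID to an ASCII-only local path.
function local_path(int $id, string $baseDir = 'C:/xampp/htdocs/movies'): string
{
    return $baseDir . '/' . $id;
}

// A download script might then do something like:
//   header('Content-Disposition: ' . content_disposition($files[4356]));
//   readfile(local_path(4356));
echo content_disposition($files[4356]), "\n";
```

Because the local path is built from an integer, user-supplied names never reach the filesystem, which also sidesteps the security concerns mentioned above.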
As @bobince mentioned, PHP returns filenames in the encoding specified by the System Locale, which is the one used by non-Unicode-aware applications. If a character doesn't exist in the current system encoding, the filename will contain '?' in its place and the file will not be accessible.
You can try installing php-wfio.dll from https://github.com/kenjiuno/php-wfio and referring to files via the wfio:// protocol.
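Here is a rough sketch of how the directory listing from the question might look with that wrapper. It is untested against the extension itself: it assumes php-wfio registers a stream wrapper named `wfio`, that `opendir`/`readdir` work through it, and that names come back as UTF-8. `list_names` is a hypothetical helper, and it falls back to the plain path when the wrapper is missing.

```php
<?php
// Sketch only. Assumptions: the php-wfio extension registers a "wfio"
// stream wrapper, directory functions work through it, and names read
// through it are UTF-8.

// Hypothetical helper, not part of PHP or php-wfio.
function list_names(string $dir): array
{
    // Fall back to the plain path when the wrapper is not available.
    $prefix = in_array('wfio', stream_get_wrappers(), true) ? 'wfio://' : '';
    $names = [];
    if ($handle = opendir($prefix . $dir)) {
        while (false !== ($file = readdir($handle))) {
            if ($file !== '.' && $file !== '..') {
                $names[] = $file;
            }
        }
        closedir($handle);
    }
    return $names;
}

// Lists the script's own directory; the question's directory would be
// 'C:/xampp/htdocs/movies'.
foreach (list_names(__DIR__) as $name) {
    echo htmlspecialchars($name), "<br />\n";
}
```

If the names really are UTF-8, the page also needs to declare UTF-8 as its charset for them to render correctly.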
You missed two other references to the $file variable, mate, but that's for the better as I think I may've discovered a slightly more efficient method; give this a try:
Sorry :)
Try this:
<?php
if ($handle = opendir('C:/xampp/htdocs/movies')) {
    while (false !== ($file = readdir($handle))) {
        // Skip "." and ".." before converting; the converted string
        // would never compare equal to them.
        if ($file != "." && $file != "..") {
            // Assumes readdir() returned ISO-8859-1 bytes.
            $filename_utf16 = iconv("iso-8859-1", "utf-16", $file);
            echo $filename_utf16 . "<br />\n";
        }
    }
    closedir($handle);
}
?>
Replace any instance of $file with mb_substr($file, mb_strrpos($file, '/') + 1) and you should be good to go. Huzzah for multi-byte encoding!
I think Windows uses UTF-16 for file names. So try the mb_convert_encoding function to convert from the internal encoding to your output encoding. Maybe you have to change some settings first (see mb_get_info).
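A small sketch of what such a conversion could look like. Going by the other answers, the bytes returned by readdir are in the system code page rather than UTF-16, so the source encoding here is an assumption: `CP932` is a guess for a Japanese Windows install, and `name_to_utf8` is a hypothetical helper. It requires the mbstring extension.

```php
<?php
// Sketch: convert a name from an assumed system code page to UTF-8 for
// output. Assumptions: mbstring is loaded, and 'CP932' matches the
// system code page (true only on a Japanese-locale Windows install).

// Hypothetical helper, not part of PHP.
function name_to_utf8(string $rawName, string $systemEncoding = 'CP932'): string
{
    return mb_convert_encoding($rawName, 'UTF-8', $systemEncoding);
}

// "\x82\xA0" is the CP932 byte sequence for the hiragana "あ".
echo name_to_utf8("\x82\xA0"), "\n";
```

This only helps when the system code page can represent the characters in the first place; a name that already came back with '?' in it cannot be repaired this way.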