Using the UTF-8 charset in PHP
I have been struggling with the UTF-8 charset for quite a while now, and I am still confused about some things.
I have a web page that allows clients to create HTML files and directories on the server. The directory name can be in any language: Adiós, días, chapter, level, etc. The directories created are later used as the URL for the HTML files created. Let's say the user created a directory Adiós and then a file called welcome.html. To view this file, the client clicks a link, and for that I take the directory and file name to build the path Adiós/welcome.html. Now I am confused about these things:
1. When making the directory in PHP, should I urlencode() every file and directory name?
2. If I do urlencode() the directory name, will the browser be able to open my HTML page? Instead of href="Adiós/welcome.html" it will be href="Adi%C3%B3s/welcome.html".
3. There is sometimes an image on my web page, which I will reference with src="Adi%C3%B3s/ing.jpg"; is this going to work?
4. Should the URL in the address bar show non-ASCII characters?
I actually urlencode()d everything but ran into the issues described in points 2 and 3, so I wanted to know what the right approach is for directory naming when working with languages other than English.
If you save the names urlencoded in the filesystem, you must double-urlencode the links and image sources if you want to access them directly, bypassing PHP. Alternatively, you could save the names without any urlencoding, in which case the links need only one encoding pass. However, this last option isn't available on Windows with PHP versions before 7.1, where the filesystem functions do not support Unicode.
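To illustrate the "one pass" versus "double urlencode" distinction, here is a minimal sketch. It assumes the names are stored on disk as raw UTF-8 and that this source file is itself saved as UTF-8; encodePathForUrl() is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: percent-encode each segment of a relative path.
// Assumes the path uses "/" as separator and is UTF-8 encoded.
function encodePathForUrl(string $path): string
{
    // Encode segment by segment so the "/" separators survive.
    $segments = array_map('rawurlencode', explode('/', $path));
    return implode('/', $segments);
}

// Name stored raw on disk: one encoding pass is enough for the link.
$href = encodePathForUrl('Adiós/welcome.html');

// htmlspecialchars() protects the attribute value in the HTML output.
echo '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">welcome</a>', PHP_EOL;
```

If the directory had instead been created with the urlencoded name Adi%C3%B3s, the same helper applied to that stored name yields Adi%25C3%25B3s, which is the double encoding needed for the server to find the file after it decodes the URL once.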
Alternatively, if you still want to bypass PHP, you can use rewrite rules to re-encode the names once they have been URL-decoded by Apache.
Finally, you should note that your approach is dangerous: it is difficult to get right without security implications. You should consider having a single PHP file serve your pages and saving them in a database. You could still keep pretty filenames by using the PATH_INFO variable. You could also add a caching layer in front of PHP if performance becomes an issue with this solution.
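A rough sketch of the single-file idea follows, under several assumptions: the script is reachable as something like /index.php/Adi%C3%B3s/welcome.html, the web server hands PHP an already URL-decoded PATH_INFO, and a plain array stands in for the database. The function names pageKeyFromPathInfo() and respond() are made up for this example.

```php
<?php
// Turn PATH_INFO into a lookup key, or return null if it looks unsafe.
function pageKeyFromPathInfo(string $pathInfo): ?string
{
    $key = trim($pathInfo, '/');
    // Reject empty keys, invalid UTF-8, and control characters (including NUL).
    if ($key === '' || preg_match('//u', $key) !== 1 || preg_match('/[\x00-\x1F\x7F]/', $key)) {
        return null;
    }
    // Reject empty, "." and ".." segments so a key cannot climb out of its tree.
    foreach (explode('/', $key) as $segment) {
        if ($segment === '' || $segment === '.' || $segment === '..') {
            return null;
        }
    }
    return $key;
}

// Return [status code, body] for a request; $pages stands in for the database.
function respond(string $pathInfo, array $pages): array
{
    $key = pageKeyFromPathInfo($pathInfo);
    if ($key !== null && isset($pages[$key])) {
        return [200, $pages[$key]];
    }
    return [404, 'Not found'];
}

// Stand-in data keyed by the "pretty" path (assumes this file is saved as UTF-8).
$pages = ['Adiós/welcome.html' => '<h1>Welcome</h1>'];

[$status, $body] = respond($_SERVER['PATH_INFO'] ?? '', $pages);
http_response_code($status);
echo $body, PHP_EOL;
```

Because the user-supplied name is only ever used as a lookup key, it never reaches a filesystem function, which is where most of the security risk in the original approach comes from.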
/tülüvkrü.htm, I don't know how MS IE handles such things. Example: http://tülüvkrü.de/中华人民共和国.htm (should display "It works!")
That's the wrong idea.
Store their files in the database and emulate the directory structure as well.
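One way this could look, as a sketch only: each page is a row keyed by its full "path", and a directory is just a shared prefix. The example assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in; the table, column, and function names are hypothetical.

```php
<?php
// In-memory SQLite database as a stand-in for the real one (assumption).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE pages (path TEXT PRIMARY KEY, html TEXT NOT NULL)');

// Create or overwrite the page stored under the given path.
function savePage(PDO $db, string $path, string $html): void
{
    $stmt = $db->prepare('INSERT OR REPLACE INTO pages (path, html) VALUES (?, ?)');
    $stmt->execute([$path, $html]);
}

// Return the stored HTML, or null if no page exists under that path.
function loadPage(PDO $db, string $path): ?string
{
    $stmt = $db->prepare('SELECT html FROM pages WHERE path = ?');
    $stmt->execute([$path]);
    $html = $stmt->fetchColumn();
    return $html === false ? null : $html;
}

// Emulate a directory listing: every path that starts with "<dir>/".
function listDirectory(PDO $db, string $dir): array
{
    $prefix = rtrim($dir, '/') . '/';
    // Prefix comparison via substr() avoids having to escape LIKE wildcards.
    $stmt = $db->prepare(
        'SELECT path FROM pages WHERE substr(path, 1, length(?)) = ? ORDER BY path'
    );
    $stmt->execute([$prefix, $prefix]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Usage (assumes this file is saved as UTF-8).
savePage($db, 'Adiós/welcome.html', '<h1>Welcome</h1>');
echo loadPage($db, 'Adiós/welcome.html'), PHP_EOL;
```

The names are bound as query parameters, so no encoding of the directory name is needed for storage; percent-encoding only comes into play when the path is written into a link.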
EDIT
Because of these silly accusations in the comments, I have to clarify:
I am talking about this very case of HTML files with fancy names in particular, not about binary files in general.
Satisfied?