rawurldecode UTF-8 problem and the browser address bar
I'm having a problem with rawurldecode and Turkish characters.
I have a Turkish word (yeşil, which means green) that needs to be passed as a GET parameter.
Here is the link I generate:
search.php?renk=ye%C5%9Fil
When I click this link, the browser's address bar shows it like this (it is decoded properly):
search.php?renk=yeşil
This is where the problem starts. When I modify the URL in the address bar (for example, by adding an extra GET parameter) and hit Enter, the browser re-encodes the keyword and requests a URL like this:
search.php?renk=ye%FEil
From that point on, the server-side code doesn't handle the parameter correctly and produces wrong results. Is there a standard way to avoid this?
Thanks.
Comments (2)
It looks like your browser is converting the link to ISO-8859-9 or a similar encoding.
%FE is the URL-encoded form of ş in ISO-8859-9.
I tried
iconv("iso8859-9", "utf-8", rawurldecode("search.php?renk=ye%FEil"))
and it worked.
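The conversion above can be made conditional, so that values that already arrive as UTF-8 are left alone. The following is only a minimal sketch: the helper name toUtf8 and the choice of ISO-8859-9 as the fallback charset are assumptions for illustration, and it cannot tell ISO-8859-9 apart from other single-byte encodings.

```php
<?php
// Hypothetical helper (not from the original posts): return $value as UTF-8.
// Assumption: any input that is not valid UTF-8 was sent as ISO-8859-9.
function toUtf8(string $value, string $fallback = 'ISO-8859-9'): string
{
    // With the /u modifier, PCRE rejects subjects that are not valid UTF-8,
    // so a return value of 1 means the bytes are already well-formed UTF-8.
    if (preg_match('//u', $value) === 1) {
        return $value;
    }

    // Otherwise reinterpret the bytes in the fallback charset.
    $converted = iconv($fallback, 'UTF-8', $value);

    // If iconv fails, hand back the original bytes unchanged.
    return $converted === false ? $value : $converted;
}

// The value as sent by the generated link (UTF-8 bytes C5 9F for "ş").
$fromLink = toUtf8(rawurldecode('ye%C5%9Fil'));

// The value as re-encoded by the address bar (byte FE for "ş").
$fromAddressBar = toUtf8(rawurldecode('ye%FEil'));

// Both forms should normalise to the same UTF-8 string.
var_dump($fromLink === $fromAddressBar);
```

In a real script the same helper would be applied to the already-decoded value, for example $_GET['renk'], rather than to a hard-coded string.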
URLs always use US-ASCII!
See the RFC: http://www.ietf.org/rfc/rfc1738.txt
Once you put non-ASCII characters into a URL, you run into a lot of problems.
If you paste or type a URL into the browser, the address field sometimes relies on the OS locale, and the browser may convert the characters.
Sometimes firewalls and proxies filter URLs too!
The next important questions are:
How does the web server interpret those high characters?
How does it pass them on to PHP (this depends on the gateway)? PHP decodes URLs automatically, so what happens to your high characters there? PHP doesn't take care of the encoding.
In my opinion there is only one solution that is safe:
Encode your Unicode string as a Base64-encoded string.
This will be safe within the URL, because it is ASCII.
Within your script you can decode it, and you have it back in the encoding you used before.
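The Base64 idea could be sketched roughly as follows. One caveat: standard Base64 output may contain +, / and =, which have their own meaning in URLs, so this sketch uses the URL-safe alphabet from RFC 4648 (- and _ instead, padding stripped). The function names encodeParam and decodeParam are hypothetical, chosen only for this illustration.

```php
<?php
// Hypothetical helper: encode a UTF-8 string so it contains only
// the characters A-Z, a-z, 0-9, "-" and "_".
function encodeParam(string $utf8): string
{
    // Swap the two URL-unsafe Base64 characters and drop "=" padding.
    return rtrim(strtr(base64_encode($utf8), '+/', '-_'), '=');
}

// Hypothetical helper: reverse encodeParam(); returns null on invalid input.
function decodeParam(string $param): ?string
{
    // Restore the standard alphabet and re-add the stripped padding.
    $b64 = strtr($param, '-_', '+/');
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);

    // Strict mode rejects characters outside the Base64 alphabet.
    $decoded = base64_decode($b64, true);

    return $decoded === false ? null : $decoded;
}

// "yeşil" written as explicit UTF-8 bytes, independent of the file encoding.
$word = "ye\xC5\x9Fil";

// Build the link with a pure-ASCII parameter value.
$link = 'search.php?renk=' . encodeParam($word);
echo $link, "\n";

// On the receiving side the original bytes come back unchanged.
var_dump(decodeParam(encodeParam($word)) === $word);
```

The trade-off is that the parameter is no longer readable in the address bar, which is also why the browser has nothing to re-encode when the URL is edited by hand.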