Migrating web pages from different character sets to UTF-8
For the last few years I have used Notepad++ on Win XP SP2.
As I have just seen, Notepad++ is set to encode new files as "ANSI" in "Windows Format". So basically all files on my hard disk should be ANSI files, but I'm not sure.
Most .html files declare their charset as "text/html; charset=iso-8859-1", but some declare none.
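To see which pages declare a charset and which do not, a small check along these lines could help. This is only a sketch: the helper name `declared_charset` is hypothetical, and it simply looks for a `charset=...` declaration near the top of the file rather than parsing the HTML properly.

```python
import re
from typing import Optional

# Matches charset=iso-8859-1, charset="utf-8", charset = 'utf-8', etc.
_CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def declared_charset(html: bytes) -> Optional[str]:
    """Return the charset named in the first few KB of an HTML file, or None."""
    match = _CHARSET_RE.search(html[:4096])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

Note that the declared charset only says what the page claims to be, not how the bytes on disk are actually encoded.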
For other files, especially text files (for example keyword lists) that I stored with the Firefox XPCOM system, I don't know how they are currently encoded.
On the server side I have Apache with PHP and MySQL.
For uploads I used FileZilla.
Now the problem is: I want to use Japanese characters (or Arabic, etc.). This only partly works.
I can get my self-made Firefox application to consistently read and write UTF-8. But I can't check every time which encoding each of the old files uses.
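Checking the old files does not have to be done by hand. The following is a minimal sketch in Python, under two assumptions: that the "ANSI" files from Notepad++ are Windows-1252 (`cp1252`), and that anything which is not valid UTF-8 falls into that group. The function name `guess_encoding` is made up for illustration.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Classify raw file bytes as UTF-8 with BOM, ASCII, UTF-8, or the fallback."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"          # UTF-8 with a byte order mark
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback             # assumption: legacy single-byte "ANSI" file
    if all(byte < 0x80 for byte in data):
        return "ascii"              # identical in cp1252, ISO-8859-1 and UTF-8
    return "utf-8"
```

Pure ASCII files need no conversion at all, since their bytes are the same in every encoding involved here. The fallback is a guess: a file that fails UTF-8 decoding could also be in some other legacy encoding.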
Reading Joel Spolsky's old article about UTF-8 has strengthened my view that I simply have to switch my whole system to UTF-8 as far as possible.
Once I have it running that way locally on my hard disk, I could just re-upload everything to the server.
So: how do I get all my local files converted to UTF-8?
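One possible approach is a batch script. The sketch below is in Python and rests on assumptions not confirmed anywhere in this post: that the legacy files are Windows-1252, that only the listed file extensions matter, and that a file which already decodes as UTF-8 should be left alone. The name `convert_tree` and its parameters are hypothetical.

```python
from pathlib import Path


def convert_tree(root, source_encoding="cp1252",
                 suffixes=(".html", ".htm", ".txt", ".php")):
    """Re-encode matching files under root to UTF-8 (no BOM), in place.

    Returns two lists of paths: (converted, skipped).
    """
    converted, skipped = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            skipped.append(path)    # already valid UTF-8 (or pure ASCII)
            continue
        except UnicodeDecodeError:
            pass
        try:
            text = raw.decode(source_encoding)
        except UnicodeDecodeError:
            skipped.append(path)    # not the assumed encoding either; leave it
            continue
        # Write bytes directly so line endings are not touched.
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted, skipped
```

It would be wise to run this on a backup copy first. The script does not touch any `charset=iso-8859-1` declarations inside the HTML, so those would still have to be changed separately.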
And: is it possible at all to have Win XP SP2 consistently use UTF-8 everywhere? Or do I have to check for every program, or even worse for every file, that the right encoding is being used?
What about files I receive, for example, by e-mail or on a USB stick, or that I download in zip files? (Or in a thousand other ways.)
Update:
Steps 1 to 4 went OK so far. I first tried with a BOM, but without one seems to work better.
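For reference, the UTF-8 BOM is the three bytes `EF BB BF` at the very start of a file. If some files were already saved with one, a helper like the following could remove it again. This is a sketch only; `strip_utf8_bom` is a hypothetical name.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

In PHP files in particular, a BOM ends up in front of the opening `<?php` tag and is sent to the browser as output, which is one reason saving without it tends to cause less trouble.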
On to 5.): I have to change something there too. As in 3.), I changed the charset in the HTML template file, and the text coming from the template is displayed correctly. But the text coming from MySQL/PHP currently shows the unknown-character sign in some places, namely where there should be umlauts (äöü).
I have changed all collations for text fields in the MySQL database via phpMyAdmin to "utf8_unicode_ci", but that didn't do the trick.
Is it a PHP issue, or do I only have to convert the data in the MySQL database once somehow?
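One way this symptom can arise is sketched below in Python, purely as an illustration. It assumes that the text reaches the page as Latin-1 bytes (for example because the database connection still uses latin1) while the page is declared as UTF-8; whether that is the case here is not confirmed. `render_as_utf8_page` is a hypothetical helper that mimics how a browser decodes the page.

```python
def render_as_utf8_page(raw: bytes) -> str:
    """Decode bytes the way a browser would for a page declared as UTF-8."""
    # Invalid byte sequences become U+FFFD, the "unknown character" sign.
    return raw.decode("utf-8", errors="replace")


latin1_bytes = "äöü".encode("latin-1")   # what a latin1 connection would deliver
utf8_bytes = "äöü".encode("utf-8")       # what a UTF-8 page actually needs

broken = render_as_utf8_page(latin1_bytes)
intact = render_as_utf8_page(utf8_bytes)
```

Here `broken` contains replacement characters instead of the umlauts, while `intact` is the original text. If this is what happens, the stored data may be fine and the connection charset would be the thing to check first, for example with MySQL's `SET NAMES` statement.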