NetBeans UTF-8 encoding mess: tools to find source files by encoding and fix them
I have edited several ISO-8859-15 encoded PHP source files with NetBeans 6.7.1, but it converted them (without asking me!) to UTF-8, and I lost several German characters in the process...
I'm looking for a tool to find all the UTF-8 encoded files inside a directory (it's hard for me to tell which files have been broken).
I'd also need a tool to convert them back.
I'm trying to fix the whole thing with gedit, which recognizes and respects the charset of each file, but it won't let me save UTF-8 files as ISO-8859-15, because it says there are characters that can't be converted...
So, I need:
a tool to search for UTF-8 encoded files
an editor that allows me to go from one encoding to another
oh yes, and a way to tell NetBeans not to mess with my files!
(I have already tried editing /etc/netbeans.conf and adding -J-Dfile.encoding=UTF-8 or -J-Dfile.encoding=ISO-8859-15, with no luck.)
http://wp.uberdose.com/2007/05/07/netbeans-and-utf-8/
http://ditoinfo.wordpress.com/2007/02/26/netbeans-and-utf8-encoding-2/
Thanks a lot
Edit:
(Hmm, I've just found this:
http://wiki.netbeans.org/FaqI18nProjectEncoding
which says how to modify the character encoding for a project. I'll give it a try.
It also explains the mess NetBeans made:
"For a new IDE installation, UTF-8 encoding is the default for new projects, as this encoding can handle any Unicode characters, making it the best choice for most people. When you create a new project, the IDE initially defaults to giving it the same encoding as the last project on which you set the encoding. If you want another encoding, just change it in the properties dialog."
I created a new project from existing PHP sources, so I guess that's what went wrong...)
Comments (1)
I'll take your points in order:
If you know a bit of Python, I recommend looking at decodeh.py. It uses a lowest-common-denominator strategy, so ISO-8859-15 files might be recognized as ISO-8859-1 if none of their characters lies outside the ISO-8859-1 range. I have tried it on UTF-8, ISO-8859-1 and ISO-8859-15 files and it is mostly correct. It uses the byte order mark and heuristics to guess the encoding.
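For the first point (finding the UTF-8 files in a directory), here is a minimal sketch of one such heuristic. It is not decodeh.py's code or API; the function names and the ".php" suffix are assumptions for illustration. It relies on the fact that ISO-8859-15 text with accented characters almost never happens to be valid UTF-8, so a file that contains non-ASCII bytes and still decodes strictly as UTF-8 is very likely UTF-8. Pure ASCII files are skipped because they are byte-identical in both encodings.

```python
import os
from typing import List


def looks_like_utf8(data: bytes) -> bool:
    """Return True if data contains non-ASCII bytes and is valid UTF-8."""
    if data.isascii():
        # Pure ASCII is the same in UTF-8 and ISO-8859-15; nothing to fix.
        return False
    try:
        data.decode("utf-8")  # strict decoding: raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True


def find_utf8_files(root: str, suffix: str = ".php") -> List[str]:
    """Walk root and return the sorted paths of files that look like UTF-8."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:  # read raw bytes, no decoding
                if looks_like_utf8(fh.read()):
                    hits.append(path)
    return sorted(hits)
```

Calling find_utf8_files on the project directory and printing the result gives the list of files to review. This is only a guess based on byte patterns, so it is worth checking a few of the reported files by eye.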
The answer to this is easy: Emacs. Use M-x describe-current-coding-system to see what Emacs thinks the encoding of the file is. Emacs has never failed me in this respect. Use M-x set-buffer-file-coding-system to set the encoding the buffer should be written to a file with.
For NetBeans 6.7, you can configure the default encoding on a per-project basis. With a project open, go to File->Project Properties, choose Sources in the left menu, and at the bottom of the right-hand panel you'll see a dropdown box titled 'Encoding'.
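If there are many files, the conversion back to ISO-8859-15 can also be scripted instead of done by hand in an editor. The following is a rough sketch under stated assumptions: the function names are made up for illustration, and the files are assumed to be valid UTF-8. Like gedit, it refuses to write a file that contains characters ISO-8859-15 cannot represent, but it returns those characters so they can be repaired by hand first.

```python
from typing import List


def unconvertible_chars(text: str, encoding: str = "iso-8859-15") -> List[str]:
    """Return the distinct characters of text that encoding cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def convert_file(path: str, source: str = "utf-8",
                 target: str = "iso-8859-15") -> List[str]:
    """Re-encode the file at path from source to target, in place.

    Returns the characters that cannot be converted. The file is only
    rewritten when that list is empty, so nothing is lost silently.
    """
    with open(path, "rb") as fh:
        text = fh.read().decode(source)  # raises if the file is not valid source
    bad = unconvertible_chars(text, target)
    if not bad:
        with open(path, "wb") as fh:  # bytes mode keeps line endings untouched
            fh.write(text.encode(target))
    return bad
```

Working on a copy of the sources (or under version control) is advisable, since the rewrite happens in place.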