Tool to convert source code from a code page to UTF-8?
I'm working on an open source project. The original project contains comments in Russian and uses code page 1251. I'm using code page 1252, so the Russian comments aren't displayed correctly in Visual Studio Express 2008. That's not nice, but I can't read Russian anyway. Someone using code page 950 (Traditional Chinese) tried to compile the project and couldn't, because of the code page! Now it is really annoying.
I think that using Unicode (more precisely, UTF-8 with signature) as the file format for the source code is the way to go.
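To make the problem concrete, here is a minimal Python sketch (not part of the original project) of what happens to a Russian comment. The sample word is made up for illustration. It shows the same bytes read under code pages 1251 and 1252, and what "UTF-8 with signature" means at the byte level, assuming Python's `utf-8-sig` codec is an acceptable stand-in for what Visual Studio writes.

```python
import codecs

# Hypothetical sample text standing in for a Russian source comment.
original = "Привет"

# The bytes as they are stored in a file saved with code page 1251.
raw = original.encode("cp1251")

# Read with the right code page, the text round-trips.
as_1251 = raw.decode("cp1251")

# Read with code page 1252, the same bytes turn into unrelated
# accented Latin letters (mojibake).
as_1252 = raw.decode("cp1252")

# "UTF-8 with signature" is UTF-8 preceded by the 3-byte BOM EF BB BF.
with_signature = as_1251.encode("utf-8-sig")

print(raw.hex(" "))
print(ascii(as_1252))
print(with_signature[:3] == codecs.BOM_UTF8)
```

Once the file carries the signature, a reader no longer has to guess the code page from the machine's regional settings.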
Problem: how to convert the whole source code easily?
I have already thought about:
Letting Visual Studio save the source code as UTF-8. But my computer is using code page 1252, and I found no way to tell VS that the original source code uses code page 1251, so the conversion wouldn't be correct.
Edit: As pointed out by "LicenseQ", there is a way to open a single file in VS with another encoding: click the arrow next to the Open button in the Open dialog, choose "Open With", and then choose "Code Editor (with encoding)".
Of course, I could change the code page of my computer for the duration of the conversion. But it's a global setting in Windows and it requires a reboot, so I'm looking for a friendlier solution.
I've found a tool called CodePageConverter which does exactly what I need, but it can't be run as a batch job.
Does anyone know another tool (a command line tool would be perfect) to convert from a codepage to UTF-8?
Edit: As suggested by tkotitan, iconv seems to be the solution I was looking for. There is a Windows version of iconv. And now that I know the name of this tool, I was able to find other posts on Stack Overflow dealing with similar issues.
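For readers who would rather not install iconv, the batch conversion can also be sketched in a few lines of Python. This uses Python's built-in codecs instead of iconv. The function names, the list of file extensions, and the choice to skip files that already start with a UTF-8 signature are all assumptions made for illustration, not something taken from the project. Try it on a copy of the source tree first, since it rewrites files in place.

```python
import codecs
from pathlib import Path

SOURCE_ENCODING = "cp1251"     # code page of the original files
TARGET_ENCODING = "utf-8-sig"  # UTF-8 with signature (BOM)
# Assumed set of source file extensions; adjust to the project.
EXTENSIONS = {".h", ".hpp", ".c", ".cpp"}


def convert_file(path, source_encoding=SOURCE_ENCODING):
    """Rewrite one file as UTF-8 with signature.

    Returns False if the file already starts with a UTF-8 BOM and was
    left alone. Raises UnicodeDecodeError if the bytes are not valid in
    the source encoding, in which case the file is left unchanged.
    """
    raw = path.read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        return False
    # Work on bytes so that line endings are preserved exactly.
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode(TARGET_ENCODING))
    return True


def convert_tree(root):
    """Convert every matching file below root; return the converted paths."""
    converted = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in EXTENSIONS:
            if convert_file(path):
                converted.append(path)
    return converted
```

Calling `convert_tree("path/to/project")` (a placeholder path) converts the tree in one go. Because already-signed files are skipped, running it a second time changes nothing.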
Comments (2)
In the Unix world, the utility is called iconv.
I'm not sure whether there is a Windows equivalent.
You can ask VS 2008 to open a file with a specific encoding (click the arrow next to the Open button in the Open dialog).
Or you can change your regional settings to make Russian the default ;)