What encoding does InstallShield expect for non-Latin string table entries?
I work on an app that gets distributed via a single installer containing multiple localizations. The build process includes a script that updates the .ism string table with translations for each supported language.
This works fine for languages like French and German. But when testing the installer in, e.g., Japanese, the text shows up as a series of squares. It's unlikely to be a font problem, since the InstallShield-supplied strings show up fine; only the string table entries are mangled. So the problem seems to be that the strings are in the wrong encoding.
The .ism is in XML format, with UTF-8 declared as its encoding, so I assumed the strings needed to be UTF-8 encoded as well. Do they actually need to use the encoding of the target platform? If so, is there any concern about targets having different encodings, e.g. Chinese systems using one GB encoding versus another? What is the right thing to do here?
Edit: Using InstallShield 2009, since there is apparently a difference between that and 2010.
Comments (3)
In InstallShield 2009 and earlier, each string is stored as a base-64 encoding of the binary string in the ANSI encoding specific to the language in question (e.g. CP932 for Japanese). InstallShield 2010 and later still accepts that format, or uses UTF-8, depending on other columns in that table.
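To make the pre-2010 scheme described above concrete, here is a minimal Python sketch. The helper names and the sample text are made up for illustration, and it assumes the standard MIME base-64 alphabet; another answer here reports that InstallShield 2009 actually uses a different dictionary, so treat this as an outline of the idea, not a verified writer for .ism files.

```python
import base64

# Hypothetical helpers: encode text in the language's ANSI code page,
# then base-64 the resulting bytes (standard alphabet assumed).
def encode_ansi_base64(text: str, codepage: str) -> str:
    raw = text.encode(codepage)          # e.g. "cp932" for Japanese
    return base64.b64encode(raw).decode("ascii")

def decode_ansi_base64(value: str, codepage: str) -> str:
    return base64.b64decode(value).decode(codepage)

sample = "日本語"                         # illustrative string only
encoded = encode_ansi_base64(sample, "cp932")
print(encoded)
print(decode_ansi_base64(encoded, "cp932") == sample)
```

The code page has to be chosen per language (CP932 is the one named above for Japanese); which code page applies to each Chinese variant is not stated here and would need checking.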
Thanks go to Michael Urman (I up-voted his answer) for pointing us in the right direction. But this is the algorithm that actually works with InstallShield 2009, reverse-engineered by a co-worker:
Be aware that base-64 encoding with the uuencode dictionary is not the same as using the uuencode algorithm. Standard uuencode produces a set of newline-separated lines, including a header, a footer, and one or more data lines, each of which begins with a length character. If you're implementing this using a uuencode codec, you'll need to strip all of that off.
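As a rough illustration of that distinction, the sketch below encodes bytes 6 bits at a time with the uuencode dictionary and nothing else, then compares the result with a line from Python's `binascii.b2a_uu` after stripping its length character and newline. The function name is made up, and the zero padding of a final partial group and the use of a backtick for the value 0 are assumptions; which byte encoding of the text gets fed in is not covered here.

```python
import binascii

def uu_dictionary_encode(data: bytes) -> str:
    """Base-64 style encoding using the uuencode dictionary only:
    no header, no footer, no per-line length character."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Assumption: a short final chunk is padded with zero bytes.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        for shift in (18, 12, 6, 0):
            v = (n >> shift) & 0x3F
            # uuencode maps 6-bit value v to chr(32 + v); 0 is written as "`".
            out.append("`" if v == 0 else chr(32 + v))
    return "".join(out)

data = bytes(range(1, 10))               # 9 arbitrary bytes
line = binascii.b2a_uu(data)             # one uuencode data line
# Drop the leading length character and the trailing newline; the codec
# writes 0 as a space by default, so normalise that to a backtick.
stripped = line[1:-1].decode("ascii").replace(" ", "`")
print(stripped == uu_dictionary_encode(data))
```

Note that `b2a_uu` only accepts up to 45 bytes per call, so a codec-based version would have to chunk longer strings and strip each line separately.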
I'm also trying to figure this out...
I've inherited some InstallShield 12 (which is pre-2009) projects with string table entries containing characters outside the range of base64 'target' characters.
For example, one of the Japanese strings is:
4P!H
&$
9!O
'<4
!R&\
=!E
&,=``@
$(80!C
&L=0!P
"00!G`&4`;@!T`)(PI##S,+DPR##\,.LP5S!^,%DP`C
After much searching I happened upon Base85 encoding, which looks much closer to being plausible, but have not yet verified this to be the solution.
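One quick way to test the uuencode-dictionary idea from the other answer against this data is to decode a piece of the sample by hand. The sketch below is only a check under stated assumptions: the fragment is the 16 characters starting at the second character of the last line of the sample (that alignment was picked by trial), the function name is made up, and interpreting the bytes as UTF-16LE is a guess that happens to give readable text for this fragment. It says nothing about how the shorter lines of the sample are framed.

```python
def uu_dictionary_decode(text: str) -> bytes:
    """Decode text written with the uuencode dictionary (no length
    character, no header or footer), four characters per three bytes."""
    if len(text) % 4:
        raise ValueError("expected a multiple of 4 characters")
    out = bytearray()
    for i in range(0, len(text), 4):
        n = 0
        for ch in text[i:i + 4]:
            # chr(32 + v) encodes v; "`" (96) wraps around to 0.
            n = (n << 6) | ((ord(ch) - 32) & 0x3F)
        out += n.to_bytes(3, "big")
    return bytes(out)

# Assumption: fragment taken from the last sample line, skipping its
# first character.
fragment = "00!G`&4`;@!T`)(P"
raw = uu_dictionary_decode(fragment)
print(raw.hex())                 # 4100670065006e0074009230
print(raw.decode("utf-16-le"))   # Agentを
```

That the fragment decodes to Latin letters followed by a Japanese character suggests the uuencode dictionary is a better fit for this data than Base85, at least for this part of the string.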