Encoding problem when using the Tycho Surefire plugin
I've been going around in circles on this issue and cannot find an explanation for what is happening here.
I'm using the tycho-surefire-plugin to build a set of Eclipse plugins and run some unit tests.
Here's the environment:
tycho-surefire-plugin: version 0.19.0 //very old I know, but I'm stuck with legacy code
maven 3.5.2
jdk 8
windows 10
I started with a simple test case for a method that replaces special characters with their plain equivalents:
String input = "á:Á-é:É-í:Í-ó:Ó-ú:Ú-ñ:Ñ ";
System.out.println("testing input: " + input);
Assert.assertEquals("a A e E i I o O u U n N ",
Utils.sanitize(input, true));
The problem is that when I run the JUnit test directly in Eclipse I get the expected result and the test passes, but when I run the Tycho build I get:
testing input: ?:?-?:?-?:?-?:?-?:?-?:?
Failed tests: testSanitizeWithSpaces(com.fja.eos.automation.UtilsTest): expected:<[a A e E i I o O u U n N] > but was:<[o o o o o o o o o o o o] >
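One way to narrow this down is to take the encoding of the .java file and of the console out of the picture. The sketch below is only an illustration under assumptions: the class and helper names (`EncodingMismatchDemo`, `reinterpret`, `codePoints`) are hypothetical, and a charset mismatch between how the file was saved and how it was read is just one possible cause, not something confirmed by the output above. It writes the literal with `\uXXXX` escapes, prints code points instead of glyphs, and simulates what a mismatch does to a single character. ISO-8859-1 stands in for windows-1252 here because both encode `á` as the same byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Same text as the test literal, written with Unicode escapes so the
    // compiled string does not depend on the encoding of the .java file.
    static final String INPUT =
        "\u00E1:\u00C1-\u00E9:\u00C9-\u00ED:\u00CD-\u00F3:\u00D3-\u00FA:\u00DA-\u00F1:\u00D1 ";

    // Simulates text that was saved in one charset and read back in another.
    static String reinterpret(String text, Charset savedAs, Charset readAs) {
        return new String(text.getBytes(savedAs), readAs);
    }

    // Renders every char as U+XXXX so the result is readable on any console.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(codePoints(INPUT));
        // Saved as ISO-8859-1, read as UTF-8: the lone byte is malformed.
        System.out.println(codePoints(reinterpret("\u00E1", latin1, utf8)));
        // Saved as UTF-8, read as ISO-8859-1: one character becomes two.
        System.out.println(codePoints(reinterpret("\u00E1", utf8, latin1)));
    }
}
```

If the test passes in the Tycho build with the escaped literal, the source file encoding seen by the compiler is the likely suspect. If it still fails, the problem is more likely inside `Utils.sanitize` or in how the test JVM is started.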
The value for
System.out.println("Default charset: " + Charset.defaultCharset());
is the same in both scenarios:
Default charset: windows-1252
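`Charset.defaultCharset()` is only one of the settings involved, so it may be worth printing a few more values from inside the test in both environments. This is a minimal sketch, not a definitive diagnostic: `CharsetDiagnostics` and `collect` are hypothetical names, and `sun.jnu.encoding` is a JDK-internal property that may be absent on some runtimes. The output goes through a `PrintStream` with an explicit encoding so that the console's own default does not hide the result.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class CharsetDiagnostics {

    // Collects the encoding-related settings of the running JVM.
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        // JDK-internal property; String.valueOf turns a missing value into "null".
        info.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        return info;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Explicit output encoding, independent of the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        collect().forEach((key, value) -> out.println(key + " = " + value));
        out.println("sample: \u00E1\u00F1");
    }
}
```

If the values match in Eclipse and in the Tycho build but the sample line still prints differently, the difference is in how the console or log file is being read rather than in the strings themselves.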
My next attempt was to read the input value from a file, setting the charset explicitly:
InputStream is = UtilsTest.class
.getResourceAsStream("sanitationTestSubjects.xml");
InputSource source = new InputSource(is);
source.setEncoding("ISO-8859-1");
Document doc = builder.parse(is);
for this file:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<SanitationTestSubjects>
<Subject input="á:Á-é:É-í:Í-ó:Ó-ú:Ú-ñ:Ñ " expected="a A e E i I o O u U n N " />
</SanitationTestSubjects>
Reading the input this way gave a slightly different result:
testing input: á:?-é:É-í:?-ó:?-ú:?-ñ:Ñ
It is still not correct. If I escape the input with
StringEscapeUtils.escapeJava(elem.getAttribute("input"))
I get what seems to be the correct Unicode sequence:
Escaped input: \u00E1:\u00C1-\u00E9:\u00C9-\u00ED:\u00CD-\u00F3:\u00D3-\u00FA:\u00DA-\u00F1:\u00D1
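The escaped sequence suggests the parsed string is already correct and only the printed form is wrong. The sketch below checks this without relying on the console. It is an illustration under assumptions: `XmlEncodingCheck`, `SAMPLE` and `readInputAttribute` are hypothetical names, and the XML is a shortened in-memory copy of the file. It hands the `InputSource`, not the raw stream, to `builder.parse`, so that the `setEncoding` call is actually seen by the parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class XmlEncodingCheck {

    // Shortened stand-in for sanitationTestSubjects.xml (hypothetical content).
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>"
        + "<SanitationTestSubjects>"
        + "<Subject input=\"\u00E1:\u00C1 \" expected=\"a A \" />"
        + "</SanitationTestSubjects>";

    // Parses the stream with an explicit encoding and returns the first
    // Subject element's "input" attribute.
    static String readInputAttribute(InputStream is, String encoding) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            InputSource source = new InputSource(is);
            source.setEncoding(encoding);
            // Pass the InputSource itself so the encoding above is used.
            Document doc = builder.parse(source);
            Element subject = (Element) doc.getElementsByTagName("Subject").item(0);
            return subject.getAttribute("input");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse test subjects", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = SAMPLE.getBytes(StandardCharsets.ISO_8859_1);
        String input = readInputAttribute(new ByteArrayInputStream(bytes), "ISO-8859-1");
        // Print code points rather than glyphs to bypass the console encoding.
        for (char c : input.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Comparing against `"\u00E1:\u00C1 "` in an assertion, instead of reading the printed line, separates "the string is wrong" from "the console shows it wrong".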
I've tried setting every character encoding option I could find for the tycho-surefire-plugin, with no change in behavior:
<build>
  <plugins>
    <plugin>
      <groupId>org.eclipse.tycho</groupId>
      <artifactId>tycho-surefire-plugin</artifactId>
      <version>${tycho-version}</version>
      <configuration>
        <appArgLine>-Dfile.encoding=ISO-8859-1</appArgLine>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-resources-plugin</artifactId>
      <version>3.2.0</version>
      <configuration>
        <encoding>ISO-8859-1</encoding>
      </configuration>
    </plugin>
  </plugins>
</build>
One more check: compiling the files in Eclipse and with Maven produces binary-identical output files.
UPDATE:
The encoding of the Java file itself was set to Cp1252. After changing it to ISO-8859-1 I got the same result as when reading the value from a file. Still not there.
I have the feeling that I'm looking at the wrong side of the problem.
Can anyone please help?