Invalid byte 1 of 1-byte UTF-8 sequence

Posted on 2024-08-26 03:39:04

I have a MyFaces Facelets application where the page coding is a bit rugged. Anyway, it's developed with Eclipse and built with Ant, and kind of runs OK in Tomcat 6.0.26. So far so good.

Now, I'd rather build with Maven, so I made a couple of pom-files, opened them in Netbeans and built, and now I have a war file that deploys ok. However, on any facelet page it barfs out with

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)

So, I've tried a lot of different things, and the application actually runs simple pages without Facelets content. But everything runs if I just build with Ant instead. So my question is: what is the most likely difference between an Ant build and a Maven build that could cause this?

It also seems that even though I've configured UTF-8 in NetBeans and in the POM files, NetBeans eventually ends up reporting the Facelets files as ISO-8859-1 after some editing.

I've made sure that most central libraries are the same version (especially Xerces 2.3.0), and I've added an encoding servlet filter, which had no effect.

And I'd rather fix the Maven build and keep the buggy pages than the other way around. My intention is to introduce Maven, not to fix buggy pages.

Here is what the pom.xml says about encoding:

 <plugins>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.0.2</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.2</version>
                <configuration>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>

....

    <properties>
        <netbeans.hint.deploy.server>Tomcat60</netbeans.hint.deploy.server>
        <project.build.sourceEncoding>utf-8</project.build.sourceEncoding>
    </properties>

Comments (4)

情归归情 2024-09-02 03:39:04

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.

The cause is that a file which is not UTF-8 is being parsed as UTF-8. The parser is most likely encountering a byte that cannot start a UTF-8 sequence, such as a stray continuation byte (0x80-0xBF) or a value in the range 0xF8-0xFF. No valid UTF-8 sequence begins with these values.

The problem could probably be solved by changing the XML declaration of the file to be the correct encoding or re-encoding the file to UTF-8.
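
To make the diagnosis concrete, here is a minimal Java sketch. The class name `XmlEncodingDemo` and the sample markup are invented for illustration, and `StandardCharsets` assumes Java 7 or later (the POM in the question targets 1.6). It hands the JDK's built-in parser the same ISO-8859-1 bytes with and without a matching XML declaration, and then the same text re-encoded as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class XmlEncodingDemo {

    /** Returns true if the JDK's XML parser accepts the given bytes. */
    static boolean parses(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (Exception e) {
            // Malformed input surfaces as an IOException or a SAXException.
            return false;
        }
    }

    public static void main(String[] args) {
        // The degree sign is the single byte 0xB0 in ISO-8859-1.
        String body = "<p>20\u00B0C</p>";
        String declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body;

        // No declaration: the parser assumes UTF-8 and hits the lone 0xB0 byte.
        System.out.println("ISO-8859-1 bytes, no declaration:   "
                + parses(body.getBytes(StandardCharsets.ISO_8859_1)));
        // Declaration matches the actual bytes.
        System.out.println("ISO-8859-1 bytes, with declaration: "
                + parses(declared.getBytes(StandardCharsets.ISO_8859_1)));
        // File re-encoded as UTF-8, the parser's default.
        System.out.println("UTF-8 bytes, no declaration:        "
                + parses(body.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Only the first case should be rejected, which corresponds to the two fixes suggested above: either declare the encoding the file really uses, or re-encode the file to match what the parser expects.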

唯憾梦倾城 2024-09-02 03:39:04

On Windows it's very easy. Get Notepad++ if you don't have it, and change the encoding using the "encoding" menu.
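
Where an editor is not an option (many files, or a build server), the same conversion can be scripted. The following is a rough Java sketch, not a tested tool: the class name `ReencodeToUtf8` is made up, Java 7 or later is assumed, and the source encoding is assumed to be ISO-8859-1, which may not hold for every file. It checks whether the bytes already decode as UTF-8 and re-encodes them only if they do not.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReencodeToUtf8 {

    /** Returns true if the bytes decode cleanly as UTF-8 (plain ASCII also qualifies). */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes with the assumed source charset and re-encodes as UTF-8. */
    static byte[] toUtf8(byte[] data, Charset assumedSource) {
        if (isValidUtf8(data)) {
            return data; // already UTF-8: leave untouched to avoid double encoding
        }
        return new String(data, assumedSource).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ReencodeToUtf8 <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        try {
            // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
            byte[] converted = toUtf8(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            Files.write(file, converted);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch overwrites the file in place, so it is best run on a copy or under version control. If a file carries an XML declaration naming the old encoding, that declaration still has to be updated by hand.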

在梵高的星空下 2024-09-02 03:39:04

I had the same problem!

I solved it using the following piece of code:

String str = new String(oldstring.getBytes("UTF-8"));

予囚 2024-09-02 03:39:04

I encountered this error when running some unit tests using maven on a Windows machine.

Files were being written out in the default Windows-1252 format and then some tests were failing when trying to read them as UTF-8.

The solution was to enforce project source encoding for files being written out in the unit tests:

    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.20</version>
        <configuration>
            <argLine>-Dfile.encoding=${project.build.sourceEncoding}</argLine>
        </configuration>
        <dependencies>
            <dependency>
                <groupId>org.apache.maven.surefire</groupId>
                <artifactId>surefire-junit47</artifactId>
                <version>2.20</version>
            </dependency>
        </dependencies>
    </plugin>

Where project.build.sourceEncoding was defined in the pom properties:

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
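
A complementary approach, where the test code itself can be changed, is to name the charset on every read and write so that the result no longer depends on `file.encoding`. This is only a sketch under assumptions: the class name `ExplicitEncodingIo` and the helper `roundTrips` are hypothetical, and Java 7 or later is assumed for `java.nio.file.Files` and `StandardCharsets`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitEncodingIo {

    /** Writes text to a temp file as UTF-8, reads it back as UTF-8, and compares. */
    static boolean roundTrips(String text) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".txt");
            try {
                // Explicit charset: the platform default (e.g. windows-1252) is never consulted.
                try (Writer out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                    out.write(text);
                }
                String back = new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
                return text.equals(back);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("round trip intact:   " + roundTrips("Bj\u00F8rn"));
    }
}
```

The round trip should stay intact whatever default charset the first line reports, which is the property the `argLine` setting above enforces from the outside.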