Parsing UTF-8 files on Android

Posted on 2024-12-11 18:13:11

I have some .xml files that are encoded in UTF-8. Whenever I try to parse them on my tablet (Lenovo IdeaPad, Android 3.1), I get the same error:

org.xml.SAXParseException: Unexpected token (position: TEXT @1:2 in 
java.io.StringReader@40bdaef8).

These are the lines that throw the exception:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource inputSource = new InputSource();
inputSource.setCharacterStream(new StringReader(xmlData));
Document doc = db.parse(inputSource); // This line throws exception

Here is my input:

public String getFromFile(ASerializer aserializer) {
    String filename = aserializer.toLocalResource();
    String data = new String();
    try {
        InputStream stream = _context.getResources().getAssets().open(filename);
        BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
        StringBuilder str = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            str.append(line);
        }
        stream.close();
        data = str.toString();
    } catch (Exception e) {
    }
    return data;
}

XML File:

<Results>
    <Result title="08/07/2011">
        <Field title="Company one" value="030589674"/>
        <Field title="Company two" value="081357852"/>
        <Field title="Company three" value="093587125"/>
        <Field title="Company four" value="095608977"/>
    </Result>
    <Result title="11/07/2011">
        <Field title="Company one" value="030589674"/>
        <Field title="Company two" value="081357852"/>
    </Result>
</Results>

I don't want to convert them to ANSI, so is there any way to make db.parse() work?

红衣飘飘貌似仙 2024-12-18 18:13:11

At this line:

BufferedReader reader = new BufferedReader(new InputStreamReader(stream));

You're reading from stream using the platform default encoding. That's almost certainly not what you want. You'd need to check the XML for the actual encoding, and the correct way to do that is somewhat complicated.

Luckily, every sane XML parser (including the Java/Android one) can do that on its own. To make the XML parser do that, simply pass in the stream itself instead of trying to read it manually.

InputSource inputSource = new InputSource(stream);
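
As a rough illustration of the suggestion above, the sketch below hands the raw byte stream to the parser so that it can work out the encoding from the XML declaration (or fall back to UTF-8). The class name StreamParseSketch and the method name parse are made up for this example and are not part of the question's code.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, named for illustration only.
class StreamParseSketch {
    // Parses XML straight from bytes; no Reader is involved, so the
    // platform default encoding never comes into play.
    static Document parse(InputStream stream) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        try {
            return db.parse(new InputSource(stream));
        } finally {
            stream.close(); // release the stream even if parsing fails
        }
    }
}
```

With the question's code, the call would presumably look like StreamParseSketch.parse(_context.getResources().getAssets().open(filename)), which replaces both getFromFile and the StringReader step.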

一桥轻雨一伞开 2024-12-18 18:13:11

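You are most likely using an XML file that starts with a BOM (byte order mark).

Use an API that detects the encoding from the BOM:

Java: How to determine the correct charset encoding of a stream

Or, preprocess the file so that no BOM is present.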
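
If the BOM does turn out to be the culprit, one possible way to drop it at read time, rather than editing the files, is sketched below. It only handles the UTF-8 BOM (bytes EF BB BF); the class name BomSkipper and the method name skipUtf8Bom are invented for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, named for illustration only.
class BomSkipper {
    // Returns a stream positioned after a leading UTF-8 BOM, if one is present.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may deliver fewer bytes than requested, so loop until
        // three bytes have arrived or the stream ends.
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean isBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!isBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pushback;
    }
}
```

The returned stream can then be passed to the parser or wrapped in an InputStreamReader with an explicit "UTF-8" charset.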

忱杏 2024-12-18 18:13:11

Your Java string is UTF-16 encoded by default. If you can't use an InputStream as @Joachim Sauer suggested, then try this:

Document doc = db.parse(new ByteArrayInputStream(xmlData.getBytes()));
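
Note that getBytes() with no argument uses the platform default charset. A variant that names the charset explicitly might look like the sketch below; it assumes the XML text carries no encoding declaration that contradicts UTF-8, and the class name StringParseSketch and the method name parseString are made up for this example.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, named for illustration only.
class StringParseSketch {
    // Encodes the string as UTF-8 explicitly, then lets the parser read the bytes.
    static Document parseString(String xmlData) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        byte[] utf8 = xmlData.getBytes("UTF-8"); // explicit charset, not the platform default
        return db.parse(new ByteArrayInputStream(utf8));
    }
}
```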
