Why can't I read foreign characters using an InputStream?

Posted on 2024-11-12 04:37:37

I have a text file which contains data I need to preload into a SQLite database. I saved it in res/raw.

I read the whole file using readTxtFromRaw(), then I use the StringTokenizer class to process the file line by line.

However, the String returned by readTxtFromRaw() does not show the foreign characters that are in the file. I need them, because some of the text is in Spanish or French. Am I missing something?

Code:

String fileCont = new String(readTxtFromRaw(R.raw.wordstext));
StringTokenizer myToken = new StringTokenizer(fileCont , "\t\n\r\f");
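As a side note on the tokenizing step, here is a minimal sketch of what that delimiter set does. The sample text and the TokenizeDemo class are made up for illustration and are not taken from the actual file. Because "\t" is among the delimiters, StringTokenizer splits on tabs as well as line breaks, so it yields individual fields rather than whole lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizeDemo {

    // Collects every token that StringTokenizer produces for the given delimiters.
    public static List<String> tokens(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delims);
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical file content: two tab-separated lines.
        String sample = "hola\thello\nni\u00f1o\tchild\n";

        // Delimiters from the question: tabs and line breaks both split tokens.
        System.out.println(tokens(sample, "\t\n\r\f").size());

        // Line breaks only: one token per line, tabs stay inside the token.
        System.out.println(tokens(sample, "\n\r").size());
    }
}
```

With the question's delimiters the sample yields four tokens; with line breaks only it yields two. StringTokenizer also skips empty tokens, so blank lines or empty fields would not show up in either case.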

The readTxtFromRaw method is:

private String readTxtFromRaw(Integer rawResource) throws IOException
{
    InputStream inputStream = mCtx.getResources().openRawResource(rawResource);
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();

    int i = inputStream.read();
    while (i != -1)
    {
        byteArrayOutputStream.write(i);
        i = inputStream.read();
    }
    inputStream.close();

    return byteArrayOutputStream.toString();
}
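One thing worth checking is the decoding step. ByteArrayOutputStream.toString() with no argument decodes the bytes using the platform default charset, which on Android is UTF-8. If the file was saved in a different encoding (Eclipse on Windows often defaults to Cp1252, though that is only an assumption about this setup), accented characters would be decoded incorrectly. The sketch below is not Android-specific: RawTextReader and readAll are hypothetical names, and a ByteArrayInputStream stands in for openRawResource(). It shows the same bytes decoded with a mismatched and a matching charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RawTextReader {

    // Reads the whole stream and decodes it with an explicit charset
    // instead of relying on the platform default.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        try {
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } finally {
            in.close();
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the raw resource: text saved as ISO-8859-1 (an assumption).
        byte[] fileBytes = "ni\u00f1o\tcaf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);

        // Mismatched charset: the accented bytes are not valid UTF-8,
        // so they are replaced with U+FFFD.
        String wrong = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8);

        // Matching charset: the original text is recovered.
        String right = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.ISO_8859_1);

        System.out.println(wrong.contains("\uFFFD"));
        System.out.println(right.equals("ni\u00f1o\tcaf\u00e9\n"));
    }
}
```

If this is the cause, either pass the file's actual charset when building the String, or re-save the file as UTF-8 in the editor so that it matches what the device expects.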

The file was created using Eclipse, and all characters appear fine in Eclipse.

Could this have something to do with Eclipse itself? I set a breakpoint and inspected myToken in the Watch window. I tried to manually replace the weird character with the correct one (for example í or é), but it would not let me.
