Reading an ASCII file with FileChannel and ByteArray

Posted on 2024-07-05 02:32:54

I have the following code:

        String inputFile = "somefile.txt";
        FileInputStream in = new FileInputStream(inputFile);
        FileChannel ch = in.getChannel();
        ByteBuffer buf = ByteBuffer.allocateDirect(BUFSIZE);  // BUFSIZE = 256

        /* read the file into a buffer, 256 bytes at a time */
        int rd;
        while ( (rd = ch.read( buf )) != -1 ) {
            buf.rewind();
            for ( int i = 0; i < rd/2; i++ ) {
                /* print each character */
                System.out.print(buf.getChar());
            }
            buf.clear();
        }

But the characters get displayed as ?'s. Does this have something to do with Java using Unicode characters? How do I correct this?

Comments (6)

寄意 2024-07-12 02:32:55

Is there a particular reason why you are reading the file in the way that you do?

If you're reading in an ASCII file you should really be using a Reader.

I would do it something like:

File inputFile = new File("somefile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));

And then use either readLine or similar to actually read in the data!
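
As a rough sketch of the Reader approach, the snippet below reads the file line by line with an explicit charset instead of relying on the platform default that FileReader uses. The class name ReadLines and the helper readAsciiLines are made up for illustration, and US-ASCII is an assumption about the file; with this charset, Files.newBufferedReader reports non-ASCII bytes as an error rather than silently replacing them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ReadLines {
    /** Hypothetical helper: returns every line of a text file assumed to be US-ASCII. */
    static List<String> readAsciiLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine throws
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // "somefile.txt" is the file name from the question
        for (String line : readAsciiLines(Paths.get("somefile.txt"))) {
            System.out.println(line);
        }
    }
}
```

If the file uses another encoding, passing a different Charset constant is the only change needed.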

缘字诀 2024-07-12 02:32:55

Yes, it is related to Unicode: a Java char is two bytes, so getChar() consumes two bytes of the file per call.

If you have 14 characters in your file, you only get 7 '?'.

Solution pending. Still thinking.

想你只要分分秒秒 2024-07-12 02:32:55

buf.getChar() is expecting 2 bytes per character but you are only storing 1. Use:

 System.out.print((char) buf.get());
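
To make the difference concrete, here is a small sketch (the class and method names are invented for this example) that reads the same two ASCII bytes both ways. getChar() combines the two bytes into one 16-bit char, while get() returns them one at a time. Note that after switching to get(), the loop in the question would also need to run rd times rather than rd/2. The (char) cast is only safe for pure ASCII, since bytes of 0x80 and above are sign-extended.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class GetCharDemo {
    /** Reads two bytes per character, as the loop in the question does. */
    static String readWithGetChar(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        StringBuilder sb = new StringBuilder();
        while (buf.remaining() >= 2) {
            sb.append(buf.getChar()); // big-endian by default: (b0 << 8) | b1
        }
        return sb.toString();
    }

    /** Reads one byte per character, which is correct for ASCII data only. */
    static String readWithGet(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            sb.append((char) buf.get());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "Hi".getBytes(StandardCharsets.US_ASCII); // 0x48, 0x69
        System.out.println(readWithGet(data));
        // The two bytes merge into the single code unit 0x4869
        System.out.printf("U+%04X%n", (int) readWithGetChar(data).charAt(0));
    }
}
```

The merged value lands outside the ASCII range, which is why a console that cannot render it shows '?'.
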
情话墙 2024-07-12 02:32:55

Changing your print statement to:

System.out.print((char)buf.get());

Seems to help.

ゞ记忆︶ㄣ 2024-07-12 02:32:55

Depending on the encoding of somefile.txt, a character may not actually be composed of two bytes. This page gives more information about how to read streams with the proper encoding.

The bummer is, the file system doesn't tell you the encoding of the file, because it doesn't know. As far as it's concerned, it's just a bunch of bytes. You must either find some way to communicate the encoding to the program, detect it somehow, or (if possible) always ensure that the encoding is the same (such as UTF-8).
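
The "just a bunch of bytes" point can be illustrated with a short sketch: the same two bytes decode to different text depending on which charset the program assumes. The class name EncodingDemo and the helper decode are hypothetical, and the two charsets are chosen only as an example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    /** Decodes raw bytes using the charset the caller believes the data is in. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is stored as the two bytes 0xC3 0xA9 in UTF-8
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(bytes, StandardCharsets.UTF_8);      // one character
        String wrong = decode(bytes, StandardCharsets.ISO_8859_1); // two characters

        System.out.println(right.length()); // 1
        System.out.println(wrong.length()); // 2
    }
}
```

Nothing in the bytes themselves says which reading is right, so the program has to be told or has to guess.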

如果没结果 2024-07-12 02:32:54

You have to know what the encoding of the file is, and then decode the ByteBuffer into a CharBuffer using that encoding. Assuming the file is ASCII:

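import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Buffer
{
    private static final int BUFSIZE = 256;

    public static void main(String args[]) throws Exception
    {
        String inputFile = "somefile";
        Charset cs = StandardCharsets.US_ASCII; // Or whatever encoding the file uses

        /* try-with-resources closes the stream and its channel */
        try (FileInputStream in = new FileInputStream(inputFile);
             FileChannel ch = in.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocateDirect(BUFSIZE);

            /* read the file into a buffer, up to 256 bytes at a time */
            while ( ch.read( buf ) != -1 ) {
                /* flip, not rewind: limit = bytes just read, position = 0 */
                buf.flip();
                /* fine for single-byte encodings; a multi-byte character
                   could straddle two reads and would need a CharsetDecoder */
                CharBuffer chbuf = cs.decode(buf);
                /* length() shrinks as get() advances, so test hasRemaining() */
                while ( chbuf.hasRemaining() ) {
                    /* print each character */
                    System.out.print(chbuf.get());
                }
                buf.clear();
            }
        }
    }
}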