Reading an ASCII file with FileChannel and ByteArray

Posted 2024-07-05 02:32:54

I have the following code:

        String inputFile = "somefile.txt";
        FileInputStream in = new FileInputStream(inputFile);
        FileChannel ch = in.getChannel();
        ByteBuffer buf = ByteBuffer.allocateDirect(BUFSIZE);  // BUFSIZE = 256

        /* read the file into a buffer, 256 bytes at a time */
        int rd;
        while ( (rd = ch.read( buf )) != -1 ) {
            buf.rewind();
            for ( int i = 0; i < rd/2; i++ ) {
                /* print each character */
                System.out.print(buf.getChar());
            }
            buf.clear();
        }

But the characters are displayed as ?'s. Does this have something to do with Java using Unicode characters? How do I correct this?


寄意 2024-07-12 02:32:55

Is there a particular reason why you are reading the file in the way that you do?

If you're reading in an ASCII file you should really be using a Reader.

I would do it something like:

File inputFile = new File("somefile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));

And then use either readLine or similar to actually read in the data!
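
A slightly fuller sketch of the Reader approach, for illustration only. FileReader decodes with the default charset, so this version swaps in Files.newBufferedReader with an explicit US-ASCII charset; unlike FileReader, that reader throws an IOException when it meets a byte that is not valid in the charset. The class name ReadAsciiLines and the helper readAll are hypothetical names, not part of the answer above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadAsciiLines {
    /** Reads the whole file as US-ASCII text, ending every line with '\n'. */
    static String readAll(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(readAll(Paths.get("somefile.txt")));
    }
}
```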


缘字诀 2024-07-12 02:32:55

Yes, it is Unicode.

If your file has 14 characters, you will get only 7 '?'s.

Solution pending. Still thinking.


想你只要分分秒秒 2024-07-12 02:32:55

buf.getChar() is expecting 2 bytes per character but you are only storing 1. Use:

 System.out.print((char) buf.get());
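
To make the difference concrete, here is a small self-contained sketch that reads the same ASCII bytes both ways. It wraps a byte array instead of reading a file, and the class name GetCharDemo and its two helpers are hypothetical. getChar() combines two consecutive bytes into one 16-bit char (big-endian by default), so the bytes of "Hi" (0x48 0x69) come back as the single character U+4869, which many consoles cannot render.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class GetCharDemo {
    /** Reads the buffer two bytes at a time, as the loop in the question does. */
    static String viaGetChar(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        StringBuilder sb = new StringBuilder();
        while (buf.remaining() >= 2) {
            sb.append(buf.getChar()); // big-endian by default: (first << 8) | second
        }
        return sb.toString();
    }

    /** Reads one byte per character, which is correct for ASCII (bytes 0-127). */
    static String viaGet(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            sb.append((char) buf.get());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] ascii = "Hi".getBytes(StandardCharsets.US_ASCII); // 0x48 0x69
        String merged = viaGetChar(ascii);
        System.out.printf("getChar: %d char(s), U+%04X%n", merged.length(), (int) merged.charAt(0));
        System.out.println("get:     " + viaGet(ascii));
    }
}
```

This also accounts for the count in the question's loop: with rd/2 iterations, a 14-byte file yields 7 characters.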


情话墙 2024-07-12 02:32:55

Changing your print statement to:

System.out.print((char)buf.get());

Seems to help.


ゞ记忆︶ㄣ 2024-07-12 02:32:55

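Depending on the encoding of somefile.txt, a character may not actually be made up of two bytes. This page gives more information on how to read streams using the correct encoding.

Unfortunately, the file system won't tell you the encoding of the file, because it doesn't know it. As far as it is concerned, the file is just a bunch of bytes. You have to find some way to communicate the encoding to the program, detect it somehow, or (if possible) always make sure the encoding is the same (for example UTF-8).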
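
As a small illustration of why the encoding has to come from somewhere outside the file, the sketch below decodes the same two bytes with two different charsets and gets two different strings. The class name SameBytesDemo and the helper decode are hypothetical; the byte values are simply the UTF-8 encoding of U+00E9.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SameBytesDemo {
    /** Decodes the given bytes with the given charset. */
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is U+00E9 in UTF-8, but two separate characters in ISO-8859-1
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 };

        String asUtf8 = decode(bytes, StandardCharsets.UTF_8);
        String asLatin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8:      " + asUtf8.length() + " char(s)");
        System.out.println("ISO-8859-1: " + asLatin1.length() + " char(s)");
    }
}
```

Nothing in the bytes themselves says which reading is the intended one, which is why the program has to be told or has to guess.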


如果没结果 2024-07-12 02:32:54

You have to know what the encoding of the file is, and then decode the ByteBuffer into a CharBuffer using that encoding. Assuming the file is ASCII:

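import java.io.*;
import java.nio.*;
import java.nio.channels.*;
import java.nio.charset.*;

public class Buffer
{
    private static final int BUFSIZE = 256;

    public static void main(String args[]) throws Exception
    {
        String inputFile = "somefile";
        // Or whatever encoding you want (a multi-byte encoding would also need
        // a CharsetDecoder to handle characters split across two reads)
        Charset cs = Charset.forName("US-ASCII");

        // try-with-resources closes the stream and its channel when done
        try (FileInputStream in = new FileInputStream(inputFile)) {
            FileChannel ch = in.getChannel();
            ByteBuffer buf = ByteBuffer.allocateDirect(BUFSIZE);

            /* read the file into a buffer, 256 bytes at a time */
            while ( ch.read( buf ) != -1 ) {
                // flip() sets limit to the bytes just read; rewind() would leave
                // limit at capacity and decode stale bytes after a short read
                buf.flip();
                CharBuffer chbuf = cs.decode(buf);
                /* print each character; length() shrinks as get() advances,
                   so loop on hasRemaining() instead of an index */
                while ( chbuf.hasRemaining() ) {
                    System.out.print(chbuf.get());
                }
                buf.clear();
            }
        }
    }
}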
