How do I read a PGM image with Java?

Posted 2024-09-17 06:53:57

I feel like I'm missing something simple here (as usual).

I'm trying to read PGM images using Java. Matlab does it just fine - outputting the image pixels (for example, a small 32x32 image) in Matlab gives me something like this:

1 0 11 49 94 118 118 106 95 88 85 96 124 143 142 133

My Java reader, however, outputs this:

1 0 11 49 94 118 118 106 95 88 85 96 124 65533 65533 65533

It seems that pixel values above 127 are replaced with 65533, although a few other values also come out wrong, and almost the entire bottom row is read as -1.

Here's the code I'm using:

String filePath = "imagepath.pgm";
FileInputStream fileInputStream = new FileInputStream(filePath);
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileInputStream));

// read the header information ...

int[][] data2D = new int[picHeight][picWidth];

for (int row = 0; row < picHeight; row++) {
  for (int col = 0; col < picWidth; col++) {
    data2D[row][col] = bufferedReader.read();
    System.out.print(data2D[row][col] + " ");
  }
  System.out.println();
}

fileInputStream.close();

Any ideas would be greatly appreciated.

Edit: Here are the unsigned PGM values:

     1     0    11    49    94   118   118   106    95    88    85    96   124   143   142   133
    30    26    29    57    96   122   125   114   102    94    91   101   127   146   145   136
    96    85    70    75   101   128   136   126   111   106   106   112   131   149   153   147
   163   147   114    93    99   120   132   123   110   113   124   129   137   154   166   168
   215   195   149   105    88    99   114   111   106   123   148   158   160   174   191   197
   245   224   173   115    81    82   100   109   117   144   179   194   194   205   222   230
   235   217   170   115    78    78   113   117   100    83    80   212   214   226   244   253
   178   167   135    93    68    78   123   129   106    77    69   202   204   222   244   255
   114   110    92    64    54    81   107   105    83    59    56   182   184   201   222   231
    79    80    71    52    55    97    67    55    41    33    42   184   179   181   185   183
    62    66    65    52    63   115    29    16    12    17    30   209   197   174   150   132
    40    47    52    44    55   109   171   196   188   186   208   229   218   179   136   107
    31    38    44    37    43    89   145   167   158   159   191   223   219   179   133   105
    48    52    56    51    57    91   128   133   117   120   157   196   200   168   128   105
    64    67    70    73    87   114   127   107    79    81   118   159   173   154   123   104
    63    67    73    83   107   132   129    91    54    54    88   130   153   146   123   106

The header looks like this:

P5
# MatLab PGMWRITE file, saved 27-Jun-2002
16 16
255

Edit #2

Here's the full output of the proof-of-concept code below:

Skipping unknow token: ""
Skipping unknow token: "1^vvj_XU`|���"
Skipping unknow token: ""
Skipping unknow token: "9`z}rf^[e���`UFKe��~ojjp������r]cx�{nq|������ÕiXcroj{��������sQRdmu��������٪sNNqudSP�����]DN{�jME�����rn\@6QkiS;8�����OPG47aC7)!*�����>BA4?s"
Skipping unknow token: ""
Skipping unknow token: ""
Skipping unknow token: "�Ů��(/4,7m�ļ���ڳ�k"
Skipping unknow token: "&,%+Y������۳�i04839[��ux��Ȩ�i@CFIWrkOQv���{h?CISk��[66X���{j"
Exception in thread "main" java.util.NoSuchElementException
    at java.util.Scanner.throwFor(Scanner.java:838)
    at java.util.Scanner.next(Scanner.java:1347)
    at Test.main(Test.java:49)

Line 49 referred to in the thrown exception is:

System.out.println(String.format("Skipping unknow token: \"%s\"", scan.next()));

The problem, I'm sure, has something to do with the fact that these image files consist of both ASCII text/numbers as well as binary image data. But if Java has no problem reading PNGs, why the lack of support for PGMs?

Edit #3

Ok, I found an implementation that works...unfortunately, it's deprecated:

  String filePath = "imagepath.pgm";
  FileInputStream fileInputStream = new FileInputStream(filePath);
  DataInputStream dis = new DataInputStream(fileInputStream);
  StreamTokenizer streamTokenizer = new StreamTokenizer(dis);

  // read header text using StreamTokenizer.nextToken()

  data2D = new int[picHeight][picWidth];
  for (int row = 0; row < picHeight; row++) {
    for (int col = 0; col < picWidth; col++) {
      data2D[row][col] = dis.readUnsignedByte();
      System.out.print(data2D[row][col] + " ");
    }
    System.out.println();
  }

According to the Java documentation, the StreamTokenizer(InputStream) constructor is deprecated, as is DataInputStream.readLine(), which does not correctly convert raw bytes to characters. However, the tokenizer seems to work on the header in this specific case, and readUnsignedByte() obviously works for the binary image data that follows.

Unfortunately, it's still deprecated, and intermixing a BufferedReader, as the documentation suggests, only results in EOFExceptions when I read the header and then try to read the raw bytes with the DataInputStream. Still looking for a solution...

苍白女子 2024-09-24 06:53:57

The problem with your code is that you are using the wrong class to read raw data from the file. As the BufferedReader documentation says:

public int read() throws IOException
Reads a single character.
Returns: The character read, as an integer in the range 0 to 65535 (0x00-0xffff), or -1 if the end of the stream has been reached

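So each call to the read() method of BufferedReader decodes one character, which may consume one or more bytes from the input stream depending on the character encoding, and that is not what you want. A byte sequence that is not valid in that encoding is decoded as the replacement character U+FFFD, which is the 65533 you are seeing. This also explains why you get a lot of -1: the stream ended much earlier than you thought.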
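To make the effect visible, here is a minimal sketch that pushes a few of the pixel bytes from the question through an InputStreamReader. It assumes the reader decodes as UTF-8 (the question does not say which default charset was in use), and the class and method names (DecodingDemo, readAsChars, readAsBytes) are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DecodingDemo {

    /** Calls Reader.read() once per input byte, the way the loop in the question does. */
    public static int[] readAsChars(byte[] raw) throws IOException {
        // Assumption: UTF-8 stands in for whatever the platform default charset was.
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            int[] out = new int[raw.length];
            for (int i = 0; i < raw.length; i++) {
                out[i] = reader.read(); // one decoded character, or -1 at end of stream
            }
            return out;
        }
    }

    /** Interprets the same bytes as unsigned values, with no character decoding. */
    public static int[] readAsBytes(byte[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // 124 143 142 133 end the first pixel row; 195 149 105 appear in the fifth row.
        byte[] raw = {124, (byte) 143, (byte) 142, (byte) 133, (byte) 195, (byte) 149, 105};

        // prints [124, 65533, 65533, 65533, 213, 105, -1]
        System.out.println(Arrays.toString(readAsChars(raw)));

        // prints [124, 143, 142, 133, 195, 149, 105]
        System.out.println(Arrays.toString(readAsBytes(raw)));
    }
}
```

The three stray bytes each become 65533, the pair 195 149 happens to form a valid two-byte sequence and collapses into the single character 213, and the reader therefore runs out of input one call early and returns -1.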

Since the PGM header contains its values as ASCII decimal, the header is easy to parse using the Scanner class. The pixel data of a P5 file is raw binary, so it has to be read as bytes.

Here's some almost untested code that shows how to read a PGM image, assuming that:

it contains a single comment after the magic number (i.e. it does not have lines that start with a # except the second one)
the PGM header is exactly 4 lines long.

Here's the code:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;

public class Test {
    public static void main(String[] args) throws IOException {
        String filePath = "image.pgm";

        // First pass: parse the ASCII header as text
        FileInputStream fileInputStream = new FileInputStream(filePath);
        Scanner scan = new Scanner(fileInputStream);
        // Discard the magic number
        scan.nextLine();
        // Discard the comment line
        scan.nextLine();
        // Read pic width, height and max value
        int picWidth = scan.nextInt();
        int picHeight = scan.nextInt();
        int maxvalue = scan.nextInt();

        fileInputStream.close();

        // Second pass: reopen the file and parse it as binary data
        fileInputStream = new FileInputStream(filePath);
        DataInputStream dis = new DataInputStream(fileInputStream);

        // Look for 4 newlines (i.e. the header) and discard them
        int numnewlines = 4;
        while (numnewlines > 0) {
            char c;
            do {
                c = (char) dis.readUnsignedByte();
            } while (c != '\n');
            numnewlines--;
        }

        // Read the image data, one unsigned byte per pixel
        int[][] data2D = new int[picHeight][picWidth];
        for (int row = 0; row < picHeight; row++) {
            for (int col = 0; col < picWidth; col++) {
                data2D[row][col] = dis.readUnsignedByte();
                System.out.print(data2D[row][col] + " ");
            }
            System.out.println();
        }

        dis.close();
    }
}

Still to be implemented: support for additional comment lines, dividing each value by maxvalue, error checking for malformed files, and exception handling. I tested it on a PGM file with UNIX line endings, but it should work with Windows line endings too.

Let me stress that this is neither a robust nor a complete implementation of a PGM parser. The code is intended only as a proof of concept that may accomplish just enough for your needs.

If you really need a robust PGM parser, you may use the tools provided by Netpbm.
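If you would rather stay in plain Java, one possible way to add the missing comment support is to read the header tokens and the pixels from the same DataInputStream, so that nothing buffers ahead past the header. The following is only a sketch under a few assumptions: the file is binary P5 with a maxvalue of at most 255, comments appear only between header tokens, and exactly one whitespace byte follows the maxvalue. The names PgmReader, read, nextToken and isWhitespace are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PgmReader {

    /** Reads an 8-bit binary (P5) PGM and returns the pixels as pixels[row][col]. */
    public static int[][] read(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(new BufferedInputStream(in));
        String magic = nextToken(dis);
        if (!"P5".equals(magic)) {
            throw new IOException("Not a binary PGM (P5) file: " + magic);
        }
        int width = Integer.parseInt(nextToken(dis));
        int height = Integer.parseInt(nextToken(dis));
        int maxValue = Integer.parseInt(nextToken(dis));
        if (maxValue < 1 || maxValue > 255) {
            // Larger maxvalues use two bytes per sample; not handled in this sketch.
            throw new IOException("Unsupported maxvalue: " + maxValue);
        }
        int[][] pixels = new int[height][width];
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                // Throws EOFException if the file is shorter than the header claims.
                pixels[row][col] = dis.readUnsignedByte();
            }
        }
        return pixels;
    }

    /**
     * Returns the next whitespace-delimited header token, skipping '#' comments
     * in front of it. Consumes exactly one whitespace byte after the token.
     */
    private static String nextToken(DataInputStream dis) throws IOException {
        int b = dis.readUnsignedByte();
        while (b == '#' || isWhitespace(b)) {
            if (b == '#') {
                // Skip the rest of the comment line.
                while (b != '\n' && b != '\r') {
                    b = dis.readUnsignedByte();
                }
            }
            b = dis.readUnsignedByte();
        }
        StringBuilder token = new StringBuilder();
        while (!isWhitespace(b)) {
            token.append((char) b);
            b = dis.readUnsignedByte();
        }
        return token.toString();
    }

    private static boolean isWhitespace(int b) {
        return b == ' ' || b == '\t' || b == '\n' || b == '\r' || b == 0x0B || b == '\f';
    }

    public static void main(String[] args) throws IOException {
        // In-memory test image: 4x2 pixels, including the byte values of '\n' and '#'.
        byte[] header = "P5\n# a comment\n4 2\n255\n".getBytes(StandardCharsets.US_ASCII);
        byte[] raster = {1, 0, (byte) 143, (byte) 255, 10, 35, 124, (byte) 133};
        byte[] file = Arrays.copyOf(header, header.length + raster.length);
        System.arraycopy(raster, 0, file, header.length, raster.length);

        for (int[] row : read(new ByteArrayInputStream(file))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For a file on disk, the same method could be called as PgmReader.read(new FileInputStream("image.pgm")). A header whose maxvalue line ends in a Windows line ending would leave one stray byte in front of the pixel data, so that case still needs handling.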
