Tesseract and OpenCV give an error: Image too large

Posted on 2025-01-22 03:13:05

I have a Spring Boot application running in a Docker container that has Tesseract installed.

In the Java program, I am using OpenCV to preprocess an image as follows:

MatOfByte mat = new MatOfByte(myByteArraySource);
Mat adaptive = new Mat();
Imgproc.adaptiveThreshold(mat, adaptive, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, 13, 7);

// convert to BufferedImage
MatOfByte matOfByte = new MatOfByte();
Imgcodecs.imencode(".png", adaptive, matOfByte);
BufferedImage bf = ImageIO.read(new ByteArrayInputStream(matOfByte.toArray()));

// run tesseract
tesseract.doOCR(bf);

However, the call to tesseract.doOCR(bf) fails with this error:
Image too large: (1, 146327)

Any ideas what I am doing wrong?
What's strange is that the file size is only 146 KB, so I don't know why Tesseract considers it too large.

Also, if I remove the adaptiveThreshold step and perform imencode on mat directly, the Tesseract scan works.

I have tried both openjdk:11 and openjdk:8-jdk-alpine; both give the same error.

Any help is appreciated.
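
A possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the reported size (1, 146327) is suspiciously close to the 146 KB file size. new MatOfByte(myByteArraySource) wraps the still-encoded file bytes as a one-column matrix; it does not decode them into pixels. If that is what happens here, adaptiveThreshold runs over a strip that is 1 pixel wide and 146327 pixels tall, and Tesseract appears to reject images whose width or height exceeds a 16-bit limit. In the OpenCV pipeline, the usual way to get real pixels would be to pass mat through Imgcodecs.imdecode(mat, Imgcodecs.IMREAD_GRAYSCALE) and feed the result to adaptiveThreshold. The snippet below uses only the JDK (no OpenCV) to illustrate the difference in shape between encoded bytes and a decoded image; the class name, the helper names, and the 640x480 sample image are hypothetical and stand in for myByteArraySource.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

public class EncodedVsDecoded {

    /** Builds a small in-memory PNG (hypothetical stand-in for myByteArraySource). */
    static byte[] samplePng(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(img, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Shape {width, height} when the encoded bytes are treated as a single column of pixels. */
    static int[] shapeAsRawColumn(byte[] encoded) {
        return new int[] {1, encoded.length};
    }

    /** Shape {width, height} after the bytes are actually decoded into an image. */
    static int[] shapeAfterDecoding(byte[] encoded) {
        try {
            BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(encoded));
            if (decoded == null) {
                throw new IllegalArgumentException("Unsupported or corrupt image data");
            }
            return new int[] {decoded.getWidth(), decoded.getHeight()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = samplePng(640, 480);
        int[] raw = shapeAsRawColumn(encoded);
        int[] decoded = shapeAfterDecoding(encoded);
        // The raw shape depends on the encoded byte count; the decoded shape is the real image size.
        System.out.println("raw column: (" + raw[0] + ", " + raw[1] + ")");
        System.out.println("decoded:    (" + decoded[0] + ", " + decoded[1] + ")");
    }
}
```

If the decoded dimensions look sensible while the thresholded Mat reports 1 column (adaptive.cols() and adaptive.rows() can be logged to check), that would support the explanation above.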
