Mathematica's TextRecognize is not up to par

Posted on 2024-12-27 20:26:19

Please take a look at the screenshot below and see if you can tell me why this won't work. The examples on the reference page for TextRecognize look pretty impressive, so I don't think recognizing single letters like this should be a problem. I've tried resizing the letters as well as sharpening the image.

For convenience, in case you want to try this yourself, I have included the image I use at the bottom of this post. You can also find plenty more like it by searching for "Wordfeud" in Google Image Search.

Mathematica screenshot

Wordfeud board

Comments (3)

彡翼 2025-01-03 20:26:19

Very cool question!

TextRecognize uses heuristics to recognize whole words from the English language. This is the gotcha that makes recognizing single letters very hard.

Consider the following line of thought:

s = Import["https://i.sstatic.net/JHYuh.png"];
p = ImagePartition[s, 32]

Now pick letters to form the English word 'EXIT':

x = {p[[1, 13]], p[[6, 6]], p[[3, 13]], p[[1, 12]]}

Now clean up these images a bit, like so:

d = ImageAssemble[ Map[ImageTake[#, {3, 27}, {2, 20}] &, x ]];

Then this returns the string "EXIT":

TextRecognize[d]

Mathematica graphics
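The steps above can be wrapped into a small helper so that other tile combinations are easy to try. This is only a sketch: recognizeTiles is a hypothetical name, and the 32-pixel tile size and the crop ranges {3, 27} and {2, 20} are simply the values used in this answer, which may need adjusting for other screenshots.

```mathematica
(* Hypothetical helper: crop the tiles at the given {row, column} positions,
   join them into one row, and run TextRecognize on the assembled strip. *)
recognizeTiles[tiles_, positions_List] := Module[{cropped},
  (* Same crop as above: drop the tile border and the small score digit *)
  cropped = ImageTake[Extract[tiles, #], {3, 27}, {2, 20}] & /@ positions;
  (* Wrapping the list in {} makes it a one-row grid for ImageAssemble *)
  TextRecognize[ImageAssemble[{cropped}]]
]

s = Import["https://i.sstatic.net/JHYuh.png"];
p = ImagePartition[s, 32];

(* The same four tiles that were picked by hand above *)
recognizeTiles[p, {{1, 13}, {6, 6}, {3, 13}, {1, 12}}]
```

Because the recognizer favors whole English words, tile sequences that do not spell a word may still come back empty or garbled.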

爱人如己 2025-01-03 20:26:19

This is an approach completely different from using TextRecognize, so I am posting it as a separate answer. It uses the same image recognition technique as in "How do I find Waldo with Mathematica".

First get the puzzle:

wordfeud = Import["https://i.sstatic.net/JHYuh.png"]

Mathematica graphics

And then get the pieces of the puzzle:

Grid[pieces = ImagePartition[wordfeud, 32]]

Mathematica graphics

Suppose we are interested in the letter E:

LetterE = pieces[[4, 3]]

Mathematica graphics

Get the correlation image:

correlation = 
 ImageCorrelate[wordfeud, Binarize[LetterE], 
 NormalizedSquaredEuclideanDistance]

Mathematica graphics

And highlight the matches:

positions = Dilation[ColorNegate[Binarize[correlation, .1]], DiskMatrix[20]];
found = ImageMultiply[wordfeud, ImageAdd[ColorConvert[positions, "GrayLevel"], .5]]

Mathematica graphics

As before, this requires a bit of tuning of the threshold used to binarize the correlation image, but other than that this should help to identify bits and pieces of this puzzle.
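To go from highlighted regions to actual board positions, the thresholded correlation image can be reduced to one centroid per match. The following is a rough sketch rather than a tested recipe: it assumes the 32-pixel tile size and the 0.1 threshold used above, and the names mask, centroids and toTile are made up for illustration.

```mathematica
wordfeud = Import["https://i.sstatic.net/JHYuh.png"];
pieces = ImagePartition[wordfeud, 32];
letterE = pieces[[4, 3]];

correlation = ImageCorrelate[wordfeud, Binarize[letterE],
   NormalizedSquaredEuclideanDistance];

(* Small distances mean good matches, so negate the thresholded image *)
mask = ColorNegate[Binarize[correlation, 0.1]];

(* One centroid per connected blob of matching pixels *)
centroids = Last /@ ComponentMeasurements[MorphologicalComponents[mask], "Centroid"];

(* Image coordinates have their origin at the bottom left, so flip y
   to obtain {row, column} indices that line up with ImagePartition *)
{w, h} = ImageDimensions[wordfeud];
toTile[{x_, y_}] := {Ceiling[(h - y)/32], Ceiling[x/32]};

Union[toTile /@ centroids]
```

Repeating this with a template tile for each letter would, in principle, give a map of the whole board, although each template may need its own threshold.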

思念满溢 2025-01-03 20:26:19

I thought the quality of your image might be interfering. Binarizing your image did not help: recognition was zilch. I also tried a very sharp black-and-white image of a crossword puzzle solution (see below). Again, nothing was recognized, whether in regular or binarized format.

crossword solution

So I removed the black background leaving only the letters and their thin black frames. Again, recognition was about 0%.

When I removed the frames from around some of the letters AND binarized the image the only parts that were recognizable were those regions in which there was nothing but letters. (see below)

crossword 2

Notice in the output below, ANTS, TIRES, and TEXAS are correctly identified (as well as VECTORS), but just about nothing else.

Notice also that, even though the strings were widely spaced, mma interpreted them as words, rather than separate letters. Note "TEXAS" instead of "T E X A S".

TextRecognize[Binarize@img]

(* output *)
ANTS FFWWW FEEWF
E R o If IU I?
E A FI5F WWWFF 5
5552? L E F F
T s E NTT BT|
H0RWW@0WVlWF;EE F
5 W E   ; OCS
FOFT W W R AL%AE
A TT I T ? _
i iE@W'NF WG%S W
A A EW F I i
SWWTW W ALTFCWD N
H A V 5 A F F
PLATT EWWLIGHT
W N E T
HE TIRES C
TEXAS VECTORS

I didn't have the patience to completely clean up the image. It would have been much faster to retype the text by hand.

Conclusion: Don't use text recognition in mma unless you have absolutely clear text against an even-colored, bright, preferably white, background.

The results also varied depending on the file format used. Avoid .pdf altogether.
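A minimal preprocessing sketch along these lines is shown here. The helper name prepareForOCR, the scale factor of 3 and the 20-pixel white margin are illustrative guesses, not values from this answer, and img stands for whatever image is being tested.

```mathematica
(* Hypothetical helper: enlarge the image, reduce it to black and white,
   and surround it with a plain white margin before recognition. *)
prepareForOCR[img_Image, scale_: 3] := Module[{gray, big, bw},
  gray = ColorConvert[img, "Grayscale"];
  big = ImageResize[gray, Scaled[scale]];  (* larger glyphs tend to be easier to read *)
  bw = Binarize[big];                      (* automatic global threshold *)
  ImagePad[bw, 20, White]
]

TextRecognize[prepareForOCR[img]]
```

If the text is light on a dark background, applying ColorNegate before padding may be needed. As the experiments above show, binarizing alone did not rescue the framed letters, so this is no guarantee of a usable result.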



Edit

acl captured and tried to recognize the last 5 lines (above Edit). His results (in a comment below): mostly gibberish.

I decided to do the same. But since Prashant warned that text size makes a difference, I zoomed in first so that the text appeared (to my eyes) to be about 20 pica. Below is the picture of the text I scanned and ran through TextRecognize.


text2


Here's the result of an unbinarized TextRecognize (at that large size):

Gliii. Q lk-ii`t`*¥ if EY £\[CloseCurlyDoubleQuote]1\[Euro]'EE \
Di'¥C~E\"P ITF SKI' T»f}!E'!',IL:?E\[CloseCurlyDoubleQuote] I 2 VEEE5\
\[CloseCurlyQuote] LEP \"- \"VE
1. ur e=\\..r.1.»».»\\\\ rw r 1»»\\|a'*r | r .fm -»'-an \
\[OpenCurlyQuote] -.-rr -_.»~|-.'i~-.w~,.-- nv n.w~»-\
\[OpenCurlyDoubleQuote]~"

Now, here's the result for the TextRecognize of the binarized image. The original image was a .png from Jing.

I didn't have the patience to completely clean up the image. It would \
have been much faster to retype the
text by hand.
Conclusion: Don't use text recognition in mma unless you have \
absolutely clear text against an even-
colored, bright, preferrably white, background.
The results also varied depending on the file format used. Avoid .pdf \
altogether. 