Recognizing Boggle/Scrabble letters from an image

Posted 2024-11-16 21:28:25


Comments (3)

老旧海报 2024-11-23 21:28:25

It depends on how fast you need to be.
If you can isolate the square of the letter and rotate it so that the sides of the square containing the letter are horizontal and vertical then I would suggest you:

  • convert the images to black/white (with the letter one colour and the rest of the die the other)
  • make a dataset of reference images of all letters in all four possible orientations (i.e. upright and rotated 90, 180 and 270 degrees)
  • use a template matching function such as cvMatchTemplate to find the best matching image from your dataset for each new image.

This will take a bit of time to run, and there is room for optimisation, but I think it will get you a reasonable result.
If getting them in a proper orientation is difficult you could also generate rotated versions of your new input on the fly and match those to your reference dataset.
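A minimal sketch of the steps above, written in pure Python so it runs without OpenCV. Everything in it is an assumption made for illustration: the function names, the 0/1 list-of-rows image format, the threshold of 128 (dark letter on a light die), and the toy 3x3 "glyphs". The `agreement` function is a simple stand-in for the template-matching step, which in a real pipeline would be done by OpenCV's template matching function.

```python
def binarize(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255) into 0/1, with 1 = letter pixel.

    Assumes a dark letter on a light die; flip the comparison otherwise.
    """
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def rotate90(img):
    """Rotate a binary image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def orientations(img):
    """Return the image rotated by 0, 90, 180 and 270 degrees."""
    out = [img]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out


def agreement(a, b):
    """Fraction of pixels on which two same-sized binary images agree."""
    total = len(a) * len(a[0])
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def best_match(tile, references):
    """Return (letter, reference_rotation_in_degrees, score) for the best match."""
    best = None
    for letter, ref in references.items():
        for k, rotated in enumerate(orientations(ref)):
            score = agreement(tile, rotated)
            if best is None or score > best[2]:
                best = (letter, 90 * k, score)
    return best


# Toy reference "glyphs", made up for this example only.
REFERENCES = {
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}

tile = rotate90(REFERENCES["L"])  # an "L" die lying on its side
print(best_match(tile, REFERENCES))
```

Here the rotated reference set is built inside `best_match` on every call; precomputing the 26x4 reference images once would be an obvious first optimisation.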

If the letters have different scale then I can think of two options:

  • If orientation is not an issue (i.e. your Boggle block detection can also put the block in the proper orientation) then you can use the bounding box of the area that has the letter colour as a rough indicator of the scale of the incoming picture, and scale that to be the same size as the bounding box on your reference images (this might be different for each reference image)
  • If orientation is an issue then just add scaling as a parameter of your search space. So you search all rotations (0-360 degrees) and all reasonable sizes (you should probably be able to guess a reasonable range from the images you have).
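The bounding-box option can be sketched as follows, again in pure Python on 0/1 images (1 = letter pixel). The function names and the nearest-neighbour resampling are assumptions for illustration; an image library's resize would normally do this step.

```python
def bounding_box(img):
    """Return (top, left, bottom, right), inclusive, of the letter pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        raise ValueError("image contains no letter pixels")
    return rows[0], cols[0], rows[-1], cols[-1]


def crop_and_resize(img, out_h, out_w):
    """Crop to the letter's bounding box, then rescale to out_h x out_w.

    Uses nearest-neighbour sampling. out_h and out_w would be the size of
    the bounding box in the reference image being compared against.
    """
    top, left, bottom, right = bounding_box(img)
    h = bottom - top + 1
    w = right - left + 1
    return [
        [img[top + (y * h) // out_h][left + (x * w) // out_w]
         for x in range(out_w)]
        for y in range(out_h)
    ]


# A small letter-like blob somewhere inside a larger tile.
small = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

for row in crop_and_resize(small, 4, 4):
    print(row)
```

Because the target size is taken per reference image, the same incoming tile may be rescaled differently for each candidate letter before it is compared.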

咋地 2024-11-23 21:28:25

You can use a simple OCR like Tesseract. It is simple to use and is quite fast. You'll have to do the 4 rotations though (as mentioned in @jilles de wit's answer).
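A rough sketch of the "four rotations" loop. The `recognize` argument is a placeholder for whatever OCR call you use (for example a wrapper around Tesseract restricted to a single character); that wrapper is not shown, and its `(text, confidence)` return shape is an assumption. The stub recognizer below exists only so the example runs.

```python
def rotate90(img):
    """Rotate an image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def read_letter(tile, recognize):
    """Try the tile at 0/90/180/270 degrees and keep the most confident read.

    `recognize` is a hypothetical callable: image -> (text, confidence).
    Returns (text, clockwise_rotation_applied_to_the_tile).
    """
    best_text, best_angle, best_conf = None, None, float("-inf")
    current = tile
    for k in range(4):
        text, conf = recognize(current)
        if conf > best_conf:
            best_text, best_angle, best_conf = text, 90 * k, conf
        current = rotate90(current)
    return best_text, best_angle


# Stub OCR for demonstration: only "recognizes" one upright pattern.
UPRIGHT_A = [[1, 0],
             [0, 0]]

def fake_ocr(img):
    return ("A", 0.9) if img == UPRIGHT_A else ("?", 0.1)


# A tile that needs one more clockwise quarter turn to be upright.
tile = rotate90(rotate90(rotate90(UPRIGHT_A)))
print(read_letter(tile, fake_ocr))
```

Picking the rotation by OCR confidence is a design choice of this sketch; letters that look valid in several orientations (such as N and Z, or M and W) may still need game-specific handling.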

神也荒唐 2024-11-23 21:28:25

I made an iOS app that does just this, based on OpenCV. It's called SnapSolve. I wrote a blog post about how the detection works.
Basically, I overlay all 26x4 possible letters + rotations on each shape, and see which letter overlaps most. A little tweak to this is to smooth the overlay image, to get rid of artefacts where letters almost overlap but not quite.
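To illustrate why smoothing helps, here is a small pure-Python sketch. It is not SnapSolve's actual code: the 3x3 box blur, the sum-of-products overlap score, and the toy stroke images are all assumptions chosen to show the effect.

```python
def box_blur(img):
    """3x3 mean filter; pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
            row.append(total / 9)
        out.append(row)
    return out


def overlap(shape, template):
    """Sum of pixel-wise products of two same-sized images."""
    return sum(s * t for rs, rt in zip(shape, template) for s, t in zip(rs, rt))


def column(n, c):
    """n x n image with a vertical stroke in column c."""
    return [[1 if x == c else 0 for x in range(n)] for _ in range(n)]


def top_row(n):
    """n x n image with a horizontal stroke along the top row."""
    return [[1 if y == 0 else 0 for _ in range(n)] for y in range(n)]


shape = column(5, 2)   # detected stroke, one pixel off from the template
bar = column(5, 1)     # template for a vertical stroke
dash = top_row(5)      # template for a horizontal stroke

# Without smoothing, the near miss scores 0 and the wrong template wins.
print(overlap(shape, bar), overlap(shape, dash))

# With smoothing, the near miss is rewarded and the right template wins.
soft = box_blur(shape)
print(overlap(soft, bar) > overlap(soft, dash))
```

A real implementation would also need to normalise the score, since templates with more ink would otherwise tend to score higher.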
