How do I find Waldo with Mathematica?

Posted 2024-12-20 14:55:53

This was bugging me over the weekend: What is a good way to solve those Where's Waldo? ['Wally' outside of North America] puzzles, using Mathematica (image-processing and other functionality)?

Here is what I have so far: a function that reduces the visual complexity a little by dimming some of the non-red colors:

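whereIsWaldo[url_] := Module[{waldo, waldo2, waldoMask},
    waldo = Import[url];
    (* white where a pixel is strongly red, black everywhere else *)
    waldo2 = Image[ImageData[waldo] /. {
        {r_, g_, b_} /; Not[r > .7 && g < .3 && b < .3] :> {0, 0, 0},
        {r_, g_, b_} /; (r > .7 && g < .3 && b < .3) :> {1, 1, 1}}];
    (* close small gaps in the red mask *)
    waldoMask = Closing[waldo2, 4];
    (* overlay the mask on the original image at 50% opacity *)
    ImageCompose[waldo, {waldoMask, .5}]
]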

And an example of a URL where this 'works':

whereIsWaldo["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]

(Waldo is by the cash register):

Original image

Mathematica graphic
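
The red test in whereIsWaldo can also be written channel by channel instead of rewriting the nested pixel lists. This is only a minimal sketch, assuming an RGB image; the helper name redMask and the synthetic test image are illustrative and not part of the question, and pixels that sit exactly on a threshold may be classified differently from the original rules.

```mathematica
(* redMask is a hypothetical helper: 1 where r > .7 and g, b are at most .3 *)
redMask[img_Image] := Module[{r, g, b},
  {r, g, b} = ColorSeparate[ColorConvert[RemoveAlphaChannel[img], "RGB"]];
  ImageMultiply[
   ImageMultiply[Binarize[r, .7], ColorNegate[Binarize[g, .3]]],
   ColorNegate[Binarize[b, .3]]]]

(* synthetic 1x3 image: pure red, orange, white *)
test = Image[{{{1, 0, 0}, {1, .5, 0}, {1, 1, 1}}}];
mask = redMask[test];
ImageData[mask]   (* {{1, 0, 0}}: only the pure red pixel passes *)
```

The resulting mask could then go through Closing and ImageCompose in the same way as waldo2 does in the function above.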

奶气 2024-12-27 14:55:53

I've found Waldo!

waldo had been found

How I've done it

First, I'm filtering out all colours that aren't red

waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"];
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo];
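
To see what the Fold expression computes, it may help to run it on a tiny synthetic image. The sketch below is an illustration under assumptions, not part of the answer: the 1x3 "Byte" image and the names redFold and redExplicit are made up. As far as I know, negative differences are clipped to 0 for integer-typed images such as an imported JPEG, so red pixels stay bright while white pixels go dark.

```mathematica
(* synthetic 1x3 "Byte" image: pure red, white, orange *)
img = Image[{{{255, 0, 0}, {255, 255, 255}, {255, 128, 0}}}, "Byte"];
{r, g, b} = ColorSeparate[img];

(* Fold applies ImageSubtract from left to right: (r - g) - b *)
redFold = Fold[ImageSubtract, r, {g, b}];
redExplicit = ImageSubtract[ImageSubtract[r, g], b];

(* both forms give the same pixel values *)
ImageData[redFold] == ImageData[redExplicit]

(* red scores highest, orange in between, white lowest *)
ImageData[redFold]
```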

Next, I'm calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt.

corr = ImageCorrelate[red, 
   Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]], 
   NormalizedSquaredEuclideanDistance];
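
The behaviour of this template can be checked on synthetic stripes. The following is a sketch under assumptions: the 32x32 striped test image and the variable names are invented for illustration. Because NormalizedSquaredEuclideanDistance is a distance, a good match gives a value near 0, which is why the result is inverted after thresholding.

```mathematica
(* the 4x4 template from the answer: two white rows above two black rows *)
kernel = Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]];

(* synthetic grayscale image: horizontal stripes, each 4 pixels tall *)
stripes = Image[Table[If[EvenQ[Quotient[i - 1, 4]], 1., 0.], {i, 32}, {j, 32}]];

(* per-pixel distance between each 4x4 neighbourhood and the template *)
corr = ImageCorrelate[stripes, kernel, NormalizedSquaredEuclideanDistance];

(* low distance means a good match, so threshold and invert *)
matches = ColorNegate[Binarize[corr, .12]];

(* number of pixels flagged as white-to-black transitions *)
Total[ImageData[matches], 2]
```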

I use Binarize to pick out the pixels in the image with a sufficiently high correlation, then use Dilation to draw white circles around them for emphasis:

pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]];

I had to play around a little with the level. If the level is too high, too many false positives are picked out.
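
One way to make that trial and error less manual is to count how many separate regions survive at each level. This is a rough sketch, not the answer's method: countHits is a hypothetical helper, and the distance map is synthetic (one strong match at 0.02, one weak match at 0.2, background 0.5).

```mathematica
(* countHits is a hypothetical helper: number of connected match regions at level t *)
countHits[corr_Image, t_] :=
  Max[MorphologicalComponents[ColorNegate[Binarize[corr, t]]]]

(* synthetic distance map: mostly 0.5, one strong and one weak match *)
data = ConstantArray[.5, {20, 20}];
data[[5, 5]] = .02;
data[[15, 15]] = .2;
corr = Image[data];

Table[{t, countHits[corr, t]}, {t, {.05, .12, .3}}]
(* {{0.05, 1}, {0.12, 1}, {0.3, 2}}: the weak match only appears at the high level *)
```

On a real scene, a sudden jump in the count as the level rises would be a hint that false positives are starting to creep in.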

Finally, I combine this result with the original image to get the result above:

found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]


栖迟 2024-12-27 14:55:53

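My guess at a "bulletproof" way to do this (think of the CIA finding Waldo in any satellite image at any time, not just in a single image with no competing elements such as other striped shirts): I would train a Boltzmann machine on many images of Waldo, covering all the variations of him sitting, standing, occluded, and so on; shirt, hat, camera, and all the works. You don't need a large corpus of Waldos (maybe 3-5 would be enough), but the more the better.

This will assign clouds of probabilities to the various elements appearing in the correct arrangement. Then establish (via segmentation) what the average object size is, and fragment the source image into cells of objects that most resemble individual people (allowing for possible occlusion and pose changes). Since Waldo pictures usually contain a lot of people at about the same scale, this should be a very easy task. Then feed these segments to the pre-trained Boltzmann machine. It will give you the probability of each one being Waldo. Take the one with the highest probability.

This is how OCR, ZIP code readers, and strokeless handwriting recognition work today. Basically, you know the answer is in there, you know more or less what it should look like, and everything else may share common elements but is definitely "not it". So you don't bother with the "not it"s; you just look at the likelihood of "it" among all the possible "it"s you have seen before. (For ZIP codes, for example, you would train one BM on just 1s, one on just 2s, one on just 3s, and so on, then feed each digit to each machine and pick the one with the most confidence.) This works a lot better than a single neural network learning the features of all the digits.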


浅沫记忆 2024-12-27 14:55:53

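I agree with @GregoryKlopper that the right way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image is to train a supervised machine learning classifier. Given many positively and negatively labeled examples, an algorithm such as a Support Vector Machine, Boosted Decision Stump, or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. Mathematica even includes these algorithms in its machine learning framework.

The two challenges in training a Waldo classifier are:

1. Determining the right image feature transform. This is where @Heike's answer is useful: a red filter and a stripe pattern detector (e.g., a wavelet or DCT decomposition) would be a good way to turn raw pixels into a format the classification algorithm can learn from. A block-based decomposition that assesses all subsections of the image would also be required, but this is made easier by the fact that Waldo is (a) always roughly the same size and (b) always present exactly once in each image.
2. Obtaining enough training examples. SVMs work best with at least 100 examples of each class. Commercial applications of boosting (e.g., face-focusing in digital cameras) are trained on millions of positive and negative examples.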
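
A toy version of the block-based pipeline might look like the sketch below. Everything in it is an assumption made for illustration: the patches are synthetic solid colours with noise rather than real Waldo crops, patchFeatures and mkPatch are hypothetical helpers, the per-channel mean is a stand-in for a proper stripe-aware feature transform, and it assumes a Mathematica version whose Classify supports Method -> "SupportVectorMachine".

```mathematica
(* patchFeatures is a hypothetical feature transform: mean of each colour channel *)
patchFeatures[patch_Image] := ImageMeasurements[ColorConvert[patch, "RGB"], "Mean"]

(* synthetic stand-ins for labelled patches: reddish = "waldo", bluish = "other" *)
SeedRandom[1];
mkPatch[col_] := Image[ConstantArray[col, {8, 8}] + RandomReal[{-.05, .05}, {8, 8, 3}]]
pos = Table[mkPatch[{.9, .1, .1}], {20}];
neg = Table[mkPatch[{.1, .1, .9}], {20}];

training = Join[
   (patchFeatures[#] -> "waldo") & /@ pos,
   (patchFeatures[#] -> "other") & /@ neg];

classifier = Classify[training, Method -> "SupportVectorMachine"];

(* split a scene into 8x8 blocks and classify each block *)
scene = ImageAssemble[{{mkPatch[{.1, .1, .9}], mkPatch[{.9, .1, .1}]}}];
labels = classifier /@ (patchFeatures /@ Flatten[ImagePartition[scene, 8]])
```

With real data, the blocks would come from labelled puzzle scans, and the block size would be chosen to match Waldo's roughly constant size.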

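A quick Google image search turns up some good data. I'm going to have a go at collecting some training examples and coding this up right now!

However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle with an image like the Land of Waldos!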


风吹短裙飘 2024-12-27 14:55:53

I don't know Mathematica . . . too bad. But I like the answer above, for the most part.

Still there is a major flaw in relying on the stripes alone to glean the answer (I personally don't have a problem with one manual adjustment). There is an example (listed by Brett Champion, here) presented which shows that they, at times, break up the shirt pattern. So then it becomes a more complex pattern.

I would try an approach of shape id and colors, along with spatial relations. Much like face recognition, you could look for geometric patterns at certain ratios from each other. The caveat is that usually one or more of those shapes is occluded.

Get a white balance on the image, and a red balance from the image. I believe Waldo is always the same value/hue, but the image may be from a scan or a bad copy. Then always refer to an array of the colors that Waldo actually is: red, white, dark brown, blue, peach, {shoe color}.
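
The reference colour array could be sketched like this. The RGB values are guesses for illustration only, not measured from the books, the shoe colour is left out because it is unknown, and nearestWaldoColor is a hypothetical helper built on ColorDistance.

```mathematica
(* hypothetical reference palette; the RGB values are guesses, not measurements *)
waldoColors = <|
   "red" -> RGBColor[.85, .1, .1],
   "white" -> RGBColor[1, 1, 1],
   "dark brown" -> RGBColor[.3, .2, .1],
   "blue" -> RGBColor[.2, .4, .8],
   "peach" -> RGBColor[1, .8, .65]|>;

(* nearestWaldoColor is a hypothetical helper: name of the closest palette entry *)
nearestWaldoColor[c_] :=
  First[MinimalBy[Keys[waldoColors], ColorDistance[c, waldoColors[#]] &]]

(* a slightly off red, as might come from a bad scan *)
nearestWaldoColor[RGBColor[.9, .15, .2]]
```

Matching against named reference colours rather than fixed thresholds is one way to tolerate the scan and copy variations mentioned above.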

There is a shirt pattern, and also the pants, glasses, hair, face, shoes and hat that define Waldo. Also, relative to other people in the image, Waldo is on the skinny side.

So, find random people to obtain the height of people in this picture. Measure the average height of a bunch of things at random points in the image (a simple outline will produce quite a few individual people). If the things are not within some standard deviation of each other, they are ignored for now. Compare the average height to the image's height. If the ratio is too great (e.g., 1:2, 1:4, or similarly close), then try again. Run it 10(?) times to make sure that the samples are all pretty close together, excluding any average that is outside some standard deviation. Possible in Mathematica?

This is your Waldo size. Waldo is skinny, so you are looking for something 5:1 or 6:1 (or whatever) ht:wd. However, this is not sufficient. If Waldo is partially hidden, the height could change. So, you are looking for a block of red-white that is ~2:1. But there have to be more indicators.
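
Measuring blobs and filtering them by height-to-width ratio does look possible with ComponentMeasurements. A minimal sketch on a synthetic binary mask, assuming the people have already been segmented somehow; the mask, the helper aspect, and the 5:1 cutoff are illustrative choices.

```mathematica
(* synthetic binary mask: a tall thin blob (4 wide, 20 tall) and a wide blob (20 wide, 5 tall) *)
data = ConstantArray[0, {40, 60}];
data[[5 ;; 24, 10 ;; 13]] = 1;
data[[30 ;; 34, 30 ;; 49]] = 1;
mask = Image[data, "Bit"];

(* bounding box per connected component: label -> {{xmin, ymin}, {xmax, ymax}} *)
boxes = ComponentMeasurements[MorphologicalComponents[mask], "BoundingBox"];

(* aspect is a hypothetical helper: height divided by width of a bounding box *)
aspect[{{x1_, y1_}, {x2_, y2_}}] := (y2 - y1)/(x2 - x1)

(* keep components that are roughly 5:1 or taller *)
tall = Select[boxes, aspect[Last[#]] >= 5 &]
```

The same bounding boxes could feed the average-height estimate described above, for example by taking the mean and standard deviation of the box heights.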

1. Waldo has glasses. Search for two circles 0.5:1 above the red-white.
2. Blue pants. Any amount of blue at the same width within any distance between the end of the red-white and the distance to his feet. Note that he wears his shirt short, so the feet are not too close.
3. The hat. Red-white any distance up to twice the top of his head. Note that it must have dark hair below, and probably glasses.
4. Long sleeves. Red-white at some angle from the main red-white.
5. Dark hair.
6. Shoe color. I don't know the color.

Any of those could apply. These are also negative checks against similar people in the pic -- e.g., #2 negates wearing a red-white apron (too close to shoes), #5 eliminates light colored hair. Also, shape is only one indicator for each of these tests . . . color alone within the specified distance can give good results.

This will narrow down the areas to process.

Storing these results will produce a set of areas that should have Waldo in them. Exclude all other areas (e.g., for each area, select a circle twice as big as the average person size), and then run the process that @Heike laid out with removing all but red, and so on.

Any thoughts on how to code this?

Edit:

Thoughts on how to code this . . . exclude all areas but Waldo red, skeletonize the red areas, and prune them down to a single point. Do the same for Waldo hair brown, Waldo pants blue, Waldo shoe color. For Waldo skin color, exclude, then find the outline.

Next, exclude non-red, dilate (a lot) all the red areas, then skeletonize and prune. This part will give a list of possible Waldo center points. This will be the marker to compare all other Waldo color sections to.

From here, using the skeletonized red areas (not the dilated ones), count the lines in each area. If there is the correct number (four, right?), this is certainly a possible area. If not, I guess just exclude it (as being a Waldo center . . . it may still be his hat).
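
A rough sketch of the centre-point and line-counting steps follows. Note the substitution: instead of skeletonizing and pruning, it uses component centroids for the centre points and counts connected components for the lines, which seemed simpler to get right. The synthetic mask, the dilation radius of 3, and the helper stripeCount are assumptions for illustration.

```mathematica
(* synthetic red mask: one blob of four horizontal stripes, plus one lone stripe *)
data = ConstantArray[0, {60, 60}];
Do[data[[r ;; r + 1, 10 ;; 20]] = 1, {r, {10, 14, 18, 22}}];
data[[45 ;; 46, 40 ;; 50]] = 1;
stripes = Image[data, "Bit"];

(* dilate so that neighbouring stripes fuse into one candidate region *)
regions = MorphologicalComponents[Dilation[stripes, 3]];

(* one candidate centre point per fused region: label -> {x, y} *)
centres = ComponentMeasurements[regions, "Centroid"];

(* stripeCount is a hypothetical helper: stripes of the original mask inside region k *)
stripeCount[k_] := Max[MorphologicalComponents[
    Image[ImageData[stripes] (1 - Unitize[regions - k]), "Bit"]]]

(* keep only regions with the expected four stripes *)
candidates = Select[centres, stripeCount[First[#]] == 4 &]
```

Regions that fail the count are only ruled out as shirt centres here; as noted above, they could still be the hat.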

Then check if there is a face shape above, a hair point above, pants point below, shoe points below, and so on.

No code yet -- still reading the docs.
