A novel fitness measure for an evolutionary image-matching simulation

Posted on 2024-07-18 12:19:53

I'm sure many people have already seen demos of using genetic algorithms to generate an image that matches a sample image. You start off with noise, and gradually it comes to resemble the target image more and more closely, until you have a more-or-less exact duplicate.

All of the examples I've seen, however, use a fairly straightforward pixel-by-pixel comparison, resulting in a fairly predictable 'fade in' of the final image. What I'm looking for is something more novel: A fitness measure that comes closer to what we see as 'similar' than the naive approach.

I don't have a specific result in mind - I'm just looking for something more 'interesting' than the default. Suggestions?


Comments (4)

溺渁∝ 2024-07-25 12:19:53

I assume you're talking about something like Roger Alsing's program.

I implemented a version of this, so I'm also interested in alternative fitness functions, though I'm coming at it from the perspective of improving performance rather than aesthetics. I expect there will always be some element of "fade-in" due to the nature of the evolutionary process (though tweaking the evolutionary operators may affect how this looks).

A pixel-by-pixel comparison can be expensive for anything but small images. For example, the 200x200 pixel image I use has 40,000 pixels. With three values per pixel (R, G and B), that's 120,000 values that have to be incorporated into the fitness calculation for a single image. In my implementation I scale the image down before doing the comparison so that there are fewer pixels. The trade-off is slightly reduced accuracy of the evolved image.
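
For what it's worth, a minimal sketch of that downscale-then-compare fitness might look like the following, assuming Pillow and NumPy are available; the 50x50 comparison size is an arbitrary choice, not a value from this answer.

```python
# Minimal sketch of the downscale-then-compare fitness described above.
# Assumes Pillow and NumPy; the 50x50 comparison size is an arbitrary choice.
import numpy as np
from PIL import Image

COMPARE_SIZE = (50, 50)  # fewer pixels => cheaper fitness evaluations

def to_array(img):
    """Downscale and convert to a float RGB array."""
    return np.asarray(img.convert("RGB").resize(COMPARE_SIZE), dtype=np.float64)

def pixel_fitness(candidate, target_arr):
    """Sum of squared per-channel differences against the target; lower is better."""
    diff = to_array(candidate) - target_arr
    return float(np.sum(diff * diff))

# target_arr = to_array(Image.open("target.png"))  # compute once, reuse per candidate
# score = pixel_fitness(candidate_image, target_arr)
```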

In investigating alternative fitness functions I came across some suggestions to use the YUV colour space instead of RGB since this is more closely aligned with human perception.
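
As an illustration only: Pillow exposes YCbCr, a close relative of YUV, so a perceptually weighted variant of the above could look like this. The channel weights are my own guesses, not values from this answer.

```python
# Sketch of the YUV idea using Pillow's YCbCr mode (a close relative of YUV).
# The weights are illustrative guesses: luma (Y) errors count more than chroma.
import numpy as np
from PIL import Image

WEIGHTS = np.array([2.0, 1.0, 1.0])  # assumed weights for Y, Cb, Cr

def ycbcr_fitness(candidate, target):
    a = np.asarray(candidate.convert("YCbCr"), dtype=np.float64)
    b = np.asarray(target.convert("YCbCr"), dtype=np.float64)
    return float(np.sum(((a - b) ** 2) * WEIGHTS))  # lower is better
```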

Another idea that I had was to compare only a randomly selected sample of pixels. I'm not sure how well this would work without trying it. Since the pixels compared would be different for each evaluation it would have the effect of maintaining diversity within the population.
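
A rough sketch of that sampling idea, assuming the candidate and target are same-shaped NumPy RGB arrays; the sample size is an arbitrary assumption.

```python
# Sketch of the random-sample idea: compare only a subset of pixel positions,
# drawn fresh for every evaluation. SAMPLE_SIZE is an arbitrary assumption.
import numpy as np

SAMPLE_SIZE = 1000
rng = np.random.default_rng()

def sampled_fitness(candidate, target):
    h, w, _ = target.shape
    ys = rng.integers(0, h, SAMPLE_SIZE)  # a different sample each call,
    xs = rng.integers(0, w, SAMPLE_SIZE)  # which adds noise to selection
    diff = candidate[ys, xs].astype(np.float64) - target[ys, xs]
    return float(np.sum(diff * diff))
```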

Beyond that, you are in the realms of computer vision. I expect that these techniques, which rely on feature extraction, would be more expensive per image, but they may be faster overall if they result in fewer generations being required to achieve an acceptable result. You might want to investigate the PerceptualDiff library. Also, this page shows some Java code that can be used to compare images for similarity based on features rather than pixels.
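
Not PerceptualDiff itself, but one readily available perceptual metric along these lines is structural similarity (SSIM) from scikit-image; a sketch of using it as a fitness score, assuming same-sized 8-bit RGB arrays:

```python
# Sketch using structural similarity (SSIM) from scikit-image as a perceptual
# fitness score. This is not PerceptualDiff, just a convenient substitute.
# Assumes candidate and target are same-sized 8-bit RGB NumPy arrays.
from skimage.metrics import structural_similarity

def ssim_fitness(candidate, target):
    score = structural_similarity(candidate, target, channel_axis=-1, data_range=255)
    return 1.0 - score  # SSIM is 1.0 for identical images; lower result = better fit
```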

小红帽 2024-07-25 12:19:53

A fitness measure that comes closer to what we see as 'similar' than the naive approach.

Implementing such a measure in software is definitely nontrivial. Google 'Human vision model', 'perceptual error metric' for some starting points. You can sidestep the issue - just present the candidate images to a human for selecting the best ones, although it might be a bit boring for the human.
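
A minimal sketch of that human-in-the-loop idea, using Pillow just to display the candidates; as noted, scoring by hand gets tedious quickly.

```python
# Minimal sketch of interactive selection: show each candidate in the default
# image viewer and let a person type a score. Tedious for large populations.
def human_fitness(candidates):
    scores = []
    for i, img in enumerate(candidates):       # candidates: list of PIL images
        img.show(title=f"candidate {i}")
        scores.append(float(input(f"Score for candidate {i} (0-10): ")))
    return scores
```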

柠檬 2024-07-25 12:19:53

I haven't seen such a demo (perhaps you could link one). But a couple of proto-ideas from your description that may trigger an interesting one:

  • Three different algorithms running in parallel, perhaps RGB or HSV.
  • Move, rotate, or otherwise change the target image slightly during the run.
  • Fitness based on contrast/value differences between pixels, but without knowing the actual colour (see the sketch after this list).
  • ...then "prime" a single pixel with the correct colour?
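
Picking up the contrast/value bullet: one rough, colour-blind sketch could compare local brightness gradients of greyscale versions of the two images, e.g.:

```python
# Sketch of a colour-blind fitness: compare local brightness differences
# (gradients) of greyscale versions, so the measure never sees actual colours.
import numpy as np
from PIL import Image

def gradients(img):
    grey = np.asarray(img.convert("L"), dtype=np.float64)
    gy, gx = np.gradient(grey)  # brightness differences between neighbouring pixels
    return gy, gx

def contrast_fitness(candidate, target):
    cy, cx = gradients(candidate)
    ty, tx = gradients(target)
    return float(np.sum((cy - ty) ** 2) + np.sum((cx - tx) ** 2))  # lower is better
```
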
怼怹恏 2024-07-25 12:19:53

I would agree with other contributors that this is non-trivial. I'd also add that it would be very valuable commercially - for example, companies who wish to protect their visual IP would be extremely happy to be able to trawl the internet looking for similar images to their logos.

My naïve approach to this would be to train a pattern recognizer on a number of images, each generated from the target image with one or more transforms applied to it: e.g. rotated a few degrees either way; translated a few pixels either way; different scales of the same image; various blurs and effects (convolution masks are good here). I would also add some random noise to each of the images. The more samples the better.
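
A sketch of that sample-generation step with Pillow and NumPy; the transform ranges and the noise level are illustrative assumptions.

```python
# Sketch of generating training variants of the target image: small rotation,
# translation, rescale, blur, plus random noise. All ranges are assumptions.
import random
import numpy as np
from PIL import Image, ImageFilter

def make_variant(target):
    img = target.convert("RGB")
    img = img.rotate(random.uniform(-10, 10),                        # small rotation
                     translate=(random.randint(-5, 5), random.randint(-5, 5)))
    scale = random.uniform(0.9, 1.1)                                 # small rescale
    img = img.resize((int(img.width * scale), int(img.height * scale)))
    img = img.resize(target.size)                                    # back to original size
    img = img.filter(ImageFilter.GaussianBlur(random.uniform(0, 2)))  # mild blur
    noisy = np.asarray(img, dtype=np.float64) + np.random.normal(0, 8, (img.height, img.width, 3))
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

# training_set = [make_variant(target_image) for _ in range(500)]
```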

The training can all be done off-line, so it shouldn't cause a problem with runtime performance.

Once you've got a pattern recognizer trained, you can point it at the GA population images and get a scalar score out of the recognizer.

Personally, I like Radial Basis Networks. Quick to train. I'd start with far too many inputs, and whittle them down with principal component analysis (IIRC). The outputs could just be a similarity measure and dissimilarity measure.
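
A rough sketch of that recipe, assuming scikit-learn: PCA to shrink the flattened pixel inputs, then a small radial basis function layer (Gaussian units around k-means centres) with a linear read-out. All sizes and the kernel width are illustrative guesses.

```python
# Rough sketch of a radial basis function network with PCA-reduced inputs.
# Sizes and kernel width are illustrative guesses, not recommendations.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

class RBFNet:
    def __init__(self, n_components=50, n_centres=30, width=1.0):
        self.pca = PCA(n_components=n_components)    # whittle down the inputs
        self.kmeans = KMeans(n_clusters=n_centres, n_init=10)
        self.readout = Ridge()                        # linear output layer
        self.width = width

    def _activations(self, z):
        # Gaussian response of each sample to each centre
        d = np.linalg.norm(z[:, None, :] - self.centres[None, :, :], axis=2)
        return np.exp(-(d ** 2) / (2 * self.width ** 2))

    def fit(self, images, scores):
        # images: (n_samples, n_pixels) flattened arrays; scores: similarity targets
        z = self.pca.fit_transform(images)
        self.centres = self.kmeans.fit(z).cluster_centers_
        self.readout.fit(self._activations(z), scores)
        return self

    def predict(self, images):
        return self.readout.predict(self._activations(self.pca.transform(images)))
```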

One last thing; whatever approach you go for - could you blog about it, publish the demo, whatever; let us know how you got on.
