Algorithm to compare two images
Given two different image files (in whatever format I choose), I need to write a program to predict the chance that one is an illegal copy of the other. The author of the copy may do things like rotating it, making a negative, or adding trivial details (as well as changing the dimensions of the image).
Do you know any algorithm to do this kind of job?
Read the paper: Porikli, Fatih, Oncel Tuzel, and Peter Meer. "Covariance Tracking Using Model Update Based on Means on Riemannian Manifolds." (2006) IEEE Computer Vision and Pattern Recognition.
I was successfully able to detect overlapping regions in images captured from adjacent webcams using the technique presented in this paper. My covariance matrix was composed of Sobel, Canny, and SUSAN aspect/edge detection outputs, as well as the original greyscale pixels.
In the form you describe, the problem is tough. Do you consider copying and pasting part of one image into another, larger image to be a copy? And so on.
What we loosely refer to as duplicates can be difficult for algorithms to discern.
Your duplicates can be either:
Nos. 1 and 2 are easier to solve. No. 3 is very subjective and still a research topic.
I can offer a solution for Nos. 1 and 2.
Both solutions use the excellent image hashing library: https://github.com/JohannesBuchner/imagehash
Exact duplicates can be found using a perceptual hashing measure.
The phash library is quite good at this. I routinely use it to clean training data.
Usage (from the GitHub site) is as simple as:
In this case you will have to set a threshold and compare the hash values by their distance from each other. This has to be done by trial and error for your image content.
If you take a step back, this is easier to solve if you watermark the master images.
You will need to use a watermarking scheme to embed a code into the image. As opposed to some of the low-level approaches (edge detection, etc.) suggested by others, a watermarking method is superior because:
It is resistant to signal-processing attacks:
► Signal enhancement – sharpening, contrast, etc.
► Filtering – median, low pass, high pass, etc.
► Additive noise – Gaussian, uniform, etc.
► Lossy compression – JPEG, MPEG, etc.
It is resistant to geometric attacks:
► Affine transforms
► Data reduction – cropping, clipping, etc.
► Random local distortions
► Warping
Do some research on watermarking algorithms and you will be on the right path to solving your problem.
Note: You can benchmark your method using the STIRMARK dataset. It is an accepted standard for this type of application.
It is indeed much less simple than it seems :-) Nick's suggestion is a good one.
To get started, keep in mind that any worthwhile comparison method will essentially work by converting the images into a different form -- a form which makes it easier to pick similar features out. Usually, this stuff doesn't make for very light reading ...
One of the simplest examples I can think of is simply using the color space of each image. If two images have highly similar color distributions, then you can be reasonably sure that they show the same thing. At least, you can have enough certainty to flag it, or do more testing. Comparing images in color space will also resist things such as rotation, scaling, and some cropping. It won't, of course, resist heavy modification of the image or heavy recoloring (and even a simple hue shift will be somewhat tricky).
http://en.wikipedia.org/wiki/RGB_color_space
http://upvector.com/index.php?section=tutorials&subsection=tutorials/colorspace
Another example involves something called the Hough Transform. This transform essentially decomposes an image into a set of lines. You can then take some of the 'strongest' lines in each image and see if they line up. You can do some extra work to try and compensate for rotation and scaling too -- and in this case, since comparing a few lines is MUCH less computational work than doing the same to entire images -- it won't be so bad.
http://homepages.inf.ed.ac.uk/amos/hough.html
http://rkb.home.cern.ch/rkb/AN16pp/node122.html
http://en.wikipedia.org/wiki/Hough_transform
These are simply ideas I've had thinking about the problem, never tried it but I like thinking about problems like this!
Before you begin
Consider normalising the pictures, if one is a higher resolution than the other, consider the option that one of them is a compressed version of the other, therefore scaling the resolution down might provide more accurate results.
Consider scanning various prospective areas of the image that could represent zoomed portions of it, at various positions and rotations. It starts getting tricky if one of the images is a skewed version of another; these are the sort of limitations you should identify and compromise on.
Matlab is an excellent tool for testing and evaluating images.
Testing the algorithms
You should test (at the minimum) a large human analysed set of test data where matches are known beforehand. If for example in your test data you have 1,000 images where 5% of them match, you now have a reasonably reliable benchmark. An algorithm that finds 10% positives is not as good as one that finds 4% of positives in our test data. However, one algorithm may find all the matches, but also have a large 20% false positive rate, so there are several ways to rate your algorithms.
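To make that rating concrete, here is a minimal sketch (all numbers invented) of scoring one algorithm's output against a hand-labelled benchmark by recall and false-positive rate:

```python
def rate(predicted_pairs, true_pairs, all_pairs):
    """Score a duplicate detector against hand-labelled test data."""
    tp = len(predicted_pairs & true_pairs)             # real matches found
    fp = len(predicted_pairs - true_pairs)             # false alarms
    recall = tp / len(true_pairs)
    false_positive_rate = fp / (len(all_pairs) - len(true_pairs))
    return recall, false_positive_rate

# Toy benchmark: 1,000 candidate pairs, 50 of them (5%) known matches.
all_pairs = {(i, i + 1) for i in range(1000)}
true_pairs = set(list(all_pairs)[:50])
# An invented detector: finds 40 real matches plus 10 false alarms.
predicted = set(list(true_pairs)[:40]) | set(list(all_pairs - true_pairs)[:10])
recall, fpr = rate(predicted, true_pairs, all_pairs)
print(recall, fpr)   # 0.8 recall, roughly 1% false-positive rate
```

Tracking both numbers is what lets you say that an algorithm finding every match at a 20% false-positive rate may still be worse than a pickier one.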
The test data should be designed to cover as many of the types of variation you would expect to find in the real world as possible.
It is important to note that, to be useful, each algorithm must perform better than random guessing; otherwise it is useless to us!
You can then apply your software to the real world in a controlled way and start to analyse the results it produces. This is the sort of software project that can go on ad infinitum; there are always tweaks and improvements you can make. It is important to bear that in mind when designing it, as it is easy to fall into the trap of the never-ending project.
Colour Buckets
With two pictures, scan each pixel and count the colours. For example you might have the 'buckets':
(Obviously you would have a higher resolution of counters.) Every time you find a 'red' pixel, you increment the red counter. Each bucket can be representative of a spectrum of colours: the higher the resolution, the more accurate, but you should experiment with an acceptable difference rate.
Once you have your totals, compare it to the totals for a second image. You might find that each image has a fairly unique footprint, enough to identify matches.
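A sketch of the bucket-counting idea, assuming images are already decoded to flat arrays of RGB pixels (NumPy is used for brevity; the bucket size and the L1 comparison of the footprints are arbitrary choices):

```python
import numpy as np

def colour_buckets(pixels, buckets_per_channel=4):
    """Coarse RGB histogram: 256 values per channel -> 4 'buckets' of 64 each."""
    idx = pixels // (256 // buckets_per_channel)
    hist = np.zeros((buckets_per_channel,) * 3, dtype=np.int64)
    np.add.at(hist, tuple(idx.T), 1)
    return hist / len(pixels)          # normalise so image size doesn't matter

def bucket_distance(a, b):
    """L1 distance between two colour footprints (0 = identical)."""
    return np.abs(colour_buckets(a) - colour_buckets(b)).sum()

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (10000, 3), dtype=np.uint8)   # 10,000 random RGB pixels
print(bucket_distance(img, img))   # 0.0: identical footprint
```

Two unrelated images will usually have a clearly larger distance, which is the "fairly unique footprint" mentioned above.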
Edge detection
How about using edge detection?

(source: wikimedia.org)
With two similar pictures edge detection should provide you with a usable and fairly reliable unique footprint.
Take both pictures, and apply edge detection. Maybe measure the average thickness of the edges and then calculate the probability the image could be scaled, and rescale if necessary. Below is an example of an applied Gabor Filter (a type of edge detection) in various rotations.
Compare the pictures pixel by pixel, counting the matches and the non-matches. If they are within a certain error threshold, you have a match. Otherwise, you could try reducing the resolution up to a certain point and see if the probability of a match improves.
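A rough sketch of the compare-the-edge-maps step; a simple gradient-magnitude detector stands in here for a proper Sobel or Gabor filter, and the threshold is an arbitrary assumption:

```python
import numpy as np

def edge_map(gray, threshold=100):
    """Very crude gradient-magnitude edge detector on a 2-D greyscale array
    (a stand-in for a real Sobel or Gabor filter)."""
    g = gray.astype(float)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:-1] = g[:, 2:] - g[:, :-2]          # horizontal gradient
    gy[1:-1, :] = g[2:, :] - g[:-2, :]          # vertical gradient
    return np.hypot(gx, gy) > threshold

def edge_similarity(a, b):
    """Fraction of pixels whose edge/non-edge classification agrees."""
    return (edge_map(a) == edge_map(b)).mean()

img = np.zeros((32, 32))
img[:, 16:] = 255                                # a single vertical edge
print(edge_similarity(img, img))                 # 1.0: identical edge footprints
```

You would then accept a pair as a match when the similarity clears an error threshold tuned on your test data.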
Regions of Interest
Some images may have distinctive segments/regions of interest. These regions probably contrast highly with the rest of the image, and are a good item to search for in your other images to find matches. Take this image for example:
(source: meetthegimp.org)
The construction worker in blue is a region of interest and can be used as a search object. There are probably several ways you could extract properties/data from this region of interest and use them to search your data set.
If you have more than 2 regions of interest, you can measure the distances between them. Take this simplified example:
(source: per2000.eu)
We have 3 clear regions of interest. The distance between region 1 and 2 may be 200 pixels, between 1 and 3 400 pixels, and 2 and 3 200 pixels.
Search other images for similar regions of interest, normalise the distance values and see if you have potential matches. This technique could work well for rotated and scaled images. The more regions of interest you have, the more the probability of a match increases as each distance measurement matches.
It is important to think about the context of your data set. If for example your data set is modern art, then regions of interest would work quite well, as regions of interest were probably designed to be a fundamental part of the final image. If however you are dealing with images of construction sites, regions of interest may be interpreted by the illegal copier as ugly and may be cropped/edited out liberally. Keep in mind common features of your dataset, and attempt to exploit that knowledge.
Morphing
Morphing two images is the process of turning one image into the other through a set of steps:
Note, this is different to fading one image into another!
There are many software packages that can morph images. It's traditionally used as a transitional effect: two images don't usually morph into something halfway; one extreme morphs into the other extreme as the final result.
Why could this be useful? Depending on the morphing algorithm you use, there may be a relationship between the similarity of the images and some parameters of the morphing algorithm.
In a grossly oversimplified example, one algorithm might execute faster when there are fewer changes to be made. We then know there is a higher probability that the two images share properties with each other.
This technique could work well for rotated, distorted, skewed, and zoomed images, i.e. all types of copies. Again, this is just an idea I have had; as far as I am aware it's not based on any academic research (though I haven't looked hard), so it may be a lot of work for you with limited/no results.
Zipping
Ow's answer to this question is excellent; I remember reading about these sorts of techniques when studying AI. It is quite effective at comparing corpus lexicons.
One interesting optimisation when comparing corpora is that you can remove words considered too common, for example 'the', 'a', 'and', etc. These words dilute the result; we want to work out how different the two corpora are, so they can be removed before processing. Perhaps there are similar common signals in images that could be stripped before compression? It might be worth looking into.
Compression ratio is a very quick and reasonably effective way of determining how similar two sets of data are. Reading up about how compression works will give you a good idea why this could be so effective. For a fast to release algorithm this would probably be a good starting point.
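One common way to turn compression into a similarity score is the normalised compression distance: if compressing the two inputs together costs barely more than compressing one alone, they share structure. A sketch with zlib (the roughly 0-to-1 scale is approximate, not exact):

```python
import zlib

def ncd(a: bytes, b: bytes) -> float:
    """Normalised compression distance: near 0 for very similar data,
    near 1 for unrelated data."""
    ca, cb = len(zlib.compress(a)), len(zlib.compress(b))
    cab = len(zlib.compress(a + b))              # joint compression
    return (cab - min(ca, cb)) / max(ca, cb)

data = bytes(range(256)) * 64                    # a repetitive stand-in "bitmap"
print(ncd(data, data) < ncd(data, bytes(4096)))  # True: a duplicate compresses jointly
```

For images you would feed in the raw decoded pixel buffers, not the already-compressed files, or the file format's own compression will dominate the measurement.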
Transparency
Again, I am unsure how transparency data is stored for certain image types (GIF, PNG, etc.), but it will be extractable and would serve as an effective simplified cut-out to compare with your data set's transparency.
Inverting Signals
An image is just a signal. If you play a noise from a speaker, and you play the opposite noise in another speaker in perfect sync at the exact same volume, they cancel each other out.
(source: themotorreport.com.au)
Invert one of the images, and add it onto the other. Repeatedly adjust the scale and position until you find a resulting image where enough of the pixels are white (or black? I'll refer to it as a neutral canvas) to give you a positive or partial match.
However, consider two images that are equal, except one of them has a brighten effect applied to it:
(source: mcburrz.com)
Inverting one of them, then adding it to the other, will not result in a neutral canvas, which is what we are aiming for. However, when comparing the pixels from both original images, we can definitely see a clear relationship between the two.
I haven't studied colour for some years now, and am unsure whether the colour spectrum is on a linear scale, but if you determined the average factor of colour difference between the two pictures, you could use this value to normalise the data before processing with this technique.
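A sketch of the invert-and-add idea that also builds in the normalisation just suggested: instead of requiring the summed image to be pure white, measure its deviation from its own mean, so a uniform brightening cancels out too (NumPy arrays stand in for decoded greyscale images):

```python
import numpy as np

def cancel_score(a, b):
    """Add image a to the inverse of image b; a perfectly neutral (uniform)
    result means a match.  Returns mean absolute deviation from neutral."""
    residue = a.astype(float) + (255.0 - b.astype(float))   # invert b, then add
    return np.abs(residue - residue.mean()).mean()          # 0.0 => fully neutral

rng = np.random.default_rng(1)
img = rng.integers(0, 200, (64, 64))
print(cancel_score(img, img))        # 0.0: identical images cancel exactly
print(cancel_score(img, img + 40))   # 0.0: a uniform brighten also cancels
```

An unrelated image leaves a non-uniform residue and therefore a positive score.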
Tree Data structures
At first these don't seem to fit for the problem, but I think they could work.
You could think about extracting certain properties of an image (for example colour bins) and generating a Huffman tree or similar data structure. You might be able to compare two trees for similarity. This wouldn't work well for photographic data with a large spectrum of colour, but for cartoons or other reduced-colour-set images it might work.
This probably wouldn't work, but it's an idea. The trie data structure is great at storing lexicons, for example a dictionary; it's a prefix tree. Perhaps it's possible to build an image equivalent of a lexicon (again I can only think of colours) to construct a trie. If you reduced, say, a 300x300 image into 5x5 squares, then decomposed each 5x5 square into a sequence of colours, you could construct a trie from the resulting data. If a 2x2 square contains:
We have a fairly unique trie code that extends 24 levels; increasing/decreasing the levels (i.e. reducing/increasing the size of our sub-square) may yield more accurate results.
Comparing trie trees should be reasonably easy, and could possibly provide effective results.
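A sketch of the trie idea: each reduced square becomes a sequence of colour names inserted into a prefix tree, and two images are compared by counting shared sequences. The colour names and 2x2 squares here are invented for illustration (a real version would use the 5x5 reduction described above):

```python
def build_trie(squares):
    """Insert each square's colour sequence into a prefix tree (nested dicts)."""
    root = {}
    for seq in squares:
        node = root
        for colour in seq:
            node = node.setdefault(colour, {})
        node["end"] = True      # marks a complete sequence (assumes no colour named "end")
    return root

def shared_squares(trie, squares):
    """Count how many of these squares' colour sequences already exist in the trie."""
    hits = 0
    for seq in squares:
        node = trie
        for colour in seq:
            if colour not in node:
                break
            node = node[colour]
        else:
            hits += node.get("end", False)
    return hits

image_a = [("red", "red", "blue", "blue"), ("white", "white", "white", "white")]
image_b = [("red", "red", "blue", "blue"), ("green", "white", "white", "white")]
trie = build_trie(image_a)
print(shared_squares(trie, image_b))   # 1: one 2x2 square in common
```

The ratio of shared squares to total squares would then serve as a similarity score between the two images.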
More ideas
I stumbled across an interesting paper brief about the classification of satellite imagery, which outlines:
It may be worth investigating those measurements in more detail, although some of them may not be relevant to your data set.
Other things to consider
There are probably a lot of papers on this sort of thing, so reading some of them should help, although they can be very technical. It is an extremely difficult area in computing, with many fruitless hours of work spent by many people attempting to do similar things. Keeping it simple and building upon these ideas would be the best way to go. It should be a reasonably difficult challenge to create an algorithm with a better-than-random match rate, and improving on that really does start to get quite hard to achieve.
Each method would probably need to be tested and tweaked thoroughly. If you have any information about the type of picture you will be checking, this would be useful. For example, many advertisements have text in them, so doing text recognition would be an easy and probably very reliable way of finding matches, especially when combined with other solutions. As mentioned earlier, attempt to exploit the common properties of your data set.
Combining alternative measurements and techniques, each of which can have a weighted vote (depending on its effectiveness), would be one way to create a system that generates more accurate results.
If employing multiple algorithms, then, as mentioned at the beginning of this answer, one may find all the positives but have a false positive rate of 20%. It would be of interest to study the properties/strengths/weaknesses of the other algorithms, as one algorithm may be effective in eliminating the false positives returned by another.
Be careful to not fall into attempting to complete the never ending project, good luck!
I believe if you're willing to apply the approach to every possible orientation and to negative versions, a good start to image recognition (with good reliability) is to use eigenfaces: http://en.wikipedia.org/wiki/Eigenface
Another idea would be to transform both images into vectors of their components. A good way to do this is to create a vector that operates in x*y dimensions (x being the width of your image and y being the height), with the value for each dimension applying to the (x,y) pixel value. Then run a variant of K-Nearest Neighbours with two categories: match and no match. If it's sufficiently close to the original image it will fit in the match category, if not then it won't.
K Nearest Neighbours(KNN) can be found here, there are other good explanations of it on the web too: http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm
The benefit of KNN is that the more variants you're comparing to the original image, the more accurate the algorithm becomes. The downside is that you need a catalogue of images to train the system first.
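A minimal sketch of the pixel-vector KNN idea with NumPy; the reference set, noise levels, and labels are all invented, and a real system would train on a catalogue of known copies and non-copies:

```python
import numpy as np

def knn_match(candidate, references, labels, k=3):
    """Classify a flattened image by majority vote of its k nearest references."""
    dists = np.linalg.norm(references - candidate, axis=1)   # Euclidean in pixel space
    nearest = labels[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()                     # 1 = match, 0 = no match

rng = np.random.default_rng(2)
original = rng.integers(0, 256, 64).astype(float)        # an 8x8 image, flattened
matches = original + rng.normal(0, 5, (5, 64))           # slightly altered copies
others = rng.integers(0, 256, (5, 64)).astype(float)     # unrelated images
refs = np.vstack([matches, others])
labels = np.array([1] * 5 + [0] * 5)
query = original + rng.normal(0, 5, 64)                  # a new altered copy to classify
print(knn_match(query, refs, labels))                    # 1: classified as a match
```

Raw pixel distance is fragile under rotation and scaling, so in practice you would first normalise the images or swap the pixel vector for a more invariant feature vector.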
If you're running Linux I would suggest two tools:
align_image_stack from package hugin-tools - is a commandline program that can automatically correct rotation, scaling, and other distortions (it's mostly intended for compositing HDR photography, but works for video frames and other documents too). More information: http://hugin.sourceforge.net/docs/manual/Align_image_stack.html
compare from the imagemagick package - a program that can find and count the number of differing pixels in two images. Here's a neat tutorial: http://www.imagemagick.org/Usage/compare/ Using -fuzz N% you can increase the error tolerance: the higher the N, the higher the error tolerance for still counting two pixels as the same.
align_image_stack should correct any offset so the compare command will actually have a chance of detecting same pixels.
If you're willing to consider a different approach altogether to detecting illegal copies of your images, you could consider watermarking. (from 1.4)
While it's also a complex field, there are techniques that allow the watermark information to persist through gross image alteration: (from 1.9)
Of course, the FAQ calls implementing this approach "...very challenging", but if you succeed with it, you get high confidence in whether the image is a copy or not, rather than a percentage likelihood.
This is just a suggestion, it might not work and I'm prepared to be called on this.
This will generate false positives, but hopefully not false negatives.
Resize both of the images so that they are the same size (I assume that the ratios of widths to lengths are the same in both images).
Compress a bitmap of both images with a lossless compression algorithm (e.g. gzip).
Find pairs of files that have similar file sizes. For instance, you could just sort every pair of files you have by how similar the file sizes are and retrieve the top X.
As I said, this will definitely generate false positives, but hopefully not false negatives. You can implement this in five minutes, whereas the Porikli et al. approach would probably require extensive work.
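The steps above can be sketched as follows, with an invented three-image "library" of equally sized raw bitmaps (zlib stands in for gzip; both are DEFLATE-based):

```python
import zlib

def compressed_size(pixels: bytes) -> int:
    """Size of the losslessly compressed bitmap: a crude complexity fingerprint."""
    return len(zlib.compress(pixels))

# Hypothetical library of same-sized, already-resized bitmaps, keyed by name.
library = {
    "flat":  bytes(1024),              # trivially compressible
    "flat2": bytes([1]) * 1024,        # same complexity as "flat"
    "noisy": bytes(range(256)) * 4,    # more structure, compresses worse
}
sizes = {name: compressed_size(px) for name, px in library.items()}
pairs = sorted(
    ((a, b) for a in sizes for b in sizes if a < b),
    key=lambda p: abs(sizes[p[0]] - sizes[p[1]]),
)
print(pairs[0])   # ('flat', 'flat2'): the pair with the most similar compressed sizes
```

The top-ranked pairs are the candidates you would pass on to a slower, more accurate comparison.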
An idea:
Step 2 is not trivial. In particular, you may need to use a smart algorithm to find the most similar keypoint in the other image. Point descriptors are usually very high-dimensional (a hundred or so parameters), and there are many points to look through. kd-trees may be useful here; hash lookups don't work well.
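A sketch of that descriptor-matching step on synthetic 128-dimensional descriptors (dimensions and noise levels invented), using a Lowe-style ratio test to keep only unambiguous matches. Brute force is shown for clarity; as noted, a kd-tree over the second image's descriptors is the usual speed-up:

```python
import numpy as np

def match_keypoints(desc_a, desc_b, ratio=0.8):
    """For each descriptor in image A, find the most similar descriptor in image B,
    keeping a match only when the best candidate clearly beats the runner-up."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to every B descriptor
        best, second = np.argsort(dists)[:2]
        if dists[best] < ratio * dists[second]:      # Lowe-style ratio test
            matches.append((i, int(best)))
    return matches

rng = np.random.default_rng(3)
desc_b = rng.normal(size=(50, 128))                    # 128-D descriptors, SIFT-like
desc_a = desc_b[:10] + rng.normal(0, 0.01, (10, 128))  # 10 keypoints survive the copy
print(match_keypoints(desc_a, desc_b))                 # [(0, 0), (1, 1), ..., (9, 9)]
```

A large set of geometrically consistent matches between two images is strong evidence that one is a transformed copy of the other.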
Variants: