Algorithm to detect corners of paper sheet in photo

Posted on 2024-11-18 12:59:08

What is the best way to detect the corners of an invoice/receipt/sheet-of-paper in a photo? This is to be used for subsequent perspective correction, before OCR.

My current approach has been:

RGB > Gray > Canny edge detection with thresholding > Dilate(1) > Remove small objects(6) > Clear border objects > Pick largest blob based on convex area > [corner detection - not implemented]

I can't help but think there must be a more robust 'intelligent'/statistical approach to handle this type of segmentation. I don't have a lot of training examples, but I could probably get 100 images together.

Broader context:

I'm using MATLAB to prototype, and planning to implement the system in OpenCV and Tesseract-OCR. This is the first of a number of image processing problems I need to solve for this specific application, so I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.

Here are some sample images that I'd like the algorithm to handle. If you'd like to take up the challenge, the large images are at http://madteckhead.com/tmp

case 1
(source: madteckhead.com)

case 2
(source: madteckhead.com)

case 3
(source: madteckhead.com)

case 4
(source: madteckhead.com)

In the best case this gives:

case 1 - canny
(source: madteckhead.com)

case 1 - post canny
(source: madteckhead.com)

case 1 - largest blob
(source: madteckhead.com)

However it fails easily on other cases:

case 2 - canny
(source: madteckhead.com)

case 2 - post canny
(source: madteckhead.com)

case 2 - largest blob
(source: madteckhead.com)

EDIT: Hough Transform Progress

Q: What algorithm would cluster the Hough lines to find corners?
Following advice from the answers, I was able to use the Hough transform, pick lines, and filter them. My current approach is rather crude. I've made the assumption that the invoice will always be less than 15 degrees out of alignment with the image. If that holds, I end up with reasonable results for the lines (see below). But I am not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate the corners. The Hough lines are not continuous, and in the noisy images there can be parallel lines, so some form of distance-from-line-origin metric is required. Any ideas?

case 1
case 2
case 3
case 4
(source: madteckhead.com)
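The 15-degree filter described in the edit can be sketched roughly as follows. This is only an illustration, not the asker's MATLAB code: the function name `split_by_orientation` is made up, and it assumes the lines arrive as (rho, theta) pairs with theta in radians in [0, pi), where theta is the angle of the line's normal (so theta near 0 means a near-vertical line).

```python
import math

def split_by_orientation(lines, max_dev_deg=15.0):
    """Split (rho, theta) lines into near-vertical and near-horizontal sets.

    Assumes theta is the normal angle in [0, pi); lines that deviate more
    than max_dev_deg from both image axes are dropped.
    """
    max_dev = math.radians(max_dev_deg)
    vertical, horizontal = [], []
    for rho, theta in lines:
        if theta <= max_dev or math.pi - theta <= max_dev:
            vertical.append((rho, theta))      # normal roughly along x
        elif abs(theta - math.pi / 2) <= max_dev:
            horizontal.append((rho, theta))    # normal roughly along y
    return vertical, horizontal
```

Splitting into two orientation sets up front makes the later steps simpler, since each document edge can only pair with lines from the opposite set.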

Comments (8)

弥繁 2024-11-25 12:59:08

I'm Martin's friend who was working on this earlier this year. This was my first ever coding project, and kinda ended in a bit of a rush, so the code needs some errr...decoding...
I'll give a few tips from what I've seen you doing already, and then sort my code on my day off tomorrow.

First tip: OpenCV and Python are awesome, move to them as soon as possible. :D

Instead of removing small objects and/or noise, lower the Canny thresholds so it accepts more edges, and then find the largest closed contour (in OpenCV use findContours() with some simple parameters; I think I used CV_RETR_LIST). It might still struggle when it's on a white piece of paper, but it was definitely providing the best results.

For the HoughLines2() transform, try CV_HOUGH_STANDARD as opposed to CV_HOUGH_PROBABILISTIC. It gives rho and theta values, defining each line in polar coordinates, and you can then group the lines within a certain tolerance of those.

My grouping worked as a lookup table. Each line output by the Hough transform gives a rho and theta pair. If these values were within, say, 5% of a pair of values already in the table, they were discarded; if they were outside that 5%, a new entry was added to the table.

You can then do analysis of parallel lines or distance between lines much more easily.
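A minimal sketch of that lookup-table grouping, in plain Python. The answer does not say what the 5% is measured against, so this version assumes it is a fraction of a caller-supplied rho scale (for example the image diagonal) and of pi for theta. The names `group_lines` and `rho_span` are illustrative only, and the wrap-around case where (rho, theta) and (-rho, theta - pi) describe the same line is not handled.

```python
import math

def group_lines(lines, rho_span, tol=0.05):
    """Keep one representative (rho, theta) per group of near-duplicate lines.

    lines    -- iterable of (rho, theta) pairs, theta in radians
    rho_span -- scale for comparing rho values, e.g. the image diagonal
                (an assumption, the answer does not specify this)
    tol      -- fraction of the scale treated as "the same line"
    """
    table = []
    for rho, theta in lines:
        for t_rho, t_theta in table:
            if (abs(rho - t_rho) <= tol * rho_span
                    and abs(theta - t_theta) <= tol * math.pi):
                break  # close to an existing entry: discard
        else:
            table.append((rho, theta))  # nothing similar yet: new entry
    return table
```

Because the first line of each group is kept as-is, the result depends on input order; averaging the members of each group would be a possible refinement.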

Hope this helps.

后来的我们 2024-11-25 12:59:08

Here's what I came up with after a bit of experimentation:

import sys

import cv2
import numpy as np


def get_new(old):
    # blank white canvas with the same shape as the input
    return np.full(old.shape, 255, np.uint8)


if __name__ == '__main__':
    orig = cv2.imread(sys.argv[1])

    # these constants are carefully picked
    MORPH = 9
    CANNY = 84
    HOUGH = 25

    img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
    cv2.GaussianBlur(img, (3,3), 0, img)

    # this is to recognize white on white
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
    dilated = cv2.dilate(img, kernel)

    edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)

    # draw the detected segments back onto the edge map to close gaps
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, HOUGH)
    if lines is not None:
        # reshape copes with both the (1, N, 4) and (N, 1, 4) result layouts
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            cv2.line(edges, (int(x1), int(y1)), (int(x2), int(y2)),
                     (255,0,0), 2, 8)

    # finding contours ([-2] works whether findContours returns 2 or 3 values)
    contours = cv2.findContours(edges.copy(), cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_TC89_KCOS)[-2]
    contours = [c for c in contours if cv2.arcLength(c, False) > 100]
    contours = [c for c in contours if cv2.contourArea(c) > 10000]

    # simplify contours down to polygons
    rects = []
    for cont in contours:
        rect = cv2.approxPolyDP(cont, 40, True).copy().reshape(-1, 2)
        rects.append(rect)

    # that's basically it
    cv2.drawContours(orig, rects,-1,(0,255,0),1)

    # show only contours
    new = get_new(img)
    cv2.drawContours(new, rects,-1,(0,255,0),1)
    cv2.GaussianBlur(new, (9,9), 0, new)
    new = cv2.Canny(new, 0, CANNY, apertureSize=3)

    cv2.namedWindow('result', cv2.WINDOW_NORMAL)
    cv2.imshow('result', orig)
    cv2.waitKey(0)
    cv2.imshow('result', dilated)
    cv2.waitKey(0)
    cv2.imshow('result', edges)
    cv2.waitKey(0)
    cv2.imshow('result', new)
    cv2.waitKey(0)

    cv2.destroyAllWindows()

Not perfect, but at least works for all samples:

1
2
3
4

秋叶绚丽 2024-11-25 12:59:08

A student group at my university recently demonstrated an iPhone app (and a Python OpenCV app) that they'd written to do exactly this. As I remember, the steps were something like this:

  • Median filter to completely remove the text on the paper (this was handwritten text on white paper with fairly good lighting, where it worked very well; it may not work with printed text). The reason was that it makes the corner detection much easier.
  • Hough Transform for lines
  • Find the peaks in the Hough Transform accumulator space and draw each line across the entire image.
  • Analyse the lines and remove any that are very close to each other and are at a similar angle (cluster the lines into one). This is necessary because the Hough Transform isn't perfect as it's working in a discrete sample space.
  • Find pairs of lines that are roughly parallel and that intersect other pairs to see which lines form quads.
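The last step, turning two pairs of roughly parallel lines into the four corners of a quad, might look something like this. It is a sketch under assumptions rather than the students' code: lines are (rho, theta) pairs in the standard Hough normal form x*cos(theta) + y*sin(theta) = rho, and the helper names `intersect` and `quad_corners` are invented for the example.

```python
import math

def intersect(line_a, line_b, eps=1e-9):
    """Intersection of two (rho, theta) lines, or None if they are parallel."""
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # Solve the 2x2 system x*cos(t) + y*sin(t) = rho with Cramer's rule.
    det = (math.cos(theta_a) * math.sin(theta_b)
           - math.sin(theta_a) * math.cos(theta_b))
    if abs(det) < eps:
        return None
    x = (rho_a * math.sin(theta_b) - rho_b * math.sin(theta_a)) / det
    y = (rho_b * math.cos(theta_a) - rho_a * math.cos(theta_b)) / det
    return x, y

def quad_corners(pair_a, pair_b):
    """Corners of the quad bounded by two pairs of lines, walking around it."""
    a0, a1 = pair_a
    b0, b1 = pair_b
    corners = [intersect(a0, b0), intersect(a0, b1),
               intersect(a1, b1), intersect(a1, b0)]
    return None if any(c is None for c in corners) else corners
```

For example, the vertical lines x = 10 and x = 110 combined with the horizontal lines y = 20 and y = 220 give the corners of a 100 by 200 rectangle. Candidate quads could then be ranked by area or by how much edge support they have, which the answer leaves open.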

This seemed to work fairly well and they were able to take a photo of a piece of paper or book, perform the corner detection and then map the document in the image onto a flat plane in almost realtime (there was a single OpenCV function to perform the mapping). There was no OCR when I saw it working.

聊慰 2024-11-25 12:59:08

Instead of starting from edge detection, you could use corner detection.

Marvin Framework provides an implementation of the Moravec algorithm for this purpose. You could find the corners of the paper as a starting point. Below is the output of Moravec's algorithm:

enter image description here
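For readers without Marvin at hand, the idea behind Moravec's detector can be sketched in plain Python. This is a simplified illustration and not Marvin's implementation: it scores each pixel by the smallest sum of squared differences between a small window and the same window shifted by one pixel in each of eight directions, so only pixels that change in every direction (corners) score high. The function name and the list-of-lists image format are assumptions made for the example.

```python
def moravec(image, radius=1):
    """Return a score map; high values mark corner-like pixels.

    image  -- 2D list of grey levels (rows of equal length)
    radius -- half-size of the comparison window
    """
    height, width = len(image), len(image[0])
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    scores = [[0] * width for _ in range(height)]
    margin = radius + 1  # room for the window plus a one-pixel shift
    for y in range(margin, height - margin):
        for x in range(margin, width - margin):
            best = None
            for dy, dx in shifts:
                ssd = 0
                for wy in range(-radius, radius + 1):
                    for wx in range(-radius, radius + 1):
                        diff = (image[y + wy + dy][x + wx + dx]
                                - image[y + wy][x + wx])
                        ssd += diff * diff
                best = ssd if best is None else min(best, ssd)
            scores[y][x] = best  # minimum over all shift directions
    return scores
```

Flat regions and straight edges both have at least one shift direction with no change, so their score is zero. In practice the map would be thresholded and followed by non-maximum suppression.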

趁微风不噪 2024-11-25 12:59:08

You can also use MSER (maximally stable extremal regions) over the Sobel operator result to find the stable regions of the image. For each region returned by MSER you can apply a convex hull and polygon approximation to obtain something like this:

But this kind of detection is more useful for live detection than for a single picture, which does not always return the best result.

result

煞人兵器 2024-11-25 12:59:08

After edge detection, use the Hough transform.
Then put those points into an SVM (support vector machine) with their labels. If the examples have smooth lines on them, the SVM will have no difficulty separating the necessary parts of the example from the other parts. My advice on the SVM: include features such as connectivity and length. That is, if points are connected and form a long run, they are likely to be a line of the receipt. Then you can eliminate all of the other points.

笑咖 2024-11-25 12:59:08

Here is @Vanuan's code ported to C++:

// `mat` is the input BGR image (cv::Mat), loaded elsewhere.
// Requires <vector> and <opencv2/imgproc.hpp>.
cv::cvtColor(mat, mat, cv::COLOR_BGR2GRAY);
cv::GaussianBlur(mat, mat, cv::Size(3,3), 0);
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(9,9));
cv::Mat dilated;
cv::dilate(mat, dilated, kernel);

cv::Mat edges;
cv::Canny(dilated, edges, 0, 84, 3);  // thresholds 0 and 84, aperture size 3

// draw the detected segments back onto the edge map to close gaps
std::vector<cv::Vec4i> lines;
cv::HoughLinesP(edges, lines, 1, CV_PI/180, 25);
for (size_t i = 0; i < lines.size(); i++) {
    const cv::Vec4i& l = lines[i];
    cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(255,0,0), 2, 8);
}

std::vector< std::vector<cv::Point> > contours;
cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_TC89_KCOS);

// keep only contours that are long enough and enclose a large area
std::vector< std::vector<cv::Point> > contoursArea;
for (size_t i = 0; i < contours.size(); i++) {
    if (cv::arcLength(contours[i], false) > 100 &&
        cv::contourArea(contours[i]) > 10000) {
        contoursArea.push_back(contours[i]);
    }
}

// simplify contours down to polygons
std::vector< std::vector<cv::Point> > contoursDraw(contoursArea.size());
for (size_t i = 0; i < contoursArea.size(); i++) {
    cv::approxPolyDP(contoursArea[i], contoursDraw[i], 40, true);
}
cv::Mat drawing = cv::Mat::zeros(mat.size(), CV_8UC3);
cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0,255,0), 1);

你是暖光i 2024-11-25 12:59:08

  1. Convert to the Lab color space

  2. Use k-means to segment the image into 2 clusters

  3. Then use contours or Hough on one of the clusters (the internal one)