
Algorithm to detect corners of a paper sheet in a photo

Posted 2024-11-18 12:59:08, 3795 characters, 6 views, 0 comments

What is the best way to detect the corners of an invoice/receipt/sheet-of-paper in a photo? This is to be used for subsequent perspective correction, before OCR.

My current approach has been:

RGB > Gray > Canny edge detection with thresholding > Dilate(1) > Remove small objects(6) > Clear border objects > Pick the largest blob based on convex area > [Corner detection: not implemented]

I can't help but think there must be a more robust 'intelligent'/statistical approach to handle this type of segmentation. I don't have a lot of training examples, but I could probably get 100 images together.

Broader context:

I'm using MATLAB to prototype, and plan to implement the system in OpenCV and Tesseract-OCR. This is the first of a number of image processing problems I need to solve for this specific application, so I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.

Here are some sample images that I'd like the algorithm to handle. If you'd like to take up the challenge, the large images are at http://madteckhead.com/tmp

(image source: madteckhead.com)

In the best case this gives:

(image source: madteckhead.com)

However it fails easily on other cases:

(image source: madteckhead.com)

EDIT: Hough Transform Progress

Q: What algorithm would cluster the Hough lines to find corners?
Following advice from the answers, I was able to use the Hough transform, pick lines, and filter them. My current approach is rather crude. I've assumed the invoice will always be less than 15 degrees out of alignment with the image. When that holds, I end up with reasonable results for lines (see below). But I am not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate the corners. The Hough lines are not continuous, and in noisy images there can be parallel lines, so some form of distance-from-line-origin metric is required. Any ideas?
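The 15-degree assumption can be written down as a small filter on Hough output. This is only a sketch: it assumes each line is a (rho, theta) pair with theta in radians in [0, pi), the convention of OpenCV's standard Hough transform, where theta is the angle of the line's normal. The function name `split_by_orientation` and its parameter are illustrative, not taken from the question.

```python
import math

def split_by_orientation(lines, max_skew_deg=15.0):
    """Split (rho, theta) lines into near-vertical and near-horizontal groups.

    theta is the angle of the line's normal: theta near 0 (or pi) means a
    vertical line, theta near pi/2 a horizontal one. Lines that fall outside
    the skew tolerance for both orientations are dropped.
    """
    tol = math.radians(max_skew_deg)
    vertical, horizontal = [], []
    for rho, theta in lines:
        from_vertical = min(theta, math.pi - theta)   # distance from 0 modulo pi
        from_horizontal = abs(theta - math.pi / 2)
        if from_vertical <= tol:
            vertical.append((rho, theta))
        elif from_horizontal <= tol:
            horizontal.append((rho, theta))
    return vertical, horizontal
```

Splitting the lines this way means later clustering only has to compare rho values within each group, which is one way to get the distance metric mentioned above.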

case 1

(image source: madteckhead.com)

弥繁 2024-11-25 12:59:08

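I'm the friend of Martin's who was working on this earlier this year. This was my first ever coding project and it ended in a bit of a rush, so the code needs some, er, decoding.
I'll give a few tips based on what I've seen you do already, and then sort out my code on my day off tomorrow.

First tip: OpenCV and Python are awesome; move to them as soon as possible. :D

Instead of removing small objects and/or noise, lower the Canny thresholds so that it accepts more edges, then find the largest closed contour (in OpenCV, use findContours() with some simple parameters; I think I used CV_RETR_LIST). It might still struggle when it is on a white piece of paper, but it definitely gave the best results.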
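Picking the largest closed contour comes down to comparing enclosed areas. A minimal sketch in plain Python, assuming each contour is a list of (x, y) vertices; with OpenCV contours, cv2.contourArea would play the role of the hypothetical `polygon_area` helper below.

```python
def polygon_area(points):
    """Absolute area of a closed polygon given as [(x, y), ...] (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]   # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def largest_contour(contours):
    """Return the contour enclosing the largest area, or None if there are none."""
    return max(contours, key=polygon_area, default=None)
```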

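For the HoughLines2() transform, try CV_HOUGH_STANDARD instead of CV_HOUGH_PROBABILISTIC. It returns rho and theta values that define each line in polar coordinates, and you can then group lines that fall within a certain tolerance of one another.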
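For reference, a (rho, theta) pair describes the line x*cos(theta) + y*sin(theta) = rho. The sketch below turns such a pair into two endpoints that can be drawn or intersected; the function name and the `length` parameter are illustrative.

```python
import math

def polar_line_endpoints(rho, theta, length=1000.0):
    """Two points on the line x*cos(theta) + y*sin(theta) = rho,
    `length` units either side of the point closest to the origin."""
    a, b = math.cos(theta), math.sin(theta)
    x0, y0 = a * rho, b * rho   # foot of the perpendicular from the origin
    dx, dy = -b, a              # unit direction along the line
    return ((x0 + length * dx, y0 + length * dy),
            (x0 - length * dx, y0 - length * dy))
```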

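My grouping worked as a lookup table. The Hough transform gives a rho and theta pair for each line it outputs. If those values were within, say, 5% of a pair already in the table, the line was discarded; if they were outside that 5%, a new entry was added to the table.

You can then analyse parallel lines, or the distance between lines, much more easily.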
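The lookup-table idea might be sketched as follows. The answer does not say how the 5% was measured, so this version assumes it is a fraction of the full range of each parameter (`max_rho`, for example the image diagonal, and pi for theta). That reading, and the names used, are assumptions rather than the author's code.

```python
import math

def group_lines(lines, max_rho, tol=0.05):
    """Keep one representative per cluster of similar (rho, theta) lines.

    A line is dropped when both its rho and its theta lie within `tol` of an
    entry already in the table, measured as a fraction of the full range
    (max_rho for rho, pi for theta). This range-based reading of "5%" is an
    assumption.
    """
    rho_tol = tol * max_rho
    theta_tol = tol * math.pi
    table = []
    for rho, theta in lines:
        for t_rho, t_theta in table:
            if abs(rho - t_rho) <= rho_tol and abs(theta - t_theta) <= theta_tol:
                break                       # close to an existing entry: discard
        else:
            table.append((rho, theta))      # no match found: new distinct line
    return table
```

Note that this simple comparison does not handle the wrap-around case where a nearly vertical line appears once with theta close to 0 and once with theta close to pi and a negative rho.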

Hope this helps.


后来的我们 2024-11-25 12:59:08

Here's what I came up with after a bit of experimentation:

import sys

import cv2
import numpy as np


def get_new(old):
    # blank white canvas with the same shape as `old`
    return np.full(old.shape, 255, np.uint8)


if __name__ == '__main__':
    orig = cv2.imread(sys.argv[1])
    if orig is None:
        sys.exit('could not read image: ' + sys.argv[1])

    # these constants are carefully picked
    MORPH = 9
    CANNY = 84
    HOUGH = 25

    img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
    img = cv2.GaussianBlur(img, (3, 3), 0)

    # this is to recognize white on white
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (MORPH, MORPH))
    dilated = cv2.dilate(img, kernel)

    edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)

    # draw the detected segments back onto the edge map to close gaps;
    # HoughLinesP returns None when nothing is found, and the array layout
    # differs between OpenCV versions, hence the reshape
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, HOUGH)
    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            cv2.line(edges, (int(x1), int(y1)), (int(x2), int(y2)), 255, 2, 8)

    # finding contours ([-2] picks the contour list whether findContours
    # returns two or three values)
    contours = cv2.findContours(edges.copy(), cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_TC89_KCOS)[-2]
    contours = [cont for cont in contours
                if cv2.arcLength(cont, False) > 100
                and cv2.contourArea(cont) > 10000]

    # simplify contours down to polygons
    rects = []
    for cont in contours:
        rect = cv2.approxPolyDP(cont, 40, True).reshape(-1, 2)
        rects.append(rect)

    # that's basically it
    cv2.drawContours(orig, rects, -1, (0, 255, 0), 1)

    # show only contours
    new = get_new(img)
    cv2.drawContours(new, rects, -1, (0, 255, 0), 1)
    new = cv2.GaussianBlur(new, (9, 9), 0)
    new = cv2.Canny(new, 0, CANNY, apertureSize=3)

    # step through the intermediate results, one key press each
    cv2.namedWindow('result', cv2.WINDOW_NORMAL)
    for stage in (orig, dilated, edges, new):
        cv2.imshow('result', stage)
        cv2.waitKey(0)

    cv2.destroyAllWindows()

Not perfect, but it at least works for all the samples.
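The script stops at polygons. To use a four-vertex polygon from `rects` for perspective correction, its vertices still need a consistent order. One common heuristic, sketched here in plain Python, uses the sum and difference of the coordinates. It assumes image coordinates (y grows downward) and a sheet rotated by less than 45 degrees. The function name is illustrative and not part of the script above.

```python
def order_corners(quad):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Uses the sum/difference heuristic: the top-left corner has the smallest
    x + y, the bottom-right the largest; the top-right has the largest x - y,
    the bottom-left the smallest.
    """
    if len(quad) != 4:
        raise ValueError("expected exactly four points")
    pts = [(float(x), float(y)) for x, y in quad]
    top_left = min(pts, key=lambda p: p[0] + p[1])
    bottom_right = max(pts, key=lambda p: p[0] + p[1])
    top_right = max(pts, key=lambda p: p[0] - p[1])
    bottom_left = min(pts, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]
```

Polygons with more or fewer than four vertices are rejected, so a caller would filter `rects` by length first.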


秋叶绚丽 2024-11-25 12:59:08

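A student group at my university recently demonstrated an iPhone app (and a Python OpenCV app) that they had written to do exactly this. As I remember it, the steps were roughly:

1. Median filter to completely remove the text on the paper (this was handwritten text on white paper with fairly good lighting; it may not work with printed text, but it worked very well here). The reasoning was that it makes the corner detection much easier.
2. Hough transform for lines.
3. Find the peaks in the Hough transform accumulator space and draw each line across the entire image.
4. Analyse the lines and remove any that are very close to each other and at a similar angle (cluster the lines into one). This is necessary because the Hough transform is not perfect, as it works in a discrete sample space.
5. Find pairs of lines that are roughly parallel and that intersect other pairs, to see which lines form quadrilaterals.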
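The last step can be sketched in plain Python. Two lines in (rho, theta) form intersect where both equations x*cos(theta) + y*sin(theta) = rho hold, which is a 2x2 linear system. The function names below are illustrative, and the students' actual implementation is not known.

```python
import math

def intersect(line_a, line_b, eps=1e-9):
    """Intersection of two (rho, theta) lines, or None if they are (nearly) parallel."""
    (r1, t1), (r2, t2) = line_a, line_b
    # determinant of [[cos t1, sin t1], [cos t2, sin t2]]
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < eps:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return (x, y)

def quad_corners(pair_a, pair_b):
    """Corners of the quadrilateral bounded by two roughly parallel pairs of lines.

    Returns four points in perimeter order, or None if any two crossing
    lines fail to intersect.
    """
    a1, a2 = pair_a
    b1, b2 = pair_b
    corners = [intersect(a1, b1), intersect(a1, b2),
               intersect(a2, b2), intersect(a2, b1)]
    return None if any(c is None for c in corners) else corners
```

A caller would typically discard quadrilaterals whose corners fall far outside the image or whose area is implausibly small.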

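This seemed to work fairly well: they could take a photo of a sheet of paper or a book, run the corner detection, and then map the document in the image onto a flat plane in almost real time (a single OpenCV function performed the mapping). There was no OCR when I saw it working.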
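The answer does not name the OpenCV function; cv2.getPerspectiveTransform combined with cv2.warpPerspective is one likely candidate. To show what the mapping involves, here is a library-free sketch that solves for the 3x3 homography taking four detected corners to the corners of a flat rectangle. All names are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography(src, dst):
    """3x3 matrix (nested lists) mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]          # bottom-right entry fixed to 1

def apply_homography(h, point):
    """Map one (x, y) point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, `homography(corners, [(0, 0), (w, 0), (w, h), (0, h)])` gives the matrix that flattens a detected quadrilateral onto a w-by-h page, provided `corners` are in the matching top-left, top-right, bottom-right, bottom-left order.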


聊慰 2024-11-25 12:59:08

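Instead of starting from edge detection, you could use corner detection.

Marvin Framework provides an implementation of the Moravec algorithm for this purpose. You could find the corners of the paper as a starting point. The output of the Moravec algorithm is shown below:

[image]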
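To make the idea concrete, here is a small plain-Python sketch of the Moravec corner response. It is not Marvin Framework's API (Marvin is a Java library), only an illustration of the technique on a 2-D list of gray values. The window size and the eight one-pixel shifts are the usual textbook choices, assumed here.

```python
def moravec(image, window=1):
    """Moravec corner response for a 2-D grayscale image (list of rows).

    For every pixel far enough from the border, the response is the minimum,
    over eight one-pixel shifts, of the sum of squared differences between
    the window around the pixel and the shifted window. Flat areas and
    straight edges score near zero; corners score high.
    """
    h, w = len(image), len(image[0])
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    margin = window + 1
    response = [[0] * w for _ in range(h)]
    for y in range(margin, h - margin):
        for x in range(margin, w - margin):
            best = None
            for dy, dx in shifts:
                ssd = 0
                for wy in range(-window, window + 1):
                    for wx in range(-window, window + 1):
                        diff = (image[y + wy + dy][x + wx + dx]
                                - image[y + wy][x + wx])
                        ssd += diff * diff
                if best is None or ssd < best:
                    best = ssd
            response[y][x] = best
    return response
```

Thresholding the response and keeping local maxima would give candidate corner points; on a real receipt, printed text produces many strong responses, so some filtering by position is still needed.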


趁微风不噪 2024-11-25 12:59:08

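You can also run MSER (Maximally Stable Extremal Regions) over the result of a Sobel operator to find the stable regions of the image. For each region returned by MSER, you can apply a convex hull and polygon approximation to obtain something like this:

However, this kind of detection is more useful for live detection across multiple frames than for a single picture, which does not always return the best result.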
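The convex hull step can be illustrated without any library. The sketch below uses Andrew's monotone chain algorithm on a list of (x, y) region points; in OpenCV, cv2.convexHull and cv2.approxPolyDP would normally be used instead. The function name is illustrative.

```python
def convex_hull(points):
    """Convex hull of 2-D points (Andrew's monotone chain).

    Returns the hull vertices in counter-clockwise order for a y-up
    coordinate system, without repeating the first point.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]
```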


煞人兵器 2024-11-25 12:59:08

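After edge detection, use the Hough transform.
Then feed those points, together with their labels, into an SVM (support vector machine). If the examples have smooth lines on them, the SVM will have no difficulty separating the relevant parts of each example from the rest. My advice for the SVM is to use features such as connectivity and length: if points are connected and form a long run, they are likely to be an edge of the receipt. You can then eliminate all the other points.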


笑咖 2024-11-25 12:59:08

Here is @Vanuan's code in C++:

// Needs <opencv2/imgproc.hpp> and <vector>; `mat` is the input BGR image (cv::Mat).
// Constant names follow the OpenCV 3+ C++ API.
cv::cvtColor(mat, mat, cv::COLOR_BGR2GRAY);
cv::GaussianBlur(mat, mat, cv::Size(3,3), 0);
// the kernel size is a cv::Size, not a cv::Point
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(9,9));
cv::Mat dilated;
cv::dilate(mat, dilated, kernel);

cv::Mat edges;
// thresholds 0 and 84 with aperture size 3, matching the Python version
cv::Canny(dilated, edges, 0, 84, 3);

// draw the detected segments back onto the edge map to close gaps
std::vector<cv::Vec4i> lines;
cv::HoughLinesP(edges, lines, 1, CV_PI/180, 25);
for (const cv::Vec4i& l : lines) {
    cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(255,0,0), 2, 8);
}

std::vector<std::vector<cv::Point> > contours;
cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_TC89_KCOS);

// keep only contours that are long enough and enclose a large enough area
std::vector<std::vector<cv::Point> > contoursArea;
for (size_t i = 0; i < contours.size(); i++) {
    if (cv::arcLength(contours[i], false) > 100 &&
        cv::contourArea(contours[i]) > 10000) {
        contoursArea.push_back(contours[i]);
    }
}

// simplify contours down to polygons; size the output to the filtered list
std::vector<std::vector<cv::Point> > contoursDraw(contoursArea.size());
for (size_t i = 0; i < contoursArea.size(); i++) {
    cv::approxPolyDP(contoursArea[i], contoursDraw[i], 40, true);
}
cv::Mat drawing = cv::Mat::zeros(mat.size(), CV_8UC3);
cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0,255,0), 1);
