Classifying a program based on screenshots
I need to write an algorithm that can detect, from screenshots, which state an application (used to fill out forms) is in.
It has 2 inputs:
A: Approximately 2-10 screenshots of the application, each with a different tab selected. These are made by the user, so I can instruct them to do things like "select the upper area of the program" or "select the whole window", but I cannot expect pixel-perfect precision.
B: A screenshot of one of these states. The forms are filled in with different data.
The goal is to determine which screenshot from "A" is from the same state as "B".
An example screenshot:
An example based on this screenshot:
A input: 10 screenshots from this program with the "Menu", "Sale Order", "Purchase Order", ... tabs selected
B input: the screenshot above.
The task is to determine which of the 10 screenshots matches this image.
I have tried an image descriptor algorithm (SURF), but it has a really high error rate, since it is not made for such tasks.
Does anyone have an idea how to do this kind of classification? Should I apply some filter (e.g. median or blur) to the screenshots and then run them through a classification algorithm? Or extract some other features to classify on (FFT, histogram, ...)?
Comments (2)
I guess you can use the tab width instead of the tab label, which is much easier to calculate. For example, {"Menu", "Sale Order", "Purchase Order"} all have different widths.
If you have to look inside the tab, you can attempt some template matching.
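To make the template-matching idea concrete, here is a minimal sketch in plain Python. It assumes each screenshot has already been cropped to the tab strip and converted to grayscale, stored as a list of rows of integers. The names `mean_abs_diff`, `best_match`, `make_strip` and the `max_shift` parameter are made up for illustration. The small shift search is there because the user's crop is not pixel-perfect. In practice a library routine such as OpenCV's `cv2.matchTemplate` would likely be a better fit than these hand-written loops.

```python
def mean_abs_diff(a, b, dx, dy):
    """Mean absolute pixel difference between a and b, with b shifted by (dx, dy)."""
    total = 0
    count = 0
    for y in range(len(a)):
        yb = y + dy
        if not 0 <= yb < len(b):
            continue
        for x in range(len(a[y])):
            xb = x + dx
            if not 0 <= xb < len(b[yb]):
                continue
            total += abs(a[y][x] - b[yb][xb])
            count += 1
    # No overlap at all counts as the worst possible score.
    return total / count if count else float("inf")


def best_match(references, query, max_shift=2):
    """Return (index of the closest reference, list of per-reference scores)."""
    shifts = range(-max_shift, max_shift + 1)
    scores = [
        min(mean_abs_diff(ref, query, dx, dy) for dy in shifts for dx in shifts)
        for ref in references
    ]
    return min(range(len(scores)), key=scores.__getitem__), scores


def make_strip(selected, tabs=3, tab_width=4, height=3):
    """Synthetic tab strip (assumption): the selected tab is bright, the others dark."""
    return [
        [255 if x // tab_width == selected else 100 for x in range(tabs * tab_width)]
        for _ in range(height)
    ]


# Synthetic demo: three reference strips, and a query cropped one pixel off.
refs = [make_strip(i) for i in range(3)]
query = [row[1:] + [100] for row in make_strip(1)]
index, scores = best_match(refs, query)
```

Restricting the comparison to the tab strip keeps the form data, which differs between A and B, out of the score.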
Detect the text of each tab, then look at the background color.
Alternatively, locate one of the menu icons for pixel-level registration, then do a pointwise sampling to determine which tab is selected.
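A rough sketch of the pointwise-sampling variant, in plain Python. It assumes the registration step has already produced the anchor position (for example the top-left corner of a menu icon) in every screenshot, and that the per-tab sample offsets relative to that anchor are known. `sample_signature`, `closest_state`, `make_image` and the offsets below are hypothetical, and images are grayscale lists of rows.

```python
def sample_signature(image, anchor, offsets):
    """Read one background pixel per tab, at fixed offsets from the anchor."""
    ax, ay = anchor
    return [image[ay + dy][ax + dx] for dx, dy in offsets]


def closest_state(reference_signatures, query_signature):
    """Index of the reference whose sampled values are closest to the query's."""
    def distance(signature):
        return sum(abs(r - q) for r, q in zip(signature, query_signature))
    return min(
        range(len(reference_signatures)),
        key=lambda i: distance(reference_signatures[i]),
    )


# Hypothetical sample points, one per tab, relative to the anchor.
OFFSETS = [(2, 1), (6, 1), (10, 1)]


def make_image(selected, anchor, width=16, height=5):
    """Synthetic screenshot (assumption): selected tab pixel is 255, others 120."""
    image = [[200] * width for _ in range(height)]
    ax, ay = anchor
    for i, (dx, dy) in enumerate(OFFSETS):
        image[ay + dy][ax + dx] = 255 if i == selected else 120
    return image


# The references and the query are cropped differently, so the anchors differ.
ref_signatures = [
    sample_signature(make_image(i, (1, 1)), (1, 1), OFFSETS) for i in range(3)
]
query_signature = sample_signature(make_image(2, (3, 2)), (3, 2), OFFSETS)
match = closest_state(ref_signatures, query_signature)
```

Because every sample is taken relative to the anchor, differences in how the user cropped each screenshot do not affect the comparison, provided the anchor itself is located reliably.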