Classifying a program's state from screenshots

Posted 2024-11-15 15:21:17

I need to write an algorithm that can detect which state an application (used to fill out forms) is in, based on screenshots.
It has 2 inputs:
A: Approximately 2-10 screenshots of the application, each with a different tab selected. These are made by the user, so I can instruct them to do things like "select the upper area of the program" or "select the whole window", but I cannot expect pixel-perfect precision.
B: A screenshot of one of these states. The forms are filled with different data.

The goal is to determine which screenshot from "A" is from the same state as "B".

An example screenshot:
[image: example screenshot]

An example based on this screenshot:
A input: 10 screenshots from this program with the "Menu", "Sale Order", "Purchase Order", ... tabs selected
B input: the screenshot above.

The task is to determine which of the 10 screenshots matches this image.

I have tried using an image descriptor algorithm (SURF), but it has a really high error rate, since it is not made for such tasks.

Does anyone have an idea how to do such a classification? Should I apply some filter (e.g. median or blur) to the screenshots and then run them through some classification algorithm? Or extract some other feature to classify on (FFT, histogram, ...)?

Comments (2)

寻梦旅人 2024-11-22 15:21:17

I guess you can use the tab width instead of the tab label, which is much easier to calculate. For example, {"Menu", "Sale Order", "Purchase Order"} all have different widths.
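
A minimal sketch of the tab-width idea, in plain Python. It assumes (this is not stated in the question) that the selected tab has a distinct background colour and that a pixel row can be picked that crosses the tab strip without hitting the label text, e.g. just under the tab's top edge. The names `longest_run` and `match_by_tab_width` and the 0/1 pixel encoding are made up for illustration.

```python
from typing import Dict, Sequence


def longest_run(row: Sequence[int], value: int) -> int:
    """Length of the longest consecutive run of `value` in a pixel row."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel == value else 0
        best = max(best, current)
    return best


def match_by_tab_width(
    reference_rows: Dict[str, Sequence[int]],
    query_row: Sequence[int],
    selected_value: int,
) -> str:
    """Return the reference label whose selected-tab width is closest to the query's."""
    query_width = longest_run(query_row, selected_value)
    return min(
        reference_rows,
        key=lambda label: abs(
            longest_run(reference_rows[label], selected_value) - query_width
        ),
    )


# Toy data: 0 = unselected background, 1 = selected-tab background (assumed encoding).
references = {
    "Menu": [0] * 2 + [1] * 4 + [0] * 24,
    "Sale Order": [0] * 6 + [1] * 10 + [0] * 14,
    "Purchase Order": [0] * 16 + [1] * 14,
}
# The query row is the "Sale Order" row shifted 3 pixels right, as an imprecise crop would do.
query = [0] * 9 + [1] * 10 + [0] * 11

print(match_by_tab_width(references, query, selected_value=1))
```

Because only the run length is compared, a horizontal offset between the user's crop and the new screenshot does not matter. It would fail if two tabs happen to have the same width.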

If you have to look inside the tab, you can attempt some template matching.
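
Template matching could look roughly like the following sketch. It is a brute-force sum-of-absolute-differences search over tiny grayscale grids, written in plain Python so it runs without dependencies; in practice a library routine such as OpenCV's `cv2.matchTemplate` would do the sliding search. The assumption here is that each reference screenshot from "A" is cropped down to its tab strip and used as a template against "B". The function names and the pixel values are illustrative only.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]


def best_match(image: Image, template: Image) -> Tuple[int, int, int]:
    """Slide `template` over `image`; return (lowest SAD score, top, left)."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    if th > ih or tw > iw:
        raise ValueError("template is larger than image")
    best = None
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = sum(
                abs(image[top + r][left + c] - template[r][c])
                for r in range(th)
                for c in range(tw)
            )
            if best is None or score < best[0]:
                best = (score, top, left)
    return best


def classify_by_template(templates: Dict[str, Image], query: Image) -> str:
    """Return the label of the template that fits somewhere in `query` best."""
    return min(templates, key=lambda label: best_match(query, templates[label])[0])


# Toy tab strips: 9 = selected tab, 1 = unselected tab (assumed values).
templates = {
    "Menu": [[9, 9, 1, 1, 1, 1]] * 2,
    "Sale Order": [[1, 1, 9, 9, 1, 1]] * 2,
    "Purchase Order": [[1, 1, 1, 1, 9, 9]] * 2,
}
strip = [0, 0, 1, 1, 9, 9, 1, 1, 0]
query = [
    [0] * 9,                      # window border
    strip,
    strip,
    [0, 5, 3, 7, 2, 8, 4, 6, 0],  # form content that differs between screenshots
]

print(classify_by_template(templates, query))
```

Restricting the template to the tab strip keeps the changing form data out of the comparison, and the sliding search absorbs the offset from an imprecise selection.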

孤独患者 2024-11-22 15:21:17

Detect the text of each tab, then look at the background color.
Alternatively, locate one of the menu icons for pixel-level registration, then do a pointwise sampling to determine which tab is selected.
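
The second suggestion (register on an icon, then sample fixed points) might be sketched like this. It assumes the icon appears pixel-identically in every screenshot and that each tab's position relative to the icon is known in advance; both are assumptions, and `find_anchor`, `selected_tab`, and the offsets below are hypothetical names and values chosen for the example.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]


def find_anchor(image: Image, icon: Image) -> Optional[Tuple[int, int]]:
    """Return (top, left) of the first exact occurrence of `icon`, or None."""
    ih, iw = len(icon), len(icon[0])
    for top in range(len(image) - ih + 1):
        for left in range(len(image[0]) - iw + 1):
            if all(image[top + r][left:left + iw] == icon[r] for r in range(ih)):
                return top, left
    return None


def selected_tab(
    image: Image,
    icon: Image,
    tab_offsets: Dict[str, Tuple[int, int]],
    selected_value: int,
) -> Optional[str]:
    """Sample one pixel per tab, relative to the icon, and report the selected tab."""
    anchor = find_anchor(image, icon)
    if anchor is None:
        return None
    top, left = anchor
    for label, (d_row, d_col) in tab_offsets.items():
        r, c = top + d_row, left + d_col
        in_bounds = 0 <= r < len(image) and 0 <= c < len(image[0])
        if in_bounds and image[r][c] == selected_value:
            return label
    return None


icon = [[7, 8], [8, 7]]
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 7, 8, 0, 1, 1, 9, 9, 1, 1],
    [0, 8, 7, 0, 1, 1, 9, 9, 1, 1],
]
# (row, column) offsets from the icon's top-left corner to a point inside each tab.
tab_offsets = {"Menu": (0, 3), "Sale Order": (0, 5), "Purchase Order": (0, 7)}

print(selected_tab(image, icon, tab_offsets, selected_value=9))
```

Once the icon gives a pixel-exact origin, the user's imprecise crop no longer matters, and one sample per tab is enough as long as the selected tab's background colour is distinct.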
