Screen region recognition to find field positions on the screen
I am trying to figure out a way to use Sikuli's image recognition within C#. I don't want to use Sikuli itself because its scripting language is a little slow, and because I really don't want to introduce a Java bridge in the middle of my .NET C# app.
So, I have a bitmap which represents an area of my screen (I will call this region BUTTON1). The screen layout may have changed slightly, or the screen may have been moved on the desktop -- so I can't use a direct position. I have to first find where the current position of BUTTON1 is within the live screen. (I tried to post pictures of this, but I guess I can't because I am a new user... I hope the description makes it clear...)
I think that Sikuli is using OpenCV under the covers. Since it is open source, I guess I could reverse engineer it, and figure out how to do what they are doing in OpenCV, implementing it in Emgu.CV instead -- but my Java isn't very strong.
I looked for examples showing this, but all of the examples are either extremely simple (i.e., how to recognize a stop sign) or very complex (i.e., how to do facial recognition)... and maybe I am just dense, but I can't seem to make the logical jump to how to do this.
Also I worry that all of the various image manipulation routines are actually processor intensive, and I really want this as lightweight as possible (in reality I might have lots of buttons and fields I am trying to find on a screen...)
So, the way I am thinking about doing this instead is:
A) Convert the bitmaps to byte arrays and do a brute-force search. (I know how to do that part.) And then
B) Use the byte array position that I found to calculate its screen position (I'm really not completely sure how I do this) instead of using the image processing stuff.
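For what it's worth, here is a minimal sketch of what A plus B could look like in plain C#. It assumes a 32bpp ARGB screen capture and that BUTTON1 is rendered byte-for-byte identically on screen (anti-aliasing, scaling, or color-depth differences will defeat an exact comparison, which is where the image-processing libraries start to earn their keep); all of the names here are mine, not from any library.

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows.Forms;

static class ScreenSearch
{
    // Grab the whole primary screen into a 32bpp bitmap.
    public static Bitmap CaptureScreen()
    {
        Rectangle bounds = Screen.PrimaryScreen.Bounds;
        Bitmap bmp = new Bitmap(bounds.Width, bounds.Height, PixelFormat.Format32bppArgb);
        using (Graphics g = Graphics.FromImage(bmp))
            g.CopyFromScreen(bounds.Location, Point.Empty, bounds.Size);
        return bmp;
    }

    // Brute-force search: compare the template's bytes against every candidate
    // position in the screen capture and return the top-left screen coordinate
    // of the first exact match (null if nothing matches).
    public static Point? FindTemplate(Bitmap screen, Bitmap template)
    {
        Rectangle sRect = new Rectangle(0, 0, screen.Width, screen.Height);
        Rectangle tRect = new Rectangle(0, 0, template.Width, template.Height);
        BitmapData sData = screen.LockBits(sRect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        BitmapData tData = template.LockBits(tRect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            byte[] sBytes = new byte[sData.Stride * sData.Height];
            byte[] tBytes = new byte[tData.Stride * tData.Height];
            Marshal.Copy(sData.Scan0, sBytes, 0, sBytes.Length);
            Marshal.Copy(tData.Scan0, tBytes, 0, tBytes.Length);

            const int bpp = 4; // bytes per pixel at 32bppArgb
            int rowLen = template.Width * bpp;

            for (int y = 0; y <= screen.Height - template.Height; y++)
            {
                for (int x = 0; x <= screen.Width - template.Width; x++)
                {
                    bool match = true;
                    for (int ty = 0; ty < template.Height && match; ty++)
                    {
                        int sOff = (y + ty) * sData.Stride + x * bpp;
                        int tOff = ty * tData.Stride;
                        for (int i = 0; i < rowLen; i++)
                        {
                            if (sBytes[sOff + i] != tBytes[tOff + i]) { match = false; break; }
                        }
                    }
                    // Step B: the loop indices already are screen coordinates.
                    // If you search a flat byte array instead, the mapping back is
                    //   y = offset / stride;   x = (offset % stride) / bpp;
                    if (match) return new Point(x, y);
                }
            }
            return null;
        }
        finally
        {
            screen.UnlockBits(sData);
            template.UnlockBits(tData);
        }
    }
}

In other words, step B boils down to the stride arithmetic in the comment above: divide the flat offset by the row stride to get the row, and turn the remainder into a column by dividing by bytes per pixel.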
Is that completely crazy? Does anyone have a simple example of how one could use Aforge.Net or Emgu.CV to do this? (Or how to flesh out step B above...?)
Thanks!
Comments (1)
Generally speaking, it sounds like you want basic object recognition. I don't have any experience with Sikuli, but there are a number of ways to do object recognition (edge-based template matching, etc.). That being said, you might be able to go with just straight histogram matching.
http://www.codeproject.com/KB/GDI-plus/Image_Processing_Lab.aspx
That page should show you how to use AForge.net to get the histogram of an image. You would just do a brute force search using something like this:
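(A sketch of one way that scan could look, using AForge.NET's Crop filter and ImageStatistics; the class and method names, the step parameter, and the pluggable distance delegate are assumptions of mine, and ImageStatistics expects 8bpp grayscale or 24/32bpp color bitmaps.)

using System;
using System.Drawing;
using AForge.Imaging;           // ImageStatistics
using AForge.Imaging.Filters;   // Crop

static class HistogramSearch
{
    // Slide a BUTTON1-sized window across the screen capture, compute the
    // histogram of each candidate region with ImageStatistics, and return
    // the window whose histogram is closest to the template's.
    public static Rectangle FindByHistogram(
        Bitmap screen, Bitmap button1, int step,
        Func<ImageStatistics, ImageStatistics, double> distance)
    {
        ImageStatistics target = new ImageStatistics(button1);
        double best = double.MaxValue;
        Rectangle bestWindow = Rectangle.Empty;

        for (int y = 0; y <= screen.Height - button1.Height; y += step)
        {
            for (int x = 0; x <= screen.Width - button1.Width; x += step)
            {
                Rectangle window = new Rectangle(x, y, button1.Width, button1.Height);
                using (Bitmap candidate = new Crop(window).Apply(screen))
                {
                    double d = distance(new ImageStatistics(candidate), target);
                    if (d < best)
                    {
                        best = d;
                        bestWindow = window;
                    }
                }
            }
        }
        return bestWindow;
    }
}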
And then compare the newly created bitmap's histogram to the one you calculated for the original image; whichever area matches most closely is the one you would select as the region of BUTTON1. It's not the most elegant solution, but it might work for your needs. Otherwise you get into more difficult techniques (of course, I could be forgetting something at the moment that might be simpler).
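For the comparison itself, a minimal sketch (my own helper, not part of AForge.NET; it assumes color bitmaps so the Red/Green/Blue histograms are populated) that treats each channel's 256 bins as pixel fractions and sums the squared differences:

using AForge.Imaging;   // ImageStatistics
using AForge.Math;      // Histogram

static class HistogramCompare
{
    // Sum of squared differences across the 256 bins of each color channel,
    // with each bin taken as a fraction of the window's pixels.
    public static double Distance(ImageStatistics a, ImageStatistics b)
    {
        return ChannelDistance(a.Red, b.Red)
             + ChannelDistance(a.Green, b.Green)
             + ChannelDistance(a.Blue, b.Blue);
    }

    private static double ChannelDistance(Histogram ha, Histogram hb)
    {
        double totalA = 0, totalB = 0;
        foreach (int v in ha.Values) totalA += v;
        foreach (int v in hb.Values) totalB += v;

        double d = 0;
        for (int i = 0; i < ha.Values.Length; i++)
        {
            double pa = ha.Values[i] / totalA;
            double pb = hb.Values[i] / totalB;
            d += (pa - pb) * (pa - pb);
        }
        return d;
    }
}

Wired into the scan sketched above, the call would look something like HistogramSearch.FindByHistogram(screen, button1, 4, HistogramCompare.Distance), and the returned rectangle's top-left corner is where you would take BUTTON1 to be. A coarse step with a refinement pass around the best hit helps keep the cost down if there are many buttons and fields to locate.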