I'd like to write a program able to "use" other programs by taking control of the mouse/keyboard and being able to "see" what's on the screen.
I used AutoIt to do something similar, but I had to cheat sometimes because the language is not that powerful, or maybe it's just that I suck and I'm not able to do that much with it :P
So... I need to:
- Take screenshots, then I will compare them to make the program "understand", but it needs to "see"
- Use the mouse: move, click and release, it's simple, isn't it?
- Use the keyboard: press single keys or key combinations, including special keys like Alt, Ctrl, etc.
How can I do that in Python?
Does it work on both Linux and Windows? (That would be really cool, but it is not necessary.)
Comments (5)
I've had some luck with similar tasks using PyWinAuto.
It also has some support for capturing images of dialogs and similar windows using the Python Imaging Library (PIL).
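As a rough illustration of that workflow, here is a minimal sketch that starts Notepad, types into it, and saves an image of the window. It assumes a recent pywinauto release (snake_case method names), Pillow installed for the capture step, and a classic Notepad whose window title is "Untitled - Notepad"; the helper name `capture_notepad` is hypothetical.

```python
def capture_notepad(text, image_path):
    """Type `text` into a new Notepad window and save a picture of it.

    Sketch only: requires Windows, pywinauto, and Pillow. The import is
    kept inside the function so the module still loads elsewhere.
    """
    from pywinauto.application import Application

    app = Application().start("notepad.exe")
    # Attribute lookup resolves to the window whose title best matches
    # "Untitled - Notepad" (an assumption about the local Notepad).
    window = app.UntitledNotepad
    window.Edit.type_keys(text, with_spaces=True)

    image = window.capture_as_image()  # returns a PIL image
    image.save(image_path)
    return image_path

# Example call (Windows only):
#   capture_notepad("typed by pywinauto", "notepad.png")
```

The saved image can then be compared against a reference screenshot to check what the program currently shows.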
AutoIt is completely capable of doing everything you mentioned. When I want to do some automation but also use the features of Python, I find it easiest to use AutoItX, which is a DLL/COM control.
Taken from this answer of mine:
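The snippet from that answer is not reproduced here. As a stand-in, this is a minimal sketch of driving AutoItX from Python over COM, assuming pywin32 is installed and the AutoItX control is registered under the ProgID `AutoItX3.Control`. The helper names `connect` and `type_into_notepad` are hypothetical, and the window title is an assumption about the local Notepad.

```python
def connect():
    """Create the AutoItX COM object (Windows with pywin32 only)."""
    import win32com.client
    return win32com.client.Dispatch("AutoItX3.Control")

def type_into_notepad(autoit, text, x, y):
    """Open Notepad, type `text`, select it, and read one screen pixel.

    `autoit` is the object returned by connect(); it is passed in so
    the logic can be exercised with a stand-in object as well.
    """
    autoit.Run("notepad.exe")
    autoit.WinWaitActive("Untitled - Notepad")  # assumed window title
    autoit.Send(text)
    autoit.Send("^a")  # "^" is AutoIt's notation for Ctrl, so this is Ctrl+A
    # "Seeing" the screen: colour of the pixel at screen position (x, y).
    return autoit.PixelGetColor(x, y)

# Example call (Windows only):
#   colour = type_into_notepad(connect(), "hello from Python", 100, 100)
```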
You can use WATSUP under Windows.
If you are comfortable with Pascal, SCAR is a really powerful keyboard/mouse/screen-reading program: http://freddy1990.com/index.php?page=product&name=scar It can do OCR, bitmap finding, color finding, etc. It's often used for automating online games, but it can be used in any situation where you want to simulate a human reading the screen and giving input.
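To make "bitmap finding" and "color finding" concrete, here is a minimal pure-Python sketch of the idea. It is not how SCAR implements them. It assumes the screen is a list of rows and each pixel is an (R, G, B) tuple; a real screenshot would first have to be converted to that form. The function names are hypothetical.

```python
def find_bitmap(screen, needle):
    """Return (x, y) of the first exact match of `needle` in `screen`.

    Both arguments are lists of rows; each row is a list of pixels.
    Returns None when the small bitmap does not occur.
    """
    screen_h, screen_w = len(screen), len(screen[0])
    needle_h, needle_w = len(needle), len(needle[0])
    for y in range(screen_h - needle_h + 1):
        for x in range(screen_w - needle_w + 1):
            # Compare the needle row by row against the screen window.
            if all(screen[y + dy][x:x + needle_w] == needle[dy]
                   for dy in range(needle_h)):
                return (x, y)
    return None

def find_color(screen, color):
    """Return (x, y) of the first pixel equal to `color`, or None."""
    for y, row in enumerate(screen):
        for x, pixel in enumerate(row):
            if pixel == color:
                return (x, y)
    return None

# A tiny 4x3 "screen" with a 2x2 pattern in the middle.
W, K, R = (255, 255, 255), (0, 0, 0), (255, 0, 0)
screen = [
    [W, W, W, W],
    [W, K, R, W],
    [W, R, K, W],
]
needle = [[K, R], [R, K]]
print(find_bitmap(screen, needle))  # (1, 1)
print(find_color(screen, R))        # (2, 1)
```

This brute-force scan is slow on full-size screenshots and only finds exact matches, so real tools usually add a color tolerance and faster search.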
I've used the Windows-only Input API to write a VNC-like remote-control application in the past. It lets you fake keyboard and mouse input nicely at the system level (i.e., not just by posting events to a single application).
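For a feel of what system-level input injection looks like from Python, here is a minimal `ctypes` sketch. It assumes the API meant here is the Win32 `SendInput` function, which the answer does not state. The helper names `key_event`, `ctrl_combo`, and `send` are hypothetical; the structure layouts and constants follow the Win32 headers.

```python
import ctypes
import sys

INPUT_KEYBOARD = 1        # INPUT.type value for keyboard events
KEYEVENTF_KEYUP = 0x0002  # flag marking a key release
VK_CONTROL = 0x11         # virtual-key code for Ctrl

ULONG_PTR = ctypes.c_size_t  # pointer-sized unsigned integer

class KEYBDINPUT(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_uint16), ("wScan", ctypes.c_uint16),
                ("dwFlags", ctypes.c_uint32), ("time", ctypes.c_uint32),
                ("dwExtraInfo", ULONG_PTR)]

class MOUSEINPUT(ctypes.Structure):
    _fields_ = [("dx", ctypes.c_int32), ("dy", ctypes.c_int32),
                ("mouseData", ctypes.c_uint32), ("dwFlags", ctypes.c_uint32),
                ("time", ctypes.c_uint32), ("dwExtraInfo", ULONG_PTR)]

class _INPUTUNION(ctypes.Union):
    _fields_ = [("ki", KEYBDINPUT), ("mi", MOUSEINPUT)]

class INPUT(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [("type", ctypes.c_uint32), ("u", _INPUTUNION)]

def key_event(vk, key_up=False):
    """Build one keyboard INPUT record for virtual-key code `vk`."""
    flags = KEYEVENTF_KEYUP if key_up else 0
    return INPUT(type=INPUT_KEYBOARD, ki=KEYBDINPUT(wVk=vk, dwFlags=flags))

def ctrl_combo(vk):
    """Events for Ctrl+<key>: press Ctrl, press key, release both."""
    return [key_event(VK_CONTROL), key_event(vk),
            key_event(vk, key_up=True), key_event(VK_CONTROL, key_up=True)]

def send(events):
    """Inject the events system-wide; returns how many were accepted."""
    if sys.platform != "win32":
        raise OSError("SendInput is only available on Windows")
    array = (INPUT * len(events))(*events)
    return ctypes.windll.user32.SendInput(
        len(events), array, ctypes.sizeof(INPUT))

# Example call (Windows only): send Ctrl+A to whatever has focus.
#   send(ctrl_combo(ord("A")))
```

Mouse input follows the same pattern with `MOUSEINPUT` records and an `INPUT.type` of 0.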
If you're trying to do any sort of automated testing of whole systems at the GUI level, this excellent USENIX paper describing automated responsiveness testing is a must-read.