How to get text from the screen
There is some Windows API call, or something similar, that lets one obtain text from the screen, not by taking a snapshot and then running OCR on it, but through an API.
The idea is to get the text under the mouse pointer at the spot the user points to and clicks on.
This is how tools like Babylon (http://www.babylon.com) and 1-Click Answers (http://www.answers.com/main/download_answers_win.jsp) and many others work.
Can someone point me in the right direction to get this functionality?
Comments (4)
There is no direct way to obtain the text. An application can render text in a zillion different ways (the Windows API being just one of them), and once it is rendered, it's just a bunch of pixels.
One method you could try, however, is to find the window directly under the mouse and get the text from it. This works fine on most standard Windows controls (labels, text boxes, etc.), but it won't work on Internet browsers.
I think the best you can do is make your application support as many different (common) controls as possible in the manner described above.
You can get the text of every window with the GetWindowText API. The mouse position can be found with the GetCursorPos API.
In Delphi you could use this function (kudos to Peter Below)
Regards,
Lieven
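As a rough illustration of the GetCursorPos approach, here is a minimal Python sketch using ctypes. It is not the Delphi function mentioned above, and the name `text_under_cursor` is made up for this example. It assumes Windows, finds the window under the cursor with WindowFromPoint, and sends WM_GETTEXT instead of calling GetWindowText, because GetWindowText cannot read the text of a control that belongs to another process.

```python
import ctypes
import sys
from ctypes import wintypes

WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E


def text_under_cursor():
    """Return the text of the window or control under the mouse, or None."""
    if sys.platform != "win32":
        return None  # the Win32 calls below only exist on Windows

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.GetCursorPos.argtypes = [ctypes.POINTER(wintypes.POINT)]
    user32.GetCursorPos.restype = wintypes.BOOL
    user32.WindowFromPoint.argtypes = [wintypes.POINT]
    user32.WindowFromPoint.restype = wintypes.HWND
    user32.SendMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
    ]
    user32.SendMessageW.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized

    point = wintypes.POINT()
    if not user32.GetCursorPos(ctypes.byref(point)):
        return None

    # Window containing the point; hidden and disabled windows are skipped.
    hwnd = user32.WindowFromPoint(point)
    if not hwnd:
        return None

    # Ask the window how long its text is, then copy it into our own buffer.
    length = user32.SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
    buffer = ctypes.create_unicode_buffer(length + 1)
    user32.SendMessageW(hwnd, WM_GETTEXT, length + 1, ctypes.addressof(buffer))
    return buffer.value


if __name__ == "__main__":
    print(text_under_cursor())
```

SendMessageW blocks if the target window is not responding, so a real tool would probably prefer SendMessageTimeoutW. As noted in the answers here, this only helps with windows that store their text in a standard way.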
Windows has APIs for accessibility tools like screen readers for the blind. (Newer versions are also used for other purposes, like UI automation and testing.) They work with many applications, including most browsers, which render their own content without using the standard Windows controls. They won't work with all applications, but they can be used to figure out the text under the mouse in most cases.
The current API is called the Windows Automation API. Describing how to do this in general is beyond the scope of a Stack Overflow answer, so I've simply provided a link to the documentation.
The older API, which was widely available when this question was first posted, is called the Microsoft Active Accessibility API. As with the modern API, the subject is too broad to detail here.
Note that the documentation for both APIs is written both for developers building accessibility tools (like screen readers) and for developers writing apps that want to be compatible with those tools.
The basic idea is that an accessibility tool gets COM interfaces provided by the target application's window(s), and it can use those interfaces to figure out the controls and their text and how they're related both logically and spatially. Applications composed of standard Windows controls are mostly supported automatically. Applications with custom UI implementations have to do work to provide these interfaces. Fortunately, the important ones, like the mainstream browsers, have done that work.
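To make the COM-interface idea a little more concrete, here is a hedged Python sketch that asks UI Automation for the element under the mouse and reads its Name property. It assumes Windows and the third-party comtypes package, neither of which is mentioned in the answer above, and the function name `name_under_cursor` is hypothetical. Treat it as one possible starting point rather than the recommended way to use the API.

```python
import ctypes
import sys
from ctypes import wintypes

# CLSID of the CUIAutomation COM class, which implements IUIAutomation.
CLSID_CUIAUTOMATION = "{ff48dba4-60ef-4201-aa87-54103eef594e}"


def name_under_cursor():
    """Return the UI Automation Name of the element under the mouse, or None."""
    if sys.platform != "win32":
        return None  # UI Automation is a Windows-only COM API
    try:
        # Assumption: the third-party comtypes package is installed.
        import comtypes.client
    except ImportError:
        return None

    # Generate Python wrappers from the type library in UIAutomationCore.dll.
    uia_client = comtypes.client.GetModule("UIAutomationCore.dll")
    automation = comtypes.client.CreateObject(
        CLSID_CUIAUTOMATION, interface=uia_client.IUIAutomation
    )

    point = wintypes.POINT()
    if not ctypes.windll.user32.GetCursorPos(ctypes.byref(point)):
        return None

    # The target application supplies this element through its accessibility
    # interfaces; no pixels are read.
    element = automation.ElementFromPoint(point)
    return element.CurrentName if element else None


if __name__ == "__main__":
    print(name_under_cursor())
```

How useful the returned Name is depends on how well the target application implements its accessibility interfaces, which matches the caveat above that this will not work with every application.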
I think it's called the clipboard. I'm going to bet these programs inject click, double-click and keyboard events and then copy the selected items there for inspection. Alternatively, they are getting jiggy with the Windows text controls and grabbing content that way. I suspect that, due to security issues, these tools also have problems running in Vista.