Programmatically get a screenshot of a page
I'm writing a specialized crawler and parser for internal use, and I require the ability to take a screenshot of a web page in order to check what colours are being used throughout. The program will take in around ten web addresses and will save them as a bitmap image.
From there I plan to use LockBits in order to create a list of the five most used colours within the image. To my knowledge, it's the easiest way to get the colours used within a web page, but if there is an easier way to do it please chime in with your suggestions.
Anyway, I was going to use ACA WebThumb ActiveX Control until I saw the price tag. I'm also fairly new to C#, having only used it for a few months. Is there a solution to my problem of taking a screenshot of a web page in order to extract the colour scheme?
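The LockBits colour count described above could look roughly like the following. This is a minimal sketch, not a tuned implementation: the class name ColourCounter and the method MostUsedColours are made up for illustration, and it assumes an exact-match histogram is good enough (anti-aliased text and gradients will produce many near-identical colours, so you may want to quantise the values first).

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of any library.
public static class ColourCounter
{
    // Returns up to `count` colours, most frequent first.
    public static List<Color> MostUsedColours(Bitmap source, int count)
    {
        var rect = new Rectangle(0, 0, source.Width, source.Height);

        // Request a 32bpp ARGB view so every pixel is exactly 4 bytes (B, G, R, A).
        BitmapData data = source.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        int stride = data.Stride; // assumed positive (top-down layout)
        byte[] pixels = new byte[stride * data.Height];
        try
        {
            // Copy the locked pixel buffer into managed memory in one call.
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
        }
        finally
        {
            source.UnlockBits(data);
        }

        // Count how often each packed ARGB value occurs.
        var histogram = new Dictionary<int, int>();
        for (int y = 0; y < rect.Height; y++)
        {
            int row = y * stride;
            for (int x = 0; x < rect.Width; x++)
            {
                // On little-endian machines the B, G, R, A bytes read back as 0xAARRGGBB.
                int argb = BitConverter.ToInt32(pixels, row + x * 4);
                int seen;
                histogram.TryGetValue(argb, out seen);
                histogram[argb] = seen + 1;
            }
        }

        return histogram
            .OrderByDescending(pair => pair.Value)
            .Take(count)
            .Select(pair => Color.FromArgb(pair.Key))
            .ToList();
    }
}
```

Calling ColourCounter.MostUsedColours(bitmap, 5) on each saved screenshot would give the five-colour list. Compare the results with ToArgb() rather than ==, because Color equality also takes the colour's name into account.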
Comments (7)
A quick and dirty way would be to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console app is slightly tricky because you have to be aware of the implications of hosting a STAThread control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept which captures a web page to an 800x600 BMP file:
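A minimal sketch of that approach, assuming a console project targeting .NET Framework on Windows. The class name, default URL and output file name are placeholders. It relies on WebBrowser.DrawToBitmap, which exists but is documented as not supported for this control, so treat the result as best effort.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

class WebPageCapture
{
    // The WebBrowser control wraps an ActiveX component, so the thread must be STA.
    [STAThread]
    static void Main(string[] args)
    {
        // Placeholder defaults; pass your own URL and file name as arguments.
        string url = args.Length > 0 ? args[0] : "http://example.com/";
        string outputFile = args.Length > 1 ? args[1] : "screenshot.bmp";

        using (var browser = new WebBrowser())
        {
            browser.ScrollBarsEnabled = false;
            browser.ScriptErrorsSuppressed = true;
            browser.Size = new Size(800, 600);

            browser.DocumentCompleted += (sender, e) =>
            {
                // The event can fire more than once (frames, redirects);
                // wait until the whole document reports Complete.
                if (browser.ReadyState != WebBrowserReadyState.Complete)
                {
                    return;
                }

                using (var bitmap = new Bitmap(browser.Width, browser.Height))
                {
                    browser.DrawToBitmap(bitmap, new Rectangle(0, 0, browser.Width, browser.Height));
                    bitmap.Save(outputFile, ImageFormat.Bmp);
                }

                // Stop the message loop started by Application.Run below.
                Application.ExitThread();
            };

            browser.Navigate(url);

            // Pump window messages so navigation and DocumentCompleted can run;
            // this replaces a polling WaitOne/DoEvents loop.
            Application.Run();
        }
    }
}
```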
To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.

UPDATE: I rewrote the code to avoid having to use the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.
UPDATE 2: You indicate that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample which shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity which I didn't handle before: the DocumentCompleted event can actually be raised multiple times depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want:
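A sketch of such a form, using the three control names mentioned above. To keep it self-contained the controls are created in code rather than in the designer; the class name ScreenshotForm, the output file naming and the fixed 800x600 size are assumptions made for illustration.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Windows.Forms;

public class ScreenshotForm : Form
{
    private readonly TextBox webAddressTextBox = new TextBox { Dock = DockStyle.Top };
    private readonly Button generateScreenshotButton =
        new Button { Text = "Generate screenshot", Dock = DockStyle.Top };
    // Hidden browser that lives on the form instead of being created per request.
    private readonly WebBrowser webBrowser =
        new WebBrowser { Visible = false, ScrollBarsEnabled = false, Size = new Size(800, 600) };

    public ScreenshotForm()
    {
        Controls.Add(webBrowser);
        Controls.Add(generateScreenshotButton);
        Controls.Add(webAddressTextBox);
        generateScreenshotButton.Click += OnGenerateClick;
        webBrowser.DocumentCompleted += OnDocumentCompleted;
    }

    private void OnGenerateClick(object sender, EventArgs e)
    {
        // Disable the button until the capture finishes to avoid overlapping navigations.
        generateScreenshotButton.Enabled = false;
        webBrowser.Navigate(webAddressTextBox.Text);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can be raised several times (frames, partial loads).
        // Only act when the whole document is ready and the event is for the top-level URL.
        if (webBrowser.ReadyState != WebBrowserReadyState.Complete || e.Url != webBrowser.Url)
        {
            return;
        }

        string fileName = Path.GetFullPath("screenshot_" + DateTime.Now.Ticks + ".bmp");
        using (var bitmap = new Bitmap(webBrowser.Width, webBrowser.Height))
        {
            webBrowser.DrawToBitmap(bitmap, new Rectangle(Point.Empty, webBrowser.Size));
            bitmap.Save(fileName, ImageFormat.Bmp);
        }

        MessageBox.Show("Screenshot saved to " + fileName);
        generateScreenshotButton.Enabled = true;
    }

    [STAThread]
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new ScreenshotForm());
    }
}
```

If a navigation fails, the button stays disabled in this sketch; a real form would also want to re-enable it from an error or timeout path.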
https://screenshotlayer.com/documentation is the only free service I can find lately...
You'll need to use HttpWebRequest to download the binary of the image. See the provided url above for details.
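As a rough illustration of the HttpWebRequest download, assuming the capture endpoint and the access_key and url query parameters as I understand them from the linked documentation (verify them there, along with the limits of your plan). The class and method names are made up for the example.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; the endpoint and parameter names are assumptions to check against the docs.
public static class ScreenshotLayerClient
{
    public static string BuildCaptureUrl(string accessKey, string pageUrl)
    {
        // The target page URL must be escaped because it is passed as a query value.
        return "http://api.screenshotlayer.com/api/capture"
            + "?access_key=" + Uri.EscapeDataString(accessKey)
            + "&url=" + Uri.EscapeDataString(pageUrl);
    }

    public static void Download(string accessKey, string pageUrl, string outputPath)
    {
        var request = (HttpWebRequest)WebRequest.Create(BuildCaptureUrl(accessKey, pageUrl));

        // Stream the binary response body straight into the output file.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (FileStream file = File.Create(outputPath))
        {
            body.CopyTo(file);
        }
    }
}
```

A call such as ScreenshotLayerClient.Download("YOUR_ACCESS_KEY", "http://example.com/", "example.png") would save the returned image. It is worth checking the response content type before trusting the file, since the service may answer with an error message instead of an image.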
This question is old, but as an alternative you can use the NuGet package Freezer. It's free, uses a recent Gecko web browser (supporting HTML5 and CSS3), and ships as a single DLL.
There is a great WebKit-based browser, PhantomJS, which lets you execute any JavaScript from the command line.
Install it from http://phantomjs.org/download.html and execute the following sample script from the command line:
It will create a screenshot of the given page as a JPEG file. The upside of this approach is that you don't rely on any external provider and can easily automate taking screenshots in large quantities.
I used WebBrowser and it didn't work perfectly for me, especially when I needed to wait for JavaScript rendering to complete.
I tried some APIs and found Selenium. The most important thing about Selenium is that it does not require STAThread and can run in a simple console app as well as in services.
Give it a try:
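A minimal sketch of a Selenium capture, assuming the Selenium.WebDriver NuGet package, a local Chrome installation and a matching chromedriver. The URL, window size and output file name are placeholders.

```csharp
using System.IO;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class SeleniumScreenshot
{
    // No [STAThread] needed: the browser runs in its own process.
    static void Main(string[] args)
    {
        string url = args.Length > 0 ? args[0] : "http://example.com/";

        var options = new ChromeOptions();
        options.AddArgument("--headless");            // run without a visible window
        options.AddArgument("--window-size=800,600"); // viewport used for the capture

        using (IWebDriver driver = new ChromeDriver(options))
        {
            // Blocks until the page's load event; script-driven rendering may finish later.
            driver.Navigate().GoToUrl(url);

            // The driver returns the capture as PNG data.
            Screenshot shot = ((ITakesScreenshot)driver).GetScreenshot();
            File.WriteAllBytes("screenshot.png", shot.AsByteArray);
        }
    }
}
```

For pages that keep rendering after load, an explicit wait on some element or condition before GetScreenshot is usually needed.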
Check this out. It seems to do what you want, and technically it approaches the problem in a very similar way, through the web browser control. It accepts a range of parameters and has good error handling built in. The only downside is that it is an external process (exe) that you spawn, and it creates a physical file that you read afterwards. From your description you were even considering web services, so I don't think that is a problem.
As for your latest comment about processing several pages simultaneously, this would be perfect. You can spawn, say, 3, 4, 5 or more processes in parallel at any one time, or run the colour analysis on a thread while another capture is in progress.
For image processing, I recently came across Emgu. I haven't used it myself, but it seems fascinating. It claims to be fast and to have a lot of support for graphics analysis, including reading pixel colours. If I had a graphics processing project on hand right now, I would give it a try.
You may also have a look at Qt Jambi:
http://qt.nokia.com/doc/qtjambi-4.4/html/com/trolltech/qt/qtjambi-index.html
They have a nice WebKit-based Java implementation of a browser, where you can take a screenshot simply by doing something like:
Have a look at the samples. They include a nice web browser demo.