Programmatically get a screenshot of a page

Posted on 2024-08-16 13:22:54

I'm writing a specialized crawler and parser for internal use, and I require the ability to take a screenshot of a web page in order to check what colours are being used throughout. The program will take in around ten web addresses and will save them as a bitmap image.

From there I plan to use LockBits in order to create a list of the five most used colours within the image. To my knowledge, it's the easiest way to get the colours used within a web page, but if there is an easier way to do it please chime in with your suggestions.

Anyway, I was going to use ACA WebThumb ActiveX Control until I saw the price tag. I'm also fairly new to C#, having only used it for a few months. Is there a solution to my problem of taking a screenshot of a web page in order to extract the colour scheme?

Comments (7)

夕嗳→ 2024-08-23 13:22:54

A quick and dirty way would be to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console app is slightly tricky because you have to be aware of the implications of hosting a STAThread control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept which captures a web page to an 800x600 BMP file:

namespace WebBrowserScreenshotSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Threading;
    using System.Windows.Forms;

    class Program
    {
        [STAThread]
        static void Main()
        {
            int width = 800;
            int height = 600;

            using (WebBrowser browser = new WebBrowser())
            {
                browser.Width = width;
                browser.Height = height;
                browser.ScrollBarsEnabled = true;

                // This will be called when the page finishes loading
                browser.DocumentCompleted += Program.OnDocumentCompleted;

                browser.Navigate("https://stackoverflow.com/");

                // This prevents the application from exiting until
                // Application.Exit is called
                Application.Run();
            }
        }

        static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Now that the page is loaded, save it to a bitmap
            WebBrowser browser = (WebBrowser)sender;

            using (Graphics graphics = browser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                browser.DrawToBitmap(bitmap, bounds);
                bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
            }

            // Instruct the application to exit
            Application.Exit();
        }
    }
}

To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.

UPDATE: I rewrote the code to avoid having to use the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.

UPDATE 2: You indicate that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample which shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity which I didn't handle before -- the DocumentCompleted event can actually be raised multiple times depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want:

namespace WebBrowserScreenshotFormsSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    public partial class MainForm : Form
    {
        public MainForm()
        {
            this.InitializeComponent();

            // Register for this event; we'll save the screenshot when it fires
            this.webBrowser.DocumentCompleted += 
                new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
        }

        private void OnClickGenerateScreenshot(object sender, EventArgs e)
        {
            // Disable button to prevent multiple concurrent operations
            this.generateScreenshotButton.Enabled = false;

            string webAddressString = this.webAddressTextBox.Text;

            Uri webAddress;
            if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
            {
                this.webBrowser.Navigate(webAddress);
            }
            else
            {
                MessageBox.Show(
                    "Please enter a valid URI.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Exclamation);

                // Re-enable button on error before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // This event can be raised multiple times depending on how much of the
            // document has loaded, if there are multiple frames, etc.
            // We only want the final page result, so we do the following check:
            if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                e.Url == this.webBrowser.Url)
            {
                // Generate the file name here
                string screenshotFileName = Path.GetFullPath(
                    "screenshot_" + DateTime.Now.Ticks + ".png");

                this.SaveScreenshot(screenshotFileName);
                MessageBox.Show(
                    "Screenshot saved to '" + screenshotFileName + "'.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);

                // Re-enable button before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void SaveScreenshot(string fileName)
        {
            int width = this.webBrowser.Width;
            int height = this.webBrowser.Height;
            using (Graphics graphics = this.webBrowser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(width, height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, width, height);
                this.webBrowser.DrawToBitmap(bitmap, bounds);
                bitmap.Save(fileName, ImageFormat.Png);
            }
        }
    }
}
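
For the colour-extraction step described in the question, here is a minimal sketch of counting the most frequent colours with LockBits. The class and method names (ColourCounter, TopColours) are made up for illustration, and the sketch assumes a little-endian machine and a top-down bitmap, which is the usual case on Windows. It counts exact colour values only; anti-aliased text and gradients will produce many near-identical colours, so some rounding or bucketing of the values may be needed in practice.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of any library.
static class ColourCounter
{
    // Counts colours in a raw 32bpp buffer (bytes in B, G, R, A order) and
    // returns the most frequent ones as packed ARGB integers, most common first.
    public static List<int> TopColours(byte[] pixels, int stride, int width, int height, int count)
    {
        var histogram = new Dictionary<int, int>();
        for (int y = 0; y < height; y++)
        {
            int row = y * stride;
            for (int x = 0; x < width; x++)
            {
                // On a little-endian machine the four bytes B, G, R, A read
                // as one 32-bit ARGB value.
                int argb = BitConverter.ToInt32(pixels, row + x * 4);
                int seen;
                histogram.TryGetValue(argb, out seen);
                histogram[argb] = seen + 1;
            }
        }

        return histogram.OrderByDescending(pair => pair.Value)
                        .Take(count)
                        .Select(pair => pair.Key)
                        .ToList();
    }

    // Copies the bitmap's pixels out with LockBits, then counts them.
    public static List<Color> TopColours(Bitmap bitmap, int count)
    {
        var bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(bounds, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            // Assumption: the stride is positive (top-down bitmap).
            int stride = data.Stride;
            var pixels = new byte[stride * data.Height];
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);

            return TopColours(pixels, stride, data.Width, data.Height, count)
                .Select(argb => Color.FromArgb(argb))
                .ToList();
        }
        finally
        {
            // Always unlock, even if counting throws.
            bitmap.UnlockBits(data);
        }
    }
}
```

With the console sample above, this could be called as ColourCounter.TopColours(bitmap, 5) just before or instead of bitmap.Save. Copying the pixels into a managed array once is usually much faster than calling GetPixel for every pixel.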
请持续率性 2024-08-23 13:22:54

https://screenshotlayer.com/documentation is the only free service I can find lately...

You'll need to use HttpWebRequest to download the binary of the image. See the provided url above for details.

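// Needs: using System.Drawing; using System.IO; using System.Net;
HttpWebRequest request = WebRequest.Create("https://[url]") as HttpWebRequest;
Bitmap bitmap;
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (Bitmap downloaded = new Bitmap(stream))
{
    // GDI+ needs the stream to stay open for the lifetime of a Bitmap built
    // on it, so copy the pixels into a new Bitmap that outlives the stream.
    bitmap = new Bitmap(downloaded);
}
// now that you have a bitmap, you can do what you need to do...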
中性美 2024-08-23 13:22:54

This question is old, but as an alternative you can use the NuGet package Freezer. It's free, uses a recent Gecko web browser (supports HTML5 and CSS3), and ships as a single DLL.

var screenshotJob = ScreenshotJobBuilder.Create("https://google.com")
              .SetBrowserSize(1366, 768)
              .SetCaptureZone(CaptureZone.FullPage) 
              .SetTrigger(new WindowLoadTrigger()); 

 System.Drawing.Image screenshot = screenshotJob.Freeze();
缺⑴份安定 2024-08-23 13:22:54

PhantomJS is a great WebKit-based browser that lets you execute any JavaScript from the command line.

Install it from http://phantomjs.org/download.html and execute the following sample script from the command line:

./phantomjs ../examples/rasterize.js http://www.panoramio.com/photo/76188108 test.jpg

It will create a screenshot of the given page as a JPEG file. The upside of this approach is that you don't rely on any external provider and can easily automate taking screenshots in large quantities.
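
Since the question is about a C# program, the same command could be launched from code. The following is only a sketch: PhantomCapture and its methods are hypothetical names, the paths to phantomjs and rasterize.js are whatever your installation uses, and it assumes the script reports failure through a non-zero exit code. The quoting is deliberately simple and does not handle arguments that themselves contain double quotes.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper around the command line shown above.
static class PhantomCapture
{
    // Builds the start info for: phantomjs rasterize.js <url> <outputFile>
    public static ProcessStartInfo BuildStartInfo(
        string phantomPath, string scriptPath, string url, string outputFile)
    {
        return new ProcessStartInfo
        {
            FileName = phantomPath,
            Arguments = string.Format("\"{0}\" \"{1}\" \"{2}\"", scriptPath, url, outputFile),
            UseShellExecute = false, // start the executable directly
            CreateNoWindow = true    // no console window for the child process
        };
    }

    // Runs PhantomJS, waits for it to finish, and returns true on exit code 0.
    public static bool Capture(
        string phantomPath, string scriptPath, string url, string outputFile)
    {
        using (Process process = Process.Start(
            BuildStartInfo(phantomPath, scriptPath, url, outputFile)))
        {
            process.WaitForExit();
            return process.ExitCode == 0;
        }
    }
}
```

Once Capture returns true, the output file can be opened with new Bitmap(outputFile) for the colour analysis.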

几味少女 2024-08-23 13:22:54

I used WebBrowser and it doesn't work perfectly for me, especially when it needs to wait for JavaScript rendering to complete.
I tried some APIs and found Selenium. The most important thing about Selenium is that it does not require STAThread and can run in a simple console app as well as in services.

Give it a try:

using System.Drawing.Imaging;
using OpenQA.Selenium.Firefox;

class Program
{
    static void Main()
    {
        var driver = new FirefoxDriver();

        driver.Navigate()
            .GoToUrl("http://stackoverflow.com/");

        driver.GetScreenshot()
            .SaveAsFile("stackoverflow.jpg", ImageFormat.Jpeg);

        driver.Quit();
    }
}
风蛊 2024-08-23 13:22:54

Check this out. This seems to do what you want, and technically it approaches the problem in a very similar way, through the web browser control. It seems to cater for a range of parameters to be passed in and has good error handling built in. The only downside is that it is an external process (exe) that you spawn, and it creates a physical file that you will read later. From your description, you were even considering web services, so I don't think that is a problem.

As for your latest comment about how to process several of them simultaneously, this will be perfect. You can spawn, say, 3, 4, 5 or more processes in parallel at any one time, or run the colour analysis on a thread while another capture process is running.
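
One way to cap the number of capture processes running at once is Parallel.ForEach with MaxDegreeOfParallelism. This is only a sketch under assumptions: ParallelCapture and CaptureAll are hypothetical names, and the capture delegate stands in for whatever spawns the external exe for one URL and reports whether it succeeded.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
static class ParallelCapture
{
    // Runs `capture` for every URL with at most `maxParallel` captures active
    // at the same time. Returns the URLs for which `capture` returned false.
    public static List<string> CaptureAll(
        IEnumerable<string> urls, int maxParallel, Func<string, bool> capture)
    {
        // ConcurrentBag is safe to add to from several threads at once.
        var failed = new ConcurrentBag<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            if (!capture(url))
            {
                failed.Add(url);
            }
        });

        return new List<string>(failed);
    }
}
```

This suits captures done by external processes. It would not work as-is with the WinForms WebBrowser control, which has to live on an STA thread with a message loop.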

For image processing, I recently came across Emgu. I haven't used it myself, but it seems fascinating. It claims to be fast and to have a lot of support for graphic analysis, including reading pixel colours. If I had any graphics processing project on hand right now, I would give it a try.

水水月牙 2024-08-23 13:22:54

You may also have a look at Qt Jambi:
http://qt.nokia.com/doc/qtjambi-4.4/html/com/trolltech/qt/qtjambi-index.html

It has a nice WebKit-based browser implementation for Java, where you can take a screenshot simply by doing something like:

    QPixmap pixmap;
    pixmap = QPixmap.grabWidget(browser);

    pixmap.save(writeTo, "png");

Have a look at the samples - they have a nice webbrowser demo.
