Downloading a file with Google Chrome in headless mode

Posted on 2025-01-22 15:27:08

My code works fine with ChromeDriver in 'normal' mode. When I switch to headless mode, it doesn't download the file. I have already tried the code I found around the internet, but it didn't work.

chrome_options = Options()
chrome_options.add_argument("--headless")
self.driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'{}/chromedriver'.format(os.getcwd()))
self.driver.set_window_size(1024, 768)
self.driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': os.getcwd()}}
self.driver.execute("send_command", params)

Does anyone have any idea how to solve this problem?

PS: I don't necessarily need to use ChromeDriver. If it works with another driver, that's fine for me.


Comments (7)

紫﹏色ふ单纯 2025-01-29 15:27:08

First the solution

Minimum Prerequisites:

To download the file by clicking the element with the text Download Data on this website (https://www.mockaroo.com/), you can use the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.chrome.options import Options
    
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--window-size=1920,1080")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe', service_args=["--log-path=./Logs/DubiousDan.log"])
    print("Headless Chrome Initialized")
    params = {'behavior': 'allow', 'downloadPath': r'C:\Users\Debanjan.B\Downloads'}
    driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
    driver.get("https://www.mockaroo.com/")
    driver.execute_script("scroll(0, 250)")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#download"))).click()
    print("Download button clicked")
    #driver.quit()
    
  • Console Output:

    Headless Chrome Initialized
    Download button clicked
    
  • File Downloading snapshot:

    (screenshot: ChromeHeadlessDownload)
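The code block above leaves `driver.quit()` commented out, presumably so the browser stays open while the file is still being written. A minimal sketch of an explicit wait is shown here. It assumes the download folder is empty beforehand and relies on Chrome naming in-progress downloads with a `.crdownload` suffix; the helper name `wait_for_download` and its parameters are illustrative and not part of Selenium.

```python
import time
from pathlib import Path


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Return the finished files in `directory`, or raise TimeoutError."""
    folder = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        entries = [p for p in folder.iterdir() if p.is_file()]
        # Chrome keeps a "<name>.crdownload" file while a download is in progress.
        partial = [p for p in entries if p.suffix == ".crdownload"]
        finished = [p for p in entries if p.suffix != ".crdownload"]
        if finished and not partial:
            return sorted(finished)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no completed download in {folder} after {timeout}s")
        time.sleep(poll_interval)
```

Calling `wait_for_download(download_dir)` right after the click, and only then `driver.quit()`, would keep the session alive until the file is complete.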


Details

Downloading files through Headless Chromium has been one of the most sought-after features since Headless Chrome was introduced.

Since then, different contributors have published various workarounds.

Now the good news: the Chromium team has officially announced the arrival of the functionality for downloading files through Headless Chromium.


In the discussion "Headless mode doesn't save file downloads", @eseckler mentioned:

Downloads in headless work a little differently. There's the Page.setDownloadBehavior devtools command to set a download folder. We're working on a way to use DevTools network interception to stream the downloaded file via DevTools as well.

A detailed discussion can be found at Issue 696481: Headless mode doesn't save file downloads

Finally, @bugdroid's revision seems to have nailed the issue for us.


[ChromeDriver] Added support for headless mode to download files

Previously, Chromedriver running in headless mode would not properly download files due to the fact it sparsely parses the preference file given to it. Engineers from the headless chrome team recommended using DevTools's "Page.setDownloadBehavior" to fix this. This changelist implements this fix. Downloaded files default to the current directory and can be set using download_dir when instantiating a chromedriver instance. Also added tests to ensure proper download functionality.

Here is the revision and commit

From ChromeDriver v77.0.3865.40 (2019-08-20) release notes:

Resolved issue 2454: Headless mode doesn't save file downloads [Pri-2]

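Solution

  • Update ChromeDriver to the latest release.
  • Update Chrome to the version 77.0 level.
  • Note: at the time of writing, Chrome v77.0 had not yet been pushed for general release, so until then you can download and install a development build and test with it.
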
Outro

However, Mac OSX users have to wait for their piece of the pie, because with ChromeDriver, headless Chrome crashes after sending Page.setDownloadBehavior on MacOSX.

拥有 2025-01-29 15:27:08

ChromeDriver version: 95.0.4638.54
Chrome version: 95.0.4638.69

    from selenium.webdriver.chrome.options import Options    
 
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--start-maximized")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-extensions")
    options.add_argument('--disable-dev-shm-usage')    
    options.add_argument("--disable-gpu")
    options.add_argument('--disable-software-rasterizer')
    options.add_argument("user-agent=Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 640 XL LTE) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Mobile Safari/537.36 Edge/12.10166")
    options.add_argument("--disable-notifications")

    options.add_experimental_option("prefs", {
        "download.default_directory": "C:\\link\\to\\folder",
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing_for_trusted_sources_enabled": False,
        "safebrowsing.enabled": False
        }
    )

What seemed to work was using "\\" instead of "/" in the download path. With forward slashes no error was thrown, but no documents were downloaded either. Using double backslashes did the job.
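One way to avoid hand-writing the separators is to let Python produce an absolute path in the operating system's native form. The sketch below assumes that is what Chrome needs here, as the observation above suggests. The helper name `chrome_download_prefs` and the folder name `downloads` are illustrative.

```python
import os


def chrome_download_prefs(directory):
    """Build a Chrome "prefs" dict with an absolute, OS-native download path."""
    # abspath() makes the path absolute and normalizes it, so on Windows
    # forward slashes become backslashes.
    native_dir = os.path.abspath(directory)
    return {
        "download.default_directory": native_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }


prefs = chrome_download_prefs("downloads")
print(prefs["download.default_directory"])
# Pass the result to Selenium with:
#   options.add_experimental_option("prefs", prefs)
```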

霓裳挽歌倾城醉 2025-01-29 15:27:08

For JavaScript, use the code below:

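    const webdriver = require('selenium-webdriver');
    const chrome = require('selenium-webdriver/chrome');
    let options = new chrome.Options();
    // Pass each flag as its own argument; a single space-separated string
    // would reach Chrome as one switch.
    options.addArguments('--headless', '--window-size=1500,1200');
    options.setUserPreferences({
        'plugins.always_open_pdf_externally': true,
        'profile.default_content_settings.popups': 0,
        // Download_File_Path is the download folder path, defined elsewhere.
        'download.default_directory': Download_File_Path
    });
    driver = await new webdriver.Builder().setChromeOptions(options).forBrowser('chrome').build();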

Then switch tabs as soon as you click the download button:

    await driver.sleep(1000); 
    var Handle = await driver.getAllWindowHandles();
    await driver.switchTo().window(Handle[1]);
疯了 2025-01-29 15:27:08

This C# code works for me.

Note the new headless option https://www.selenium.dev/blog/2023/headless-is-going-away/

private IWebDriver StartBrowserChromeHeadlessDriver()
{
    var chromeOptions = new ChromeOptions();
    chromeOptions.AddArgument("--headless=new");
    chromeOptions.AddArgument("--window-size=1920,1080");
    chromeOptions.AddUserProfilePreference("download.default_directory", downloadFolder);

    var chromeDownload = new Dictionary<string, object>
    {
        { "behavior", "allow" },
        { "downloadPath", downloadFolder }
    };

    var driver = new ChromeDriver(driverFolder, chromeOptions, TimeSpan.FromSeconds(timeoutSecs));
    driver.ExecuteCdpCommand("Browser.setDownloadBehavior", chromeDownload);
    return driver;
}
后eg是否自 2025-01-29 15:27:08

Solution:

options.add_argument("--headless=new")

I tried everything else. Adding that line is what worked for me.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3')
options.add_argument('--window-size=1920,1080')
options.add_argument('--disable-dev-shm-usage')
options.add_experimental_option("prefs", {"download.default_directory": "C:\\Path\\To\\Directory", "download.directory_upgrade": True, "download.prompt_for_download": False})

driver_path = 'C:\\Path\\To\\Driver.exe'

# Newer Selenium 4 releases drop executable_path in favor of service=Service(driver_path).
driver = webdriver.Chrome(executable_path=driver_path, options=options)
哎呦我呸! 2025-01-29 15:27:08
import pathlib
from selenium.webdriver import Chrome
driver = Chrome()
driver.execute_cdp_cmd("Page.setDownloadBehavior", {
    "behavior": "allow",
    "downloadPath": str(pathlib.Path.home().joinpath("Downloads"))
})
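The snippet above starts a regular (non-headless) Chrome. A minimal sketch of how the same DevTools command might be combined with the `--headless=new` flag from the other answers follows. It assumes a Selenium 4 installation that can locate ChromeDriver on its own; the function names `download_behavior_params` and `start_headless_chrome` are illustrative.

```python
import os


def download_behavior_params(directory):
    """Parameters for the Page.setDownloadBehavior DevTools command."""
    return {"behavior": "allow", "downloadPath": os.path.abspath(directory)}


def start_headless_chrome(directory):
    """Start headless Chrome with downloads allowed into `directory`."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    os.makedirs(directory, exist_ok=True)
    options = Options()
    options.add_argument("--headless=new")
    # Assumption: Selenium 4 style constructor, ChromeDriver found automatically.
    driver = webdriver.Chrome(options=options)
    driver.execute_cdp_cmd("Page.setDownloadBehavior", download_behavior_params(directory))
    return driver
```

Using an absolute path for `downloadPath` avoids depending on the working directory of the browser process.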
£冰雨忧蓝° 2025-01-29 15:27:08

I don't think you should be using the browser to download content; leave that to Chrome developers and testers.

I believe you should instead get the href attribute of the element you want to download and fetch it using the requests library.

If your site requires authentication, you could fetch the cookies from the browser instance and pass them to requests.Session.
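A minimal sketch of that idea is shown here. To stay free of third-party packages it uses the standard library's `urllib` rather than requests; with requests, the same cookies could be copied into a `requests.Session` using `session.cookies.set(name, value)`. The function names, the CSS selector, and the output file name are illustrative assumptions.

```python
import urllib.request


def cookie_header(selenium_cookies):
    """Join cookies from driver.get_cookies() into a Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def download_with_browser_cookies(url, selenium_cookies, destination):
    """Fetch `url` outside the browser, reusing the browser's session cookies."""
    headers = {}
    if selenium_cookies:
        headers["Cookie"] = cookie_header(selenium_cookies)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())


# Hypothetical usage with an existing Selenium driver:
#   link = driver.find_element(By.CSS_SELECTOR, "a.download")  # selector is an assumption
#   download_with_browser_cookies(link.get_attribute("href"), driver.get_cookies(), "data.csv")
```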
