Unable to run script/package locally on Docker
I'm creating a webscraper package and am in the process of uploading to Docker. Whilst I can build to the local Docker repository, I cannot run the script without the following errors appearing:
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Here is what I have in the main script so far to try and get it running on Docker:
def __init__(self, url: str = " url goes here ",
             options: Optional[ChromeOptions] = None):  # default url
    options = ChromeOptions()
    self.driver = Chrome(ChromeDriverManager().install(), options=options)
    options.add_argument("--no-sandbox")
    options.binary_location = '/usr/bin/google-chrome'
    options.add_argument("--headless")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-setuid-sandbox")
    options.add_argument("--remote-debugging-port=9222")
    options.add_argument("start-maximized")
    options.add_argument('--disable-gpu')
    options.add_argument("window-size=1920,1080")
From other posts I note that for some it was as simple as changing the order of the options.add_argument calls, which I have tried, but it doesn't work for me.
I also have the following modules within the same script:
import os
import selenium
from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager #installs Chrome webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException
from selenium.webdriver import ChromeOptions
from selenium.webdriver.chrome.service import Service
from typing import Optional
import time
import boto3
from sqlalchemy import create_engine
import urllib.request
import tempfile #temporary directory - to be removed after all operations have finished
In my Dockerfile:
FROM python:3.8
#Set Chrome Repo
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list' \
    && apt-get -y update \
    #Install Chrome
    && apt-get install -y google-chrome-stable \
    && wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip \
    && apt-get install -yqq unzip \
    && unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
COPY . .
RUN pip install -r requirements.txt
#When we run the container, this will be the command run
CMD ["python", "scraper/webscraper.py"]
In case it matters, I am using Docker and VS Code on Windows.
Comments (1)
I've realised that by changing the order of the options.add_argument calls and the self.driver assignment, the script runs just fine. The driver was being created first, before any options had been added, so Chrome started without them. It should be the other way around: set all the options, then create the driver.
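The code that accompanied this comment was not preserved, so what follows is only a minimal sketch of the reordering it describes. It assumes Selenium 4 (where the driver path is passed through Service) and webdriver_manager as in the question. The class name WebScraper and the CHROME_FLAGS list are names introduced here for illustration; the flags themselves are copied from the question.

```python
from typing import List

# Flags copied verbatim from the question.
CHROME_FLAGS: List[str] = [
    "--no-sandbox",
    "--headless",
    "--disable-dev-shm-usage",
    "--disable-setuid-sandbox",
    "--remote-debugging-port=9222",
    "start-maximized",
    "--disable-gpu",
    "window-size=1920,1080",
]


class WebScraper:  # hypothetical name; the question only shows __init__
    def __init__(self, url: str = " url goes here ", options=None):
        # Imported here so this module loads even without Selenium installed;
        # top-level imports, as in the question, work just as well.
        from selenium.webdriver import Chrome, ChromeOptions
        from selenium.webdriver.chrome.service import Service
        from webdriver_manager.chrome import ChromeDriverManager

        # 1. Configure the options first. Unlike the question's code, options
        #    passed in by the caller are used as given (an assumption about
        #    what the parameter was meant for).
        if options is None:
            options = ChromeOptions()
            options.binary_location = "/usr/bin/google-chrome"
            for flag in CHROME_FLAGS:
                options.add_argument(flag)

        # 2. Only then create the driver, so Chrome is launched with the flags.
        service = Service(ChromeDriverManager().install())
        self.driver = Chrome(service=service, options=options)
        self.url = url
```

The only change that matters for the error in the question is that every add_argument call now runs before Chrome(...) is constructed. Arguments added to the options object after the driver exists have no effect on the browser that has already been started.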