I am trying to use cache in my .gitlab-ci.yml file, but the time only increases (testing by adding blank lines). I want to cache python packages I install with pip.
Here is the stage where I install and use these packages (the other stages use Docker):
image: python:3.8-slim-buster

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip

stages:
  - lint
  - test
  - build
  - deploy

test-job:
  stage: test
  before_script:
    - apt-get update
    - apt-get install -y --no-install-recommends gcc
    - apt install -y default-libmysqlclient-dev
    - pip3 install -r requirements.txt
  script:
    - pytest tests/test.py
After running this, the pipeline time just increases with each run.
I was following the steps from the GitLab documentation: https://docs.gitlab.com/ee/ci/caching/#cache-python-dependencies
I am not using venv, though, since it works without it.
I am still not sure why the PIP_CACHE_DIR variable is needed if it is not referenced anywhere else, but I followed the documentation.
What is the correct way to cache python dependencies? I would prefer not to use venv.
Comments (2)
PIP_CACHE_DIR is a pip feature that can be used to set the cache directory. The second answer to this question explains it.
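To make the connection concrete, here is a minimal illustration (based on the configuration in the question, not a drop-in fix): pip reads PIP_CACHE_DIR from the environment and writes downloaded packages there, so pointing it somewhere under $CI_PROJECT_DIR is what makes the directory cacheable by the runner at all. The pip3 cache dir command (available in recent pip versions) can be used inside the job to confirm which directory pip is actually using:

# Only the pieces that involve pip's cache directory are shown here.
variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"   # pip reads this env var for its download cache

test-job:
  stage: test
  script:
    - pip3 cache dir                            # prints the directory pip will use; should match the variable
    - pip3 install -r requirements.txt          # packages downloaded here are written to that directory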
There may be some disagreement on this, but I think that for something like pip packages or node modules, it is quicker to download them fresh for each pipeline.
When the packages are cached by GitLab (via the cache: keyword), the cache GitLab creates gets zipped and stored somewhere (exactly where depends on the runner configuration). This requires zipping and uploading the cache. Then, when another pipeline is created, the cache needs to be downloaded and unpacked. If using a cache is slowing down job execution, it might make sense to just remove the cache.
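If you do keep the cache, one way to limit that zip/upload/download overhead is to key the cache on requirements.txt and only upload it from jobs that are actually expected to change it. A rough sketch using the documented cache:key:files and cache:policy keywords:

test-job:
  stage: test
  cache:
    key:
      files:
        - requirements.txt     # the cache is reused until requirements.txt changes
    paths:
      - .cache/pip
    policy: pull-push          # use "pull" on jobs that should only read the cache, never re-upload it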
Also: the GitLab documentation describes that the cache should be set on the job; it cannot be set globally for the pipeline. This may be why your configuration is not working.
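For completeness, a sketch of what the job from the question could look like with the cache (and the variable) defined on the job itself; the apt steps are merged into one line but otherwise unchanged:

test-job:
  stage: test
  variables:
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  cache:                       # defined on the job rather than at the top level
    paths:
      - .cache/pip
  before_script:
    - apt-get update
    - apt-get install -y --no-install-recommends gcc default-libmysqlclient-dev   # build deps for mysqlclient
    - pip3 install -r requirements.txt
  script:
    - pytest tests/test.py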