TorchServe container worker downloads a new serialized file at startup

Posted 2025-02-10 20:26:18


I am trying to build a container running TorchServe with the pretrained Faster R-CNN model for object detection in an all-in-one Dockerfile, based on this example:
https://github.com/pytorch/serve/tree/master/examples/object_detector/fast-rcnn

Dockerfile:

FROM pytorch/torchserve:latest

COPY ["config.properties", "model.py", "fasterrcnn_resnet50_fpn_coco-258fb6c6.pth", "index_to_name.json", "/home/model-server/"]

RUN torch-model-archiver \
  --model-name=fastrcnn \
  --version=1.0 \
  --model-file=/home/model-server/model.py \
  --serialized-file=/home/model-server/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth \
  --handler=object_detector \
  --extra-files=/home/model-server/index_to_name.json \
  --export-path=/home/model-server/model-store

RUN rm model.py fasterrcnn_resnet50_fpn_coco-258fb6c6.pth index_to_name.json

CMD ["torchserve", \
     "--start", \
     "--model-store", "model-store", \
     "--ts-config", "config.properties", \
     "--models", "fastrcnn=fastrcnn.mar"]

model.py and index_to_name.json are taken from the example (https://github.com/pytorch/serve/tree/master/examples/object_detector/fast-rcnn) and placed in the root directory.

fasterrcnn_resnet50_fpn_coco-258fb6c6.pth can be downloaded from https://download.pytorch.org/models/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth

config.properties:

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
workflow_store=/home/model-server/wf-store
default_workers_per_model=1

Building the container with:
docker build --tag aio-fastrcnn .
runs fine (aio for all-in-one).

Running the container with:
docker run --rm -it -p 8080:8080 -p 8081:8081 --name fastrcnn aio-fastrcnn:latest
also runs fine, but during start-up the worker downloads a different serialized model.

model_log.log:

model-server@15838dd41e69:~$ cat logs/model_log.log
2022-06-26T13:45:07,171 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000
2022-06-26T13:45:07,172 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - [PID]35
2022-06-26T13:45:07,173 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - Torch worker started.
2022-06-26T13:45:07,173 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - Python runtime: 3.8.0
2022-06-26T13:45:07,191 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2022-06-26T13:45:07,242 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - model_name: fastrcnn, batchSize: 1
2022-06-26T13:45:07,809 [INFO ] W-9000-fastrcnn_1.0-stdout MODEL_LOG - generated new fontManager
2022-06-26T13:45:08,757 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG - Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /home/model-server/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
2022-06-26T13:45:08,919 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -
2022-06-26T13:45:09,021 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -   0%|          | 0.00/97.8M [00:00<?, ?B/s]
2022-06-26T13:45:09,125 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -   0%|          | 280k/97.8M [00:00<00:36, 2.81MB/s]
2022-06-26T13:45:09,230 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -   1%|          | 592k/97.8M [00:00<00:34, 2.97MB/s]
2022-06-26T13:45:09,341 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -   1%|          | 960k/97.8M [00:00<00:31, 3.26MB/s]
...
2022-06-26T13:45:45,230 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -  99%|█████████▉| 96.7M/97.8M [00:36<00:00, 2.68MB/s]
2022-06-26T13:45:45,344 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -  99%|█████████▉| 97.0M/97.8M [00:36<00:00, 2.67MB/s]
2022-06-26T13:45:45,449 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG -  99%|█████████▉| 97.2M/97.8M [00:36<00:00, 2.63MB/s]
2022-06-26T13:45:45,547 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG - 100%|█████████▉| 97.5M/97.8M [00:36<00:00, 2.68MB/s]
2022-06-26T13:45:45,548 [WARN ] W-9000-fastrcnn_1.0-stderr MODEL_LOG - 100%|██████████| 97.8M/97.8M [00:36<00:00, 2.80MB/s]

The file being downloaded is https://download.pytorch.org/models/resnet50-0676ba61.pth

Once the file is downloaded, the server runs fine, responds to pings, and serves the correct model.

Ping:

$ curl http://localhost:8080/ping -UseBasicParsing


StatusCode        : 200
StatusDescription : OK
Content           : {
                      "status": "Healthy"
                    }
...

Model:

$ curl http://localhost:8081/models -UseBasicParsing


StatusCode        : 200
StatusDescription : OK
Content           : {
                      "models": [
                        {
                          "modelName": "fastrcnn",
                          "modelUrl": "fastrcnn.mar"
                        }
                      ]
                    }
...

I don't understand why this new serialized file is downloaded. I thought the idea behind the torch-model-archiver was to combine all necessary files into a single one. Have I fundamentally misunderstood something about how torchserve or docker works?
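The intuition about the archiver is right: a .mar produced by torch-model-archiver is an ordinary zip archive bundling the model file, serialized weights, and extra files. One way to confirm what actually got packed into fastrcnn.mar (path taken from the Dockerfile above) is to list its contents, for example with Python's zipfile module:

```python
import zipfile

def list_mar_contents(mar_path):
    """A .mar file is a standard zip archive; return the file names inside it."""
    with zipfile.ZipFile(mar_path) as mar:
        return mar.namelist()

# e.g. inside the container:
# list_mar_contents("/home/model-server/model-store/fastrcnn.mar")
```

The listing should show model.py, the .pth weights, index_to_name.json, and the manifest the archiver generates, which confirms the archive itself is complete; the download at startup therefore comes from somewhere else.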

startup_log.log:

model-server@15838dd41e69:~$ cat logs/config/20220626134506612-startup.cfg
#Saving snapshot
#Sun Jun 26 13:45:06 GMT 2022
inference_address=http\://0.0.0.0\:8080
default_workers_per_model=1
load_models=fastrcnn\=fastrcnn.mar
model_store=model-store
number_of_gpu=0
job_queue_size=1000
python=/home/venv/bin/python
model_snapshot={\n  "name"\: "20220626134506612-startup.cfg",\n  "modelCount"\: 1,\n  "created"\: 1656251106614,\n  "models"\: {\n    "fastrcnn"\: {\n      "1.0"\: {\n        "defaultVersion"\: true,\n        "marName"\: "fastrcnn.mar",\n        "minWorkers"\: 1,\n        "maxWorkers"\: 1,\n        "batchSize"\: 1,\n        "maxBatchDelay"\: 100,\n        "responseTimeout"\: 120\n      }\n    }\n  }\n}
tsConfigFile=config.properties
version=0.6.0
workflow_store=model-store
number_of_netty_threads=32
management_address=http\://0.0.0.0\:8081
metrics_address=http\://0.0.0.0\:8082

I have also tried these steps replacing fasterrcnn_resnet50_fpn_coco-258fb6c6.pth with resnet50-0676ba61.pth in all necessary places, but the worker still downloads resnet50-0676ba61.pth during startup.


resnet50-0676ba61.pth is needed during model construction: PyTorch checks whether the file is in the model_dir and downloads it if it is not. I updated the Dockerfile to copy resnet50-0676ba61.pth into the torchserve container and included it as an extra file in the torch-model-archiver command.
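Note that the download target in model_log.log is /home/model-server/.cache/torch/hub/checkpoints/, not the directory the .mar is extracted into. Weights fetched through torch.hub are cached under a directory derived from the TORCH_HOME environment variable, falling back to ~/.cache/torch, which would explain why packing the file as an extra file does not prevent the download. A small sketch of that lookup logic, mirroring torch.hub.get_dir() as I understand it:

```python
import os

def hub_checkpoint_dir():
    """Approximate where torch.hub caches downloaded weights:
    $TORCH_HOME if set, else $XDG_CACHE_HOME/torch, else ~/.cache/torch,
    with hub/checkpoints appended (mirrors torch.hub.get_dir())."""
    torch_home = os.environ.get("TORCH_HOME")
    if torch_home is None:
        cache_home = os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache"))
        torch_home = os.path.join(cache_home, "torch")
    return os.path.join(torch_home, "hub", "checkpoints")

# For the model-server user this resolves to
# /home/model-server/.cache/torch/hub/checkpoints, matching the log above.
```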
Dockerfile:

FROM pytorch/torchserve:latest

COPY ["config.properties", \
      "model.py", \
      "fasterrcnn_resnet50_fpn_coco-258fb6c6.pth", \
      "resnet50-0676ba61.pth", \
      "index_to_name.json", \
      "/home/model-server/"]

RUN torch-model-archiver \
  --model-name=fastrcnn \
  --version=1.0 \
  --model-file=/home/model-server/model.py \
  --serialized-file=/home/model-server/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth \
  --handler=object_detector \
  --extra-files=/home/model-server/resnet50-0676ba61.pth,/home/model-server/index_to_name.json \
  --export-path=/home/model-server/model-store

CMD ["torchserve", \
     "--start", \
     "--model-store", "model-store", \
     "--ts-config", "config.properties", \
     "--models", "fastrcnn=fastrcnn.mar"]

According to https://github.com/pytorch/serve/issues/633#issuecomment-677759331, resnet50-0676ba61.pth should be accessible, but it still gets downloaded on every startup.
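One workaround, sketched here under the assumption that the cache path shown in model_log.log is where the worker looks, is to pre-seed torch.hub's checkpoint cache inside the image so the existence check succeeds at startup instead of triggering a download:

```dockerfile
FROM pytorch/torchserve:latest

# Pre-seed torch.hub's checkpoint cache so the backbone weights are found
# at startup. Destination path taken from the download target in model_log.log.
COPY resnet50-0676ba61.pth \
     /home/model-server/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
```

An alternative with the same effect would be pointing TORCH_HOME at a directory baked into the image (via ENV) and copying the weights under its hub/checkpoints subdirectory; either way the goal is that the cached file, not the .mar contents, satisfies the lookup.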
