Accessing a model already loaded in GPU memory without re-initializing it on every request
Each time I load a transformer model into the GPU it takes ~60 seconds.
So, I want to access the model already in the GPU across Flask requests without re-initializing it each time.
So, I tried to save the model in a BaseManager and then access it:
from multiprocessing.managers import BaseManager
from transformers import pipeline  # MODEL_NAME is defined elsewhere

manager = BaseManager(('', 37844), b'password')
manager.connect()
generator = pipeline('text-generation', model=MODEL_NAME, device=1)
manager.register('generator', generator)
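For reference, a minimal sketch of how a BaseManager is usually split: a dedicated server process loads the model once and registers a callable on the manager class, and the Flask workers connect as clients. The names below (`GeneratorService`, `serve`, the placeholder pipe) are illustrative, not from the question; the commented-out `pipeline(...)` line marks where the real, expensive load would go.

```python
from multiprocessing.managers import BaseManager

class GeneratorService:
    """Stand-in for the real transformers pipeline (illustrative only)."""

    def __init__(self):
        # In the real app, the expensive load happens exactly once here, e.g.:
        # self.pipe = pipeline('text-generation', model=MODEL_NAME, device=1)
        self.pipe = lambda prompt: [{"generated_text": prompt + " ..."}]

    def generate(self, prompt):
        return self.pipe(prompt)

_service = GeneratorService()  # created once, when the server process starts

def _get_service():
    return _service

class ModelManager(BaseManager):
    pass

# register() must be called on the class, before the server starts; clients
# then call manager.generator() to obtain a proxy to the shared service.
ModelManager.register("generator", callable=_get_service)

def serve(address=("", 37844), authkey=b"password"):
    """Run the manager server; blocks forever."""
    manager = ModelManager(address, authkey)
    manager.get_server().serve_forever()
```

A Flask worker would then connect with its own `ModelManager(('127.0.0.1', 37844), b'password')`, call `connect()`, and invoke `generator().generate(prompt)` per request, so the model stays resident in the server process.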
but while trying to access the model
with generator = manager.generator()
I get the following error:
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
and while digging further into the error, it suggested using torch's multiprocessing module instead, but that doesn't have a BaseManager:
from torch.multiprocessing import Pool, Process, set_start_method
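The fix the traceback asks for can also be applied with the stdlib module, since torch.multiprocessing is a drop-in wrapper around it; a minimal sketch:

```python
import multiprocessing as mp

# CUDA contexts cannot survive fork(), so any process that touches the GPU
# must be created with the 'spawn' start method. torch.multiprocessing wraps
# this same API, so set_start_method works identically with either import.
# It must run once, before any Process or Pool is created.
mp.set_start_method("spawn", force=True)

print(mp.get_start_method())  # spawn
```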
So, how does one efficiently use Models across requests in Flask?