ModuleNotFoundError when trying to unpickle a model in a Flask application

Posted on 2025-01-10 18:21:16

Python version: 3.6.9

I've used pickle to dump a machine learning model into a file, and when I try to run a prediction on it using Flask, it fails with ModuleNotFoundError: No module named 'predictors'. How can I fix this error so that it recognizes my model, whether I try to run a prediction via Flask or via the Python command (e.g. python predict_edu.py)?

Here is my file structure:

 - video_discovery
   - __init__.py
   - data_science
     - model
     - __init__.py
     - predict_edu.py
     - predictors.py
     - train_model.py

Here's my predict_edu.py file:

import pickle

with open('model', 'rb') as f:
    bow_model = pickle.load(f)

Here's my predictors.py file:

from sklearn.base import TransformerMixin

# Basic function to clean the text
def clean_text(text):
    # Removing spaces and converting text into lowercase
    return text.strip().lower()

# Custom transformer using spaCy
class predictor_transformer(TransformerMixin):
    def transform(self, X, **transform_params):
        # Cleaning Text
        return [clean_text(text) for text in X]

    def fit(self, X, y=None, **fit_params):
        return self

    def get_params(self, deep=True):
        return {}

Here's how I train my model:

python data_science/train_model.py

Here's my train_model.py file:

import pickle

from sklearn.pipeline import Pipeline

from predictors import predictor_transformer

# pipeline = Pipeline([("cleaner", predictor_transformer()), ('vectorizer', bow_vector), ('classifier', classifier_18p)])
pipeline = Pipeline([("cleaner", predictor_transformer())])

with open('model', 'wb') as f:
    pickle.dump(pipeline, f)

My Flask app is in: video_discovery/__init__.py

Here's how I run my Flask app:

FLASK_ENV=development FLASK_APP=video_discovery flask run

I believe the issue may be occurring because I'm training the model by running the Python script directly instead of using Flask, so there might be some namespace issues, but I'm not sure how to fix this. It takes a while to train my model, so I can't exactly wait on an HTTP request.

What am I missing that might fix this issue?

Comments (2)

〃安静 2025-01-17 18:21:16

It seems a bit strange that you get that error when executing predict_edu.py, as it is in the same directory as predictors.py; an absolute import such as from predictors import predictor_transformer (without the dot . operator) should therefore normally work as expected. However, if the error persists, below are a few options that you could try.

Option 1

You could add the directory that contains predictors.py to sys.path (Python's module search path, not the system PATH variable) before attempting to import the module. This should work fine for smaller projects.

import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent))
from predictors import predictor_transformer
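
As a minimal sketch of why this helps, the snippet below writes a throwaway module into a temporary directory and shows that it becomes importable once that directory is on sys.path. The module name demo_predictors and its contents are hypothetical stand-ins for predictors.py. In the Flask process the same mechanism would make the top-level name predictors importable, which is the name the pickle file refers to.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as module_dir:
    # Hypothetical stand-in for predictors.py, written to a temp directory.
    Path(module_dir, "demo_predictors.py").write_text(
        "def clean_text(text):\n"
        "    return text.strip().lower()\n"
    )

    # Put the directory first on the module search path, as in Option 1.
    sys.path.insert(0, module_dir)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module("demo_predictors")
        cleaned = demo.clean_text("  Hello World ")
    finally:
        # Undo the path change so the sketch has no lasting side effects.
        sys.path.remove(module_dir)

print(cleaned)
```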

Option 2

Use relative imports, e.g., from .predictors import ..., and make sure you run the script from the parent directory of your package, as shown below. The -m option makes Python "search sys.path for the named module and execute its contents as the __main__ module", so the file runs as part of its package rather than as a standalone script, which is what lets relative imports resolve.

python -m video_discovery.data_science.predict_edu
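
The difference between the two ways of launching a file can be checked with a small experiment. The sketch below builds a throwaway package (the names pkg, helper and main are hypothetical) whose main.py uses a relative import, then runs it once with -m from the package's parent directory and once as a plain script.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical package layout: pkg/__init__.py, pkg/helper.py, pkg/main.py
    pkg = Path(root, "pkg")
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "helper.py").write_text("VALUE = 'relative import worked'\n")
    (pkg / "main.py").write_text("from .helper import VALUE\nprint(VALUE)\n")

    run_kwargs = dict(
        cwd=root,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Run as a module from the package's parent directory.
    as_module = subprocess.run([sys.executable, "-m", "pkg.main"], **run_kwargs)
    # Run the same file directly as a script.
    as_script = subprocess.run([sys.executable, str(pkg / "main.py")], **run_kwargs)

print("with -m:  ", as_module.returncode, as_module.stdout.strip())
print("as script:", as_script.returncode)
```

Run with -m, the relative import resolves; run as a script, the file has no parent package and the process exits with an import error.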

However, the PEP 8 style guide recommends using absolute imports in general.

Absolute imports are recommended, as they are usually more readable
and tend to be better behaved (or at least give better error messages)
if the import system is incorrectly configured (such as when a
directory inside a package ends up on sys.path)

In certain cases, however, absolute imports can get quite verbose, depending on the complexity of the directory structure, as shown below. On the other hand, "relative imports can be messy, particularly for shared projects where directory structure is likely to change". They are also "not as readable as absolute ones, and it is hard to tell the location of the imported resources".

from package1.subpackage2.subpackage3.subpackage4.module5 import function6

Option 3

Include the directory containing your package directory in PYTHONPATH and use absolute imports instead. PYTHONPATH is used to set the path for user-defined modules, so that they can be directly imported into a Python script. The PYTHONPATH variable is a string containing a list of directories (separated by : on macOS and Linux, ; on Windows) that Python adds to sys.path at startup. The primary use of this variable is to allow users to import modules that have not yet been made into an installable Python package.

For instance, let’s say you have a package named video_discovery (under /Users/my_user/code/video_discovery) and wanted to add the directory /Users/my_user/code to the PYTHONPATH:

On Mac

  1. Open Terminal.app
  2. Open the file ~/.bash_profile in your text editor, e.g. atom ~/.bash_profile (use ~/.zshrc instead if your shell is zsh, the default since macOS 10.15)
  3. Add the following line to the end: export PYTHONPATH="/Users/my_user/code"
  4. Save the file.
  5. Close Terminal.app
  6. Start Terminal.app again, to read in the new settings, and type echo $PYTHONPATH. It should show something like /Users/my_user/code.

On Linux

  1. Open your favorite terminal program
  2. Open the file ~/.bashrc in your text editor, e.g. atom ~/.bashrc
  3. Add the following line to the end: export PYTHONPATH=/home/my_user/code
  4. Save the file.
  5. Close your terminal application.
  6. Start your terminal application again, to read in the new settings, and type echo $PYTHONPATH. It should show something like /home/my_user/code.

On Windows

  1. Open This PC (or Computer), right-click inside and select Properties.
  2. From the computer properties dialog, select Advanced system settings on the left.
  3. From the advanced system settings dialog, choose the Environment variables button.
  4. In the Environment variables dialog, click the New button in the top half of the dialog to make a new user variable.
  5. Give the variable name as PYTHONPATH and, as its value, add the path to your module directory. Choose OK and OK again to save this variable.
  6. Now open a cmd window and type echo %PYTHONPATH% to confirm the environment variable is correctly set. Remember to open a new cmd window to run your Python program, so that it picks up the new settings in PYTHONPATH.
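
Whichever platform you use, you can confirm that a PYTHONPATH entry actually reaches the interpreter. The following is a small sketch that starts a child Python process with PYTHONPATH pointing at a temporary directory (a stand-in for /Users/my_user/code) and checks that the directory shows up in the child's sys.path.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as code_dir:
    # Copy the current environment and set PYTHONPATH for the child only.
    env = dict(os.environ, PYTHONPATH=code_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    child_sys_path = result.stdout.splitlines()

    # Compare resolved paths, since temp directories may sit behind symlinks.
    on_path = any(
        os.path.realpath(entry) == os.path.realpath(code_dir)
        for entry in child_sys_path
        if entry
    )

print("PYTHONPATH entry found in child sys.path:", on_path)
```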

Option 4

Another solution would be to install the package in an editable state (all edits made to the .py files will be automatically included in the installed package). However, the amount of work required to get this to work might make Option 3 a better choice for you.

The contents for the setup.py should be as shown below, and the command for installing the package should be pip install -e . (-e flag stands for "editable" and . stands for "current directory").

from setuptools import setup, find_packages
setup(name='myproject', version='1.0', packages=find_packages())
无需解释 2025-01-17 18:21:16

From https://docs.python.org/3/library/pickle.html:

pickle can save and restore class instances transparently, however the class definition must be importable and live in the same module as when the object was stored.

When you run python data_science/train_model.py and import from predictors, Python imports predictors as a top-level module and predictor_transformer is in that module.

However, when you run a prediction via Flask from the parent folder of video_discovery, predictor_transformer is in the video_discovery.data_science.predictors module.
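
The mismatch can be reproduced without Flask or scikit-learn. The sketch below is an illustration only: it registers an in-memory module named predictors (a stand-in for predictors.py imported as a top-level module during training), pickles an instance of a class defined in it, then removes the module to mimic a process where only video_discovery.data_science.predictors is importable.

```python
import pickle
import sys
import types

# In-memory stand-in for predictors.py imported as a top-level module.
module = types.ModuleType("predictors")
exec(
    "class predictor_transformer:\n"
    "    pass\n",
    module.__dict__,
)
sys.modules["predictors"] = module

payload = pickle.dumps(module.predictor_transformer())

# The pickle does not contain the class itself, only a reference to it:
# the module name and the class name.
print(b"predictors" in payload, b"predictor_transformer" in payload)

# Mimic the Flask process, where no top-level "predictors" module exists.
del sys.modules["predictors"]
error_message = None
try:
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    error_message = str(exc)

print(error_message)
```

Unpickling has to import the module named in the payload, so it fails with the same error as in the question whenever that exact module path cannot be imported.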

Use relative imports and run from a consistent path

train_model.py: Use relative import

# from predictors import predictor_transformer  # -
from .predictors import predictor_transformer   # +

Train model: Run train_model with video_discovery as top-level module (from the parent folder of video_discovery)

# python data_science/train_model.py                # -
python -m video_discovery.data_science.train_model  # +

Run a prediction via a Python command: Run predict_edu with video_discovery as top-level module (from the parent folder of video_discovery)

# python predict_edu.py                             # -
python -m video_discovery.data_science.predict_edu  # +

Run a prediction via Flask: (no change, already run with video_discovery as top-level module)

FLASK_ENV=development FLASK_APP=video_discovery flask run
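
To see the whole approach end to end, here is a rough sketch that recreates a stripped-down version of the layout in a temporary directory, trains with -m, and loads the pickle from a separate process started in the same parent folder. It assumes a plain class in place of the scikit-learn pipeline so that it has no third-party dependencies, and the file contents are simplified stand-ins for the real ones.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Simplified stand-ins for the files in the question (no scikit-learn).
FILES = {
    "video_discovery/__init__.py": "",
    "video_discovery/data_science/__init__.py": "",
    "video_discovery/data_science/predictors.py": (
        "class predictor_transformer:\n"
        "    def transform(self, X):\n"
        "        return [text.strip().lower() for text in X]\n"
    ),
    "video_discovery/data_science/train_model.py": (
        "import pickle\n"
        "from .predictors import predictor_transformer\n"
        "with open('model', 'wb') as f:\n"
        "    pickle.dump(predictor_transformer(), f)\n"
    ),
}

# Loader run in a fresh process, playing the role of the Flask app.
LOADER = (
    "import pickle\n"
    "with open('model', 'rb') as f:\n"
    "    obj = pickle.load(f)\n"
    "print(type(obj).__module__)\n"
)

with tempfile.TemporaryDirectory() as root:
    for relative_path, text in FILES.items():
        path = Path(root, relative_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)

    # Train from the parent folder with video_discovery as top-level package.
    subprocess.run(
        [sys.executable, "-m", "video_discovery.data_science.train_model"],
        cwd=root,
        check=True,
    )
    # Load from the same folder; the package-qualified module is importable.
    result = subprocess.run(
        [sys.executable, "-c", LOADER],
        cwd=root,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    loaded_module = result.stdout.strip()

print(loaded_module)
```

Note that open('model', ...) is relative to the current working directory, so in this sketch the model file is written to, and read from, the parent folder of video_discovery.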