Can we pass Pydantic models (BaseModel) directly to model.predict() when using FastAPI, and if not, why?

Posted on 2025-01-21 06:45:16

I'm using a Pydantic model (BaseModel) with FastAPI. I convert the input into a dictionary and then into a Pandas DataFrame, in order to pass it to the model.predict() function for Machine Learning predictions, as shown below:

from typing import List

import pandas as pd
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# NOTE: `classifier` is assumed to be a trained model loaded elsewhere.


class Inputs(BaseModel):
    f1: float
    f2: float
    f3: str


@app.post('/predict')
def predict(features: List[Inputs]):
    output = []

    # loop over the list of input features
    for data in features:
        result = {}

        # convert the item into a dict and then into a one-row DataFrame
        data = data.dict()
        df = pd.DataFrame([data])

        # get the prediction
        prediction = classifier.predict(df)[0]

        # get the probability of the most likely class
        probability = classifier.predict_proba(df).max()

        # assign to dictionary
        result["prediction"] = prediction
        result["probability"] = probability

        # append dictionary to list (many outputs)
        output.append(result)

    return output

It works fine, but I'm not sure whether it's optimized or the right way to do it, since I convert the input twice to get the predictions. I'm also not sure whether it will be fast when there is a huge number of inputs. Are there any improvements to this? Is there a way (even one that doesn't use Pydantic models) to work with the data directly and avoid the conversions and the loop?

Comments (1)

苍暮颜 2025-01-28 06:45:16

First, you should use more descriptive names for your variables/objects. For example:

@app.post('/predict')
def predict(inputs: List[Inputs]):
    for i in inputs:
        ...  # per-item processing goes here

You cannot pass the Pydantic model directly to the predict() function, as it expects array-like data (for example, a list of lists, a NumPy array or a DataFrame), not a Pydantic model. Available options are listed below.

Option 1

You could use the following (the i below represents an item from the inputs list):

# Getting prediction
prediction = model.predict([[i.f1, i.f2, i.f3]])[0]

# Getting probability
probability = model.predict_proba([[i.f1, i.f2, i.f3]])

Option 2

You could use the __dict__ attribute to get the values of all attributes in the model and convert them into a list:

# Getting prediction
prediction = model.predict([list(i.__dict__.values())])[0]

# Getting probability
probability = model.predict_proba([list(i.__dict__.values())])

or, preferably, use Pydantic's dict() method (Note: in Pydantic V2, dict() is deprecated in favour of model_dump()):

# Getting prediction
prediction = model.predict([list(i.dict().values())])[0]

# Getting probability
probability = model.predict_proba([list(i.dict().values())])
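Since the method name differs between Pydantic V1 and V2, a small helper can hide the difference. The following is only a minimal sketch: `to_dict` and `to_row` are hypothetical names introduced here for illustration, and the `Inputs` model is the one from the question. Note that the resulting list follows the order in which the fields are declared on the model, so that order has to match the feature order the model was trained with.

```python
from pydantic import BaseModel


class Inputs(BaseModel):
    f1: float
    f2: float
    f3: str


def to_dict(item: BaseModel) -> dict:
    """Return the model's fields as a dict on both Pydantic V1 and V2 (hypothetical helper)."""
    if hasattr(item, "model_dump"):  # Pydantic V2
        return item.model_dump()
    return item.dict()  # Pydantic V1


def to_row(item: BaseModel) -> list:
    """Return the field values in declaration order (f1, f2, f3 here)."""
    return list(to_dict(item).values())


example = Inputs(f1=1.5, f2=2.0, f3="a")
row = to_row(example)  # can be passed as model.predict([row])
```

With such a helper, the calls above would become model.predict([to_row(i)])[0] and model.predict_proba([to_row(i)]).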

Option 3

Use a Pandas DataFrame as follows (again, in Pydantic V2 dict() has been replaced by model_dump()):

import pandas as pd

# Converting input data into a Pandas DataFrame
df = pd.DataFrame([i.dict()])

# Getting prediction
prediction = model.predict(df)[0]

# Getting probability
probability = model.predict_proba(df)

Option 4

You could avoid looping over individual items and calling the predict() function multiple times, by using, instead, the below (once again, in Pydantic V2, replace dict() with model_dump()):

import pandas as pd

df = pd.DataFrame([i.dict() for i in inputs])
prediction = model.predict(df)
probability = model.predict_proba(df)
return {'prediction': prediction.tolist(), 'probability': probability.tolist()}

or (in case you would rather not use a Pandas DataFrame):

inputs_list = [list(i.dict().values()) for i in inputs]
prediction = model.predict(inputs_list)
probability = model.predict_proba(inputs_list)
return {'prediction': prediction.tolist(), 'probability': probability.tolist()}
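If you still want the response shape from the question (one dictionary per input, holding the prediction and the highest class probability), the batched approach can be combined with a per-row maximum. What follows is a rough sketch rather than a definitive implementation: `StubClassifier` is a made-up stand-in that merely mimics the predict() / predict_proba() interface with a toy rule, and `predict_batch` and `as_dict` are hypothetical helper names. In a real app you would pass your trained model instead of the stub and call `predict_batch` from the endpoint.

```python
from typing import List

import numpy as np
import pandas as pd
from pydantic import BaseModel


class Inputs(BaseModel):
    f1: float
    f2: float
    f3: str


class StubClassifier:
    """Hypothetical stand-in exposing predict() and predict_proba()."""

    classes_ = np.array(["neg", "pos"])

    def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
        # Toy rule: the probability of "pos" grows with f1 + f2.
        score = 1.0 / (1.0 + np.exp(-(X["f1"] + X["f2"]).to_numpy()))
        return np.column_stack([1.0 - score, score])

    def predict(self, X: pd.DataFrame) -> np.ndarray:
        return self.classes_[self.predict_proba(X).argmax(axis=1)]


def as_dict(item: BaseModel) -> dict:
    # model_dump() on Pydantic V2, dict() on Pydantic V1.
    return item.model_dump() if hasattr(item, "model_dump") else item.dict()


def predict_batch(model, inputs: List[Inputs]) -> List[dict]:
    # Build one DataFrame for the whole request instead of one per item.
    df = pd.DataFrame([as_dict(i) for i in inputs])
    predictions = model.predict(df)
    # Highest class probability per row (the batched form of .max() on one row).
    probabilities = model.predict_proba(df).max(axis=1)
    # .tolist() turns NumPy scalars into plain Python values for the JSON response.
    return [
        {"prediction": p, "probability": q}
        for p, q in zip(predictions.tolist(), probabilities.tolist())
    ]


results = predict_batch(
    StubClassifier(),
    [Inputs(f1=1.0, f2=1.0, f3="a"), Inputs(f1=-2.0, f2=-1.0, f3="b")],
)
```

The sketch does not handle an empty inputs list; an endpoint would need to return early in that case before building the DataFrame.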