Unable to convert array of bytes/strings into decimal numbers with dtype='numeric'

Posted on 2025-02-06 05:01:27

I am working on a classification model that uses logistic regression. The result I get is: "Prediction LR": "['AnxiousPersonalityDisorder']". Now I need to calculate the probability of this result, and I am running into problems.
Here is the code, in case anyone has an idea of the source of the problem.

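# code in colab notebook
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# df is assumed to be a pandas DataFrame with "text" and "label" columns
x = df.text.values.tolist()
y = df.label.values.tolist()

# Turn each text into a sparse vector of token counts
vectorizer = CountVectorizer()
data1 = vectorizer.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(data1, y, test_size=0.2, random_state=32, stratify=y)

# Training model
lr = LogisticRegression(random_state=40)
lr.fit(x_train, y_train)

y_pred_train = lr.predict(x_train)
y_pred_test = lr.predict(x_test)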

I use FastAPI for the deployment:

class EmotionAdoPrediction():
    def __init__(self):
        # Note: for model_path you should make full path (see docker volume)
        self.vector = load("C:/Users/aabid/PycharmProjects/emotion-detection-multilabels/model1/Lrvector.joblib")
        self.model = load("C:/Users/aabid/PycharmProjects/emotion-detection-multilabels/model1/LRClassifier.joblib")
        self.classes_names = {0: ["angry"], 1: ["joy"], 2: ["sadness"], 3: ["fear"]}

    def predict(self, text):
        text = self.data_cleaning(text)
        text_clean = self.standardization(text)
        probs = self.model.predict([[text_clean]])[:, 1]
        proba = np.max(probs[0])
        class_ind = np.argmax(probs[0])

        return self.classes_names[class_ind], proba
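
The error in the title is consistent with the `predict` method above handing the raw cleaned string to the model, while the model was trained on the count vectors produced by `CountVectorizer`. The following is only a minimal sketch of that mismatch, not the asker's actual pipeline: the toy texts and labels are invented, and the variable names `new_text` and `raw_input_failed` are hypothetical. The exact wording of the `ValueError` depends on the scikit-learn version.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data, invented purely for illustration (not the asker's dataset).
texts = [
    "i am so happy today",
    "what a joyful day",
    "i am very angry",
    "this makes me furious",
]
labels = ["joy", "joy", "angry", "angry"]

vectorizer = CountVectorizer()
model = LogisticRegression(random_state=40)
model.fit(vectorizer.fit_transform(texts), labels)

new_text = "i am happy"

# Passing the raw string, as the deployed predict() does, fails because the
# model expects numeric count vectors rather than text.
try:
    model.predict([[new_text]])
    raw_input_failed = False
except ValueError:
    raw_input_failed = True

# Transforming with the same fitted vectorizer yields the expected input:
# one row with one column per vocabulary term.
features = vectorizer.transform([new_text])
predicted_label = model.predict(features)[0]
print(raw_input_failed, predicted_label)
```

In the deployed class this would presumably mean calling `self.vector.transform([text_clean])` before the model, assuming `Lrvector.joblib` holds the fitted vectorizer.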


完美的未来在梦里 2025-02-13 05:01:27

I suppose that you use the scikit-learn package for logistic regression. If that is the case, then you can calculate the probability of the result. The LogisticRegression class provides the predict_log_proba(X) and predict_proba(X) methods.

According to the documentation:

predict_log_proba(X): Predict logarithm of probability estimates.
predict_proba(X): For a multi_class problem, if multi_class is set to be “multinomial” the softmax function is used to find the predicted probability of each class.

Use the predict_proba function, because you have a multi-class problem.

y_prob = lr.predict_proba(x_test)

Edit: Also encode the labels from categorical values to numerical values. Use pandas.get_dummies to convert the categorical values.
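
To make the suggestion concrete, here is a minimal sketch of getting both the predicted label and its probability for a single text. It assumes a setup like the one in the question (a fitted `CountVectorizer` plus a `LogisticRegression`); the toy data and the helper name `predict_with_probability` are invented for illustration. `predict_proba` returns one row per sample and one column per class, with the columns in the order of `model.classes_`, so the label is looked up there rather than in a hand-written index dictionary.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data, invented purely for illustration (not the asker's dataset).
texts = [
    "i am so happy today", "what a joyful day",
    "i am very angry", "this makes me furious",
    "i feel scared", "this is frightening",
]
# String labels are used as-is here; model.classes_ stores them sorted.
labels = ["joy", "joy", "angry", "angry", "fear", "fear"]

vectorizer = CountVectorizer()
model = LogisticRegression(random_state=40)
model.fit(vectorizer.fit_transform(texts), labels)


def predict_with_probability(text):
    """Hypothetical helper: return (label, probability) for one text."""
    features = vectorizer.transform([text])    # shape (1, n_features)
    probs = model.predict_proba(features)[0]   # one probability per class
    best = int(np.argmax(probs))               # index of the most likely class
    return model.classes_[best], float(probs[best])


label, proba = predict_with_probability("i am scared")
print(label, round(proba, 3))
```

Note that indexing the result with `[:, 1]`, as in the question, would keep only the column for the second class instead of the whole probability row.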