KNN: How do I inverse-transform unseen encoded labels?

Posted 2025-01-13 10:57:14 · 1325 words · 3 views · 0 comments

I am trying to make predictions with KNN, but because the target is a float I have to encode it so that scikit-learn accepts it. This is my approach, and it works: I can train and predict. But the output is, of course, encoded:

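import matplotlib.pyplot as plt
import pandas as pd
from sklearn import preprocessing, utils
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_csv('data.csv', index_col='date', parse_dates=True)

# Features: every column except the target; target: the percentage change
X = df.drop(["predictor_pct_chg"], axis=1).values
y = df["predictor_pct_chg"].values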

X_train, X_test, y_train, y_test = train_test_split(
   X,
   y,
   test_size=0.2,
   shuffle=False,
)

lab_enc = preprocessing.LabelEncoder()
training_scores_encoded = lab_enc.fit_transform(y_train)
print(training_scores_encoded)
print(utils.multiclass.type_of_target(y_train))
print(utils.multiclass.type_of_target(y_train.astype('int')))
print(utils.multiclass.type_of_target(training_scores_encoded))

knn = KNeighborsClassifier()
knn.fit(
   X_train,
   training_scores_encoded,
)

y_pred = knn.predict(X_test)

Training and making a prediction works fine, but now I want to plot the prediction and compare it to my y_test:

y_pred = lab_enc.inverse_transform(y_pred)

plt.plot(y_test, color='red', label='Actual')
plt.plot(y_pred, color='blue', label='Prediction')
plt.xlabel('Time')
plt.ylabel('% Change')
plt.legend()
plt.show()

Now inverse_transform() does not work, because the LabelEncoder has never seen the predictions before. So how can I reverse the encoding? I could apply the LabelEncoder to y_test as well and compare that to y_pred, but that doesn't make sense: I need a useful prediction in the actual unit (here: %). Otherwise I cannot interpret the predictions.

Error:

ValueError: y contains previously unseen labels:
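
For reference, here is a minimal sketch of how a LabelEncoder round trip behaves. It runs on synthetic data, not the asker's data.csv, and the array shapes, class values and n_neighbors=3 are assumptions chosen only for illustration. It suggests that codes predicted by a classifier trained on the encoded labels can be decoded, while encoding a value that was absent from the fitted data raises a ValueError.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in data (hypothetical, not the asker's data.csv).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 3))
y_train = rng.choice([-0.5, 0.0, 0.5, 1.0], size=40)  # float targets treated as classes
X_test = rng.normal(size=(10, 3))

lab_enc = LabelEncoder()
y_train_encoded = lab_enc.fit_transform(y_train)  # floats -> integer codes

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train_encoded)

# The classifier can only predict codes it was trained on,
# so these can be mapped back to the original float values.
y_pred_encoded = knn.predict(X_test)
y_pred = lab_enc.inverse_transform(y_pred_encoded)

# Encoding a value the encoder never saw raises ValueError.
try:
    lab_enc.transform(np.array([2.5]))
    unseen_error = None
except ValueError as exc:
    unseen_error = str(exc)
```

In this sketch every decoded prediction is one of the float values present in y_train, so the decoded output is in the original unit. Note that a classifier set up this way can never predict a value that did not occur in the training targets.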


未蓝澄海的烟 2025-01-20 10:57:14

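My guess at the origin of this ValueError:
Since you have not stratified the data by y in train_test_split, y_test may contain labels that are not present in the training data. So try setting the train_test_split parameter stratify=y.

For a detailed explanation, see the stratification section of the sklearn user guide.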
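
The suggestion above can be sketched as follows, assuming a hypothetical toy dataset with five discrete labels (the arrays and random_state are illustrative only). Two caveats apply: train_test_split supports stratify only with shuffle=True (the default) and raises a ValueError when combined with shuffle=False as in the question, and every class needs at least two samples, which continuous float targets usually do not provide.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 5 discrete labels, 20 samples each.
X = np.arange(200).reshape(100, 2)
y = np.repeat([0, 1, 2, 3, 4], 20)

# stratify=y keeps the class proportions the same in both splits.
# It requires shuffling, so the original row order is not preserved.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Labels that occur in the test split but not in the training split.
unseen = set(y_test.tolist()) - set(y_train.tolist())
```

With a stratified split, unseen is empty here because each label is represented in both parts. For time-ordered data like the question's, shuffling may not be acceptable, so this is an illustration of the parameter and not a drop-in fix.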
