KNN classifier in Python

Posted on 2025-01-12 09:48:55

I am using scikit-learn to help with a crime prediction problem, and I am having trouble running knn.predict over my entire DataFrame in one batch.

How can I apply knn.predict() to both feature columns of my DataFrame at once and store the output in another DataFrame?

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

knn_df = pd.read_csv("/Users/helenapunset/Desktop/knn_dataframe.csv")

# x is the set of features
x = knn_df[['latitude', 'longitude']]

# y is the target variable
y = knn_df['Class']

# split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)

# fit the classifier on the training data
knn.fit(x_train, y_train)

# accuracy on the test set was approximately 69%
print(knn.score(x_test, y_test))

# this single point is predicted to be a safe zone
crime_prediction = knn.predict([[25.787882, -80.358427]])
print(crime_prediction)

In the last line of the code I was able to pass in the two features I am using, latitude and longitude, from my DataFrame knn_df. But that is only a single point. I have been searching the documentation for a way to run this knn prediction over the entire DataFrame and cannot find one. Is it possible to use a for loop for this?

南风几经秋 2025-01-19 09:48:56

Let the new set to be predicted be 'knn_df_predict'. Assuming it has the same column names, try the following lines of code:

x_new = knn_df_predict[['latitude', 'longitude']]  # select the feature columns
crime_prediction = knn.predict(x_new)  # predict for every row of the new set
knn_df_predict['prediction'] = crime_prediction  # add the predictions to the DataFrame
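
To make the point about the for loop concrete, here is a minimal self-contained sketch of the same idea. It does not use the original CSV: `train_df` is filled with made-up coordinates, and the "safe"/"unsafe" labelling rule is purely hypothetical so that the example has a target column. It also writes the predictions into a separate `result_df` (a name chosen here for illustration), since the question asks for the output in another DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for knn_dataframe.csv (made-up values, not real crime data).
rng = np.random.default_rng(0)
train_df = pd.DataFrame({
    "latitude": rng.uniform(25.70, 25.90, size=200),
    "longitude": rng.uniform(-80.40, -80.20, size=200),
})
# Hypothetical labelling rule, only so the example has a 'Class' column.
train_df["Class"] = np.where(train_df["latitude"] > 25.80, "safe", "unsafe")

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(train_df[["latitude", "longitude"]], train_df["Class"])

# New points to classify, using the same column names as the training features.
knn_df_predict = pd.DataFrame({
    "latitude": [25.787882, 25.85, 25.72],
    "longitude": [-80.358427, -80.30, -80.25],
})

# A single call predicts every row; no for loop is needed.
predictions = knn.predict(knn_df_predict[["latitude", "longitude"]])

# Store the output in a separate DataFrame; assign() returns a copy
# and leaves knn_df_predict itself unchanged.
result_df = knn_df_predict.assign(prediction=predictions)
print(result_df)
```

predict() accepts any 2D array-like with one row per sample, so passing the two-column DataFrame is equivalent to looping over the rows and predicting each one, just with a single call.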