Confused about why my KNN code throws a ValueError

Posted on 2025-02-03 16:12:37

I am using sklearn to fit a KNN regressor:

#importing libraries and data
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor as KNR
theta = pd.read_csv("train.csv")#pandas dataframe
#getting data wanted from theta and putting it in a new dataframe
a = theta.get("YearBuilt")
b = theta.get("YrSold")
A = a.to_frame()
B = b.to_frame()
glasses = [A,B]
x = pd.concat(glasses)
#getting target data
y = theta.get("SalePrice")
#using KNN
horses = KNR(n_neighbors = 3)
horses.fit(x,y)

I get this error message:

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Could someone please explain this? My target values are in the hundreds of thousands and my input values are in the thousands, and there are no blanks in the data.

Comments (1)

猫烠⑼条掵仅有一顆心 2025-02-10 16:12:37

Before answering the question, let me refactor the code. You are using a dataframe, so you can index a single field or multiple fields directly, without the extra steps you've used:

#importing libraries and data
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor as KNR

theta = pd.read_csv("train.csv") # pandas dataframe
#getting data wanted from theta and putting it in a new dataframe
x = theta[["YearBuilt", "YrSold"]] # index multiple fields
#getting target data
y = theta["SalePrice"] # index single field
#using KNN
horses = KNR(n_neighbors = 3)
horses.fit(x,y) # fit KNN
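
The refactor above also changes what x contains, which may be worth seeing on a small example. The sketch below uses a tiny made-up dataframe in place of train.csv (the values are hypothetical) and compares pd.concat with its default axis=0, as in the question, against axis=1. With the default, the two single-column frames are stacked vertically, so every row is missing one of the two columns and the row count no longer matches y:

```python
import pandas as pd

# Hypothetical stand-in for train.csv with the three columns used in the question.
theta = pd.DataFrame({
    "YearBuilt": [2003, 1976, 2001, 1915],
    "YrSold": [2008, 2007, 2008, 2006],
    "SalePrice": [208500, 181500, 223500, 140000],
})

A = theta["YearBuilt"].to_frame()
B = theta["YrSold"].to_frame()

# Default axis=0 stacks the frames vertically: 8 rows, and each row is
# missing the column that came from the other frame.
stacked = pd.concat([A, B])
print(stacked.shape)                # (8, 2)
print(stacked.isna().sum().sum())   # 8

# axis=1 places the frames side by side, matching theta[["YearBuilt", "YrSold"]].
side_by_side = pd.concat([A, B], axis=1)
print(side_by_side.shape)               # (4, 2)
print(side_by_side.isna().sum().sum())  # 0
```

So a CSV with no blanks can still produce NaN values in x if the columns are combined along the wrong axis.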

Regarding your error, it indicates that the input passed to fit contains NaN, infinite, or very large values. You can make sure these do not occur by replacing the infinite values with NaN and then dropping the rows that contain NaN:

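import numpy as np  # needed for np.inf and np.nan
theta = theta.replace([np.inf, -np.inf], np.nan)  # treat infinities as missing

theta.dropna(inplace=True)  # drops rows with a missing value in any column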
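
Before dropping rows, it can help to check where the problem values actually are. The following is a minimal sketch on made-up data (the values and the cols list are assumptions for illustration, not taken from train.csv). It counts NaN and infinite values per column, then drops rows only when one of the columns used by the model is missing:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing and one infinite value.
theta = pd.DataFrame({
    "YearBuilt": [2003.0, np.nan, 2001.0, 1915.0],
    "YrSold": [2008.0, 2007.0, np.inf, 2006.0],
    "SalePrice": [208500.0, 181500.0, 223500.0, 140000.0],
})
cols = ["YearBuilt", "YrSold", "SalePrice"]

# Count problem values per column before cleaning.
nan_counts = theta[cols].isna().sum()
inf_counts = np.isinf(theta[cols]).sum()
print(nan_counts.to_dict())
print(inf_counts.to_dict())

# Treat infinities as missing, then drop rows only where the model's columns are missing.
clean = theta.replace([np.inf, -np.inf], np.nan).dropna(subset=cols)
x = clean[["YearBuilt", "YrSold"]]
y = clean["SalePrice"]
print(len(x), len(y))
```

Passing subset to dropna is a design choice here: calling dropna on the whole dataframe removes a row if any column has a gap, which may discard far more rows than needed when the CSV has other, unrelated columns with missing values.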