X has 1 features, but LinearRegression is expecting 5 features as input

Posted on 2025-02-07 18:32:08

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model

dados = pd.read_csv("dados.csv", thousands=',', sep = ";", header = 0, encoding='latin-1')

dados.drop('pais', axis = 1, inplace=True)

df = dados.to_numpy()
g = [df[:,1]]
h = [df[:,0]]

#plt.scatter(x,y, color = 'blue')
plt.scatter(g,h, color = 'blue')

model=sklearn.linear_model.LinearRegression()
model.fit(g,h)

G_new=[[22500]]
print(model.predict(G_new))

X has 1 features, but LinearRegression is expecting 5 features as input.

How to solve this?

Comments (1)

千纸鹤带着心事 2025-02-14 18:32:08

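LinearRegression does not inherently expect 5 features. It is fine with 1 feature or 100,000 features, but X does need to be a 2D array of shape (n_samples, n_features). Wrapping a 1D column in a list, as in [df[:,1]], hands sklearn a single row whose length is the number of rows in your data, so the model is fit on one sample with that many features (apparently 5 here) and then rejects the one-feature input [[22500]] at predict time.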

Here's how I would define X and y (which you call g and h):

X = dados.iloc[:, 1].values.reshape(-1, 1)
y = dados.iloc[:, 0].values

The reshape method transforms the 1D array into a 2D array (a 'column vector' if you like); if you were selecting more than 1 column you would not need this reshaping.
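To make the shapes concrete, here is a small sketch. The five-element array is a made-up stand-in for a single column of the data (the values are invented, not taken from dados.csv); it compares wrapping the column in a list with reshaping it.

```python
import numpy as np

# Hypothetical stand-in for one column of the data; values are invented.
col = np.array([10000.0, 15000.0, 20000.0, 25000.0, 30000.0])

wrapped = np.asarray([col])    # list wrapping: one row holding all five values
reshaped = col.reshape(-1, 1)  # one row per sample, a single feature column

print(col.shape)       # (5,)
print(wrapped.shape)   # (1, 5)
print(reshaped.shape)  # (5, 1)
```

The (1, 5) shape reads as one sample with five features, while (5, 1) reads as five samples with one feature each.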

I cast them to NumPy arrays with .values because I prefer NumPy for slinging sklearn data around. Pandas is great for data wrangling, but once I make X and y for the ML task, I move to NumPy. Personal preference.

By the way, people use uppercase X to indicate that it should be a matrix, i.e. 2D. It's a mathematical convention.
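Putting the pieces together, a minimal end-to-end sketch might look like the following. The DataFrame is a synthetic stand-in for dados after the 'pais' column has been dropped; the column names "target" and "feature" and all values are assumptions made up for illustration, since the real contents of dados.csv are not shown.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for dados after dropping 'pais'. Column names and
# values are invented; column 0 is the target, column 1 the feature.
dados = pd.DataFrame({
    "target": [4.0, 5.0, 6.0, 7.0, 8.0],
    "feature": [10000.0, 15000.0, 20000.0, 25000.0, 30000.0],
})

X = dados.iloc[:, 1].values.reshape(-1, 1)  # shape (5, 1): 5 samples, 1 feature
y = dados.iloc[:, 0].values                 # shape (5,)

model = LinearRegression()
model.fit(X, y)

X_new = [[22500]]  # one sample with one feature, matching the fitted shape
prediction = model.predict(X_new)
print(model.n_features_in_)
print(prediction)
```

Because the model is now fit on a single feature, the one-feature input passed to predict matches what it expects.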
