X has 1 features, but LinearRegression is expecting 5 features as input
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model
dados = pd.read_csv("dados.csv", thousands=',', sep = ";", header = 0, encoding='latin-1')
dados.drop('pais', axis = 1, inplace=True)
df = dados.to_numpy()
g = [df[:,1]]
h = [df[:,0]]
#plt.scatter(x,y, color = 'blue')
plt.scatter(g,h, color = 'blue')
model=sklearn.linear_model.LinearRegression()
model.fit(g,h)
G_new=[[22500]]
print(model.predict(G_new))
X has 1 features, but LinearRegression is expecting 5 features as input.
How to solve this?
X does not expect 5 features. It's fine with 1 feature or 100,000 features, but it does need to be a 2D array. You are passing a 1D array (well, a Pandas Series, but it amounts to the same thing).

Here's how I would define X and y (which you call g and h):

The reshape method transforms the 1D array into a 2D array (a 'column vector' if you like); if you were selecting more than 1 column you would not need this reshaping.

I cast them to NumPy arrays with .values because I prefer NumPy for slinging sklearn data around. Pandas is great for data wrangling, but once I make X and y for the ML task, I move to NumPy. Personal preference.

By the way, people use uppercase X to indicate that it should be a matrix, i.e. 2D. It's a mathematical convention.
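The definition of X and y that the answer refers to is not reproduced in this copy. A minimal sketch of what the description implies (take one column, cast it with .values, reshape it into a column vector) might look like the following. The DataFrame contents and the column names x_col and y_col are hypothetical stand-ins for dados.csv, which is not available here. The sketch also prints the shape produced by the question's g = [df[:,1]], which would explain the "5 features" in the error if the file has five rows: wrapping a column in a list yields a single sample with one feature per row.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for dados.csv after dropping 'pais': five rows,
# target in column 0 and the single feature in column 1, as in the question.
dados = pd.DataFrame({
    "y_col": [4.9, 5.8, 6.5, 7.3, 7.2],
    "x_col": [9054.0, 12239.0, 27195.0, 37675.0, 50961.0],
})

# What the question does: wrapping a column in a list gives ONE sample
# with as many features as there are rows.
g = np.asarray([dados.to_numpy()[:, 1]])
print(g.shape)  # (1, 5)

# Reshaped as the answer describes: one row per sample, one column per feature.
X = dados["x_col"].values.reshape(-1, 1)
y = dados["y_col"].values
print(X.shape, y.shape)  # (5, 1) (5,)

model = LinearRegression()
model.fit(X, y)

# One new sample with one feature, matching the layout of X.
X_new = [[22500]]
prediction = model.predict(X_new)
print(prediction)
```

With X shaped (n_samples, 1), the model is fitted on a single feature, so a 2D input such as [[22500]] has the layout predict expects.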