'train' and 'class' have different lengths in R
I'm doing some test work and I'm running into the following error:
Error in knn(train = Data_train, test = Data_test, cl = Data_train_label, :
'train' and 'class' have different lengths
Reproducible code:
# Load data
library(data.table)
Data = fread('https://raw.githubusercontent.com/justmarkham/DAT5/master/data/auto_mpg.txt', stringsAsFactors = FALSE)
Data
str(Data)
# k-NN
Data$cylinders = factor(Data$cylinders, levels = c('3', '4', '5', '6', '8'), label = c('3_cylinders', '4_cylinders', '5_cylinders',
'6_cylinders', '8_cylinders'))
round(prop.table(table(Data$cylinders))*100, digits = 1)
Data = Data[, c('cylinders', 'mpg', 'horsepower', 'displacement', 'weight', 'acceleration', 'model_year', 'origin', 'car_name')]
head(Data)
summary(Data[, c('mpg', 'horsepower')])
# Min-max scaling to [0, 1]; the parentheses matter for operator precedence
normalize = function(x) {return ((x - min(x)) / (max(x) - min(x)))}
Data_n = as.data.frame(lapply(Data[, 2:8], normalize))
smp_size = floor(0.8 * nrow(Data))
set.seed(123)
train_ind = sample(seq_len(nrow(Data)), size = smp_size)
Data_train = Data_n[train_ind, ]
Data_test = Data_n[-train_ind, ]
dim(Data_train)
dim(Data_test)
Data_train_label = Data[train_ind, 1]
Data_test_label = Data[-train_ind, 1]
length(Data_train_label)
length(Data_test_label)
#install.packages("class")
library('class')
Data_test_pred = knn(train = Data_train, test = Data_test, cl = Data_train_label, k=19)
#install.packages("gmodels")
library(gmodels)
CrossTable(x = Data_test_label$cylinders, y= Data_test_pred, prop.chisq=FALSE)
The problem shows up when I check the lengths. I get the following:
> length(Data_train_label)
[1] 1
> length(Data_test_label)
[1] 1
but I expected:
> length(Data_train_label)
[1] 313
> length(Data_test_label)
[1] 79
This is rather strange. I looked at other questions on this topic but didn't find anything that could help me. Maybe Data_train_label needs to be converted into a vector?
Comments (2)
The cl argument for your classes expects a vector rather than a data frame. The following code should work:
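A minimal sketch of that idea, not the answerer's original code. It assumes the built-in iris data as a stand-in for the question's table, and the names DT, X_train, X_test, label_table, train_label and test_label are illustrative only. It shows that indexing a data.table by column number keeps a one-column table (so length() is 1), while extracting the column itself gives a vector that knn accepts:

```r
library(data.table)
library(class)

# Toy stand-in for the question's data: the built-in iris set as a data.table
DT <- as.data.table(iris)

set.seed(123)
train_ind <- sample(seq_len(nrow(DT)), size = floor(0.8 * nrow(DT)))

# Predictors: the four numeric columns, as a plain data frame
X <- as.data.frame(DT[, 1:4])
X_train <- X[train_ind, ]
X_test  <- X[-train_ind, ]

# Column-number indexing on a data.table keeps a one-column table,
# so length() counts columns, not rows
label_table <- DT[train_ind, 5]
length(label_table)

# Extracting the column itself gives a factor vector with one entry per row
train_label <- DT[[5]][train_ind]    # same as DT$Species[train_ind]
test_label  <- DT[[5]][-train_ind]
length(train_label)

# With a vector for cl, the length check inside knn passes
pred <- knn(train = X_train, test = X_test, cl = train_label, k = 5)
table(actual = test_label, predicted = pred)
```

Applied to the question's code, this would presumably mean Data_train_label = Data$cylinders[train_ind] and Data_test_label = Data$cylinders[-train_ind], with CrossTable then taking x = Data_test_label directly instead of Data_test_label$cylinders.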
It works for me.