caret rfe() error: "there should be the same number of samples in x and y"
I am having difficulties solving the error "there should be the same number of samples in x and y". I notice that others have posted on this site regarding this error, but their solutions have not worked for me. I am attaching an abbreviated version of my dataset here.
x_train is here:
x_train <- structure(list(laterality = c("Left", "Right", "Right", "Right",
"Left", "Left", "Left", "Left", "Left", "Right"), age = c(66L,
56L, 69L, 49L, 60L, 70L, 58L, 53L, 59L, 64L), insurance = c("MEDICARE",
"UNITED", "MEDICARE", "UNITED", "COMMERCIAL", "MEDICARE", "AETNA",
"AETNA", "OXFORD", "MEDICARE_MANAGED"), employment = c("Retired",
"FullTime", "Retired", "FullTime", "Disabled", "SelfEmployed",
"Retired", "FullTime", "FullTime", "Disabled"), sex = c("Female",
"Male", "Female", "Female", "Female", "Female", "Male", "Male",
"Female", "Male"), race = c("WhiteorCaucasian", "WhiteorCaucasian",
"WhiteorCaucasian", "WhiteorCaucasian", "WhiteorCaucasian", "WhiteorCaucasian",
"Other", "BlackorAfricanAmerican", "WhiteorCaucasian", "WhiteorCaucasian"
), ethnicity = c("NotHispanicorLatino", "NotHispanicorLatino",
"NotHispanicorLatino", "NotHispanicorLatino", "NotHispanicorLatino",
"NotHispanicorLatino", "NotHispanicorLatino", "NotHispanicorLatino",
"NotHispanicorLatino", "NotHispanicorLatino"), bmi = c(22.3,
33, 34.3, 36, 30, 20, 29.5, 33.4, 26.5, 34.2), PreferredLanguage = c("English",
"English", "English", "English", "English", "English", "English",
"English", "English", "English"), married = c("Married", "Married",
"Married", "Married", "Married", "Married", "Divorced", "Single",
"Married", "Married"), RadiographSevere = c("No", "No", "No",
"No", "No", "No", "No", "No", "No", "No"), HxAnxietyDepression = c("No",
"No", "No", "Yes", "Yes", "Yes", "No", "No", "No", "No"), SurgeryYear = c(2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L
), operativetime = c(82L, 79L, 85L, 76L, 84L, 86L, 67L, 75L,
72L, 100L), HipApproach = c("Anterior", "Posterior", "Posterior",
"Posterior", "Posterior", "Anterior", "Posterior", "Posterior",
"Posterior", "Posterior")), row.names = c(NA, -10L), class = c("data.table",
"data.frame"))
y_train is here:
y_train <- structure(list(POD1AverageNrsScoreCut = c("[0,5)", "[0,5)", "[0,5)",
"[0,5)", "[5,10)", "[0,5)", "[0,5)", "[5,10)", "[0,5)", "[0,5)"
)), row.names = c(NA, -10L), class = c("data.table", "data.frame"
))
The code I am using for rfe is here:
library(caret)
control <- rfeControl(functions = rfFuncs,   # random forest
                      method = "repeatedcv", # repeated cv
                      repeats = 3,           # number of repeats
                      number = 10)           # number of folds
result_rfe <- rfe(x = x_train, y = y_train, sizes = c(1:30), rfeControl = control)
I see that your outcome consists of two classes of interval limits. Maybe try passing them as a factor:
y = as.factor(unlist(y_train))
It worked for me.
Note: I don't know if this is what you expected, since I don't know the data context or your approach.
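A small base-R sketch of why the conversion may help. It assumes (I have not re-checked this against the current caret source) that rfe() compares nrow(x) with length(y). For a one-column data frame, length() counts columns rather than rows, so the two numbers would disagree even when both objects have the same number of rows. The y_df below is a shortened, hypothetical stand-in for y_train.

```r
# Shortened, hypothetical stand-in for y_train: a one-column data frame
y_df <- data.frame(
  POD1AverageNrsScoreCut = c("[0,5)", "[0,5)", "[5,10)", "[0,5)")
)

# For a data frame, length() is the number of columns, not rows
n_cols <- length(y_df)
n_rows <- nrow(y_df)

# Flatten the single column into a plain factor vector
y_vec <- as.factor(unlist(y_df, use.names = FALSE))
n_vec <- length(y_vec)

cat("length(y_df):", n_cols, "\n")
cat("nrow(y_df):  ", n_rows, "\n")
cat("length(y_vec):", n_vec, "\n")
```

With the real data, the same idea would be to pass y = as.factor(unlist(y_train)) (or the column itself, y_train$POD1AverageNrsScoreCut, converted to a factor) to rfe() instead of the data frame.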
Original answer:
Subscript out of bounds error in caret's rfe function