Error in match.arg(what): 'arg' should be one of “median”, “mean”
I'm trying to impute missing values in the Year_of_Release variable using the with() and impute() functions in R, but I get this error: Error in match.arg(what) : 'arg' should be one of “median”, “mean”.
Below are my libraries and the code:
#Libraries used:
library(tidyverse)
library(dplyr)
library(tidyr)
library(stringr)
library(reshape2)
library(Hmisc)
library(mctest)
library(rpart)
library(e1071)
library(caTools)
library(rpart.plot)
library(neuralnet)
library(RColorBrewer)
library(rattle)
library(graphics)
library(missForest)
library(VIM)
library(caret)
library(ggplot2)
library(fmsb)
library(hrbrthemes)
library(babynames)
library(kernlab)
library(scales)
#Phase One: Data Preprocessing:
#Loading in the "vgsales.csv" data:
game_sales <- read.csv("vgsales.csv", header = T, stringsAsFactors = F)
#turning the structure of the data to tibble for ease of use:
game_sales <- as_tibble(game_sales)
#Replacing the "N/A" character values in Year_of_Release with real NA values:
game_sales %>% filter(game_sales$Year_of_Release == "N/A")
game_sales <- game_sales %>% mutate( Year_of_Release = gsub("N/A","", Year_of_Release))
#Changing the data type of column Year_of_release from "chr" to "int":
game_sales$Year_of_Release <- as.integer(game_sales$Year_of_Release)
#Imputing Year_of_Release variable and inserting the imputed values:
imputeyear <- with(game_sales,impute(game_sales$Year_of_Release, 'random'))
game_sales <- game_sales %>% mutate (Year_of_Release = imputeyear)
You've masked Hmisc::impute with e1071::impute, which has no "random" option. Because e1071 is loaded after Hmisc, a bare call to impute() resolves to the e1071 version, which only accepts "median" or "mean". You can fix this by calling Hmisc::impute(...) explicitly in your code.

You can make problems like this rarer by keeping your scripts smaller (e.g., separate scripts for loading, cleaning, and modeling data) and by not loading unnecessary packages. If you only need one or two functions one or two times, use :: instead of loading the whole package. And be aware of what comes from where: tidyverse already loads dplyr, tidyr, stringr, and ggplot2. Having tidyr should mean you don't need reshape2. graphics is part of base R and is loaded by default.
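As a rough illustration of the namespace-qualified call suggested above, here is a minimal sketch. It assumes the Hmisc package is installed, and the years vector is a made-up stand-in for game_sales$Year_of_Release, not the real data.

```r
# Minimal sketch: call Hmisc's impute() by its full name so that masking
# by e1071 (or any other package) cannot change which function runs.
# Assumption: Hmisc is installed. It is NOT attached with library() here.

set.seed(42)  # "random" imputation samples from observed values

# Hypothetical stand-in for game_sales$Year_of_Release
years <- c(2006L, NA, 2008L, 2009L, NA, 1996L)

# Namespace-qualified call; "random" fills each NA with a randomly
# drawn non-missing value from the same vector
imputed <- Hmisc::impute(years, "random")

# The result carries an "impute" class; drop it to get a plain integer vector
years_filled <- as.integer(unclass(imputed))

# Confirm that no missing values remain
anyNA(years_filled)
```

In a session where both packages are attached, find("impute") lists every search-path entry that defines the name, and the first entry is the one a bare impute() call would use.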