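Replace NA with the nearest neighbor according to another variable, while keeping NA for observations that have no non-missing neighbor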
Here I have data that looks like this:
year <- c(2000,2001,2002,2003,2005,2006,2007,2008,2009,2010)
x <- c(1,2,3,NA,5,NA,NA,NA,9,10)
dat <- data.frame(year, x)
- I want to replace NA with the nearest neighbor according to the year variable. For example, the fourth element of the data (the first NA) takes its value from its left neighbor rather than its right neighbor, because its year, 2003, is closer to 2002 than to 2005.
- I want to leave the NA in place when it has no non-NA neighbor. For example, the seventh element of the data (the third NA) will still be NA, because neither of its neighbors has a non-NA value.
After imputing, the resulting x should be 1, 2, 3, 3, 5, 5, NA, 9, 9, 10
One option would be to make use of case_when from tidyverse. Essentially, if the previous row has a closer year and is not NA, then return x from that row. If not, then choose the row below. Or, if the year is closer above but there is an NA, then return the row below. Likewise, if the row below has a closer year but has an NA, then return the row above. If a row does not have an NA, then just return x.
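The rule described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the answer's exact code: it assumes "neighbor" means the immediately adjacent row, that ties in year distance go to the previous row, and that year itself has no missing values. The helper columns x_prev, x_next, d_prev, d_next and the result column x_filled are names chosen here for illustration.

```r
library(dplyr)

year <- c(2000, 2001, 2002, 2003, 2005, 2006, 2007, 2008, 2009, 2010)
x <- c(1, 2, 3, NA, 5, NA, NA, NA, 9, 10)
dat <- data.frame(year, x)

res <- dat %>%
  mutate(
    # values and year distances of the adjacent rows (based on the original x)
    x_prev = lag(x),
    x_next = lead(x),
    d_prev = year - lag(year),
    d_next = lead(year) - year,
    x_filled = case_when(
      # keep observed values
      !is.na(x) ~ x,
      # previous row is usable and is either the only option or at least as close
      !is.na(x_prev) & (is.na(x_next) | d_prev <= d_next) ~ x_prev,
      # otherwise fall back to the next row if it is usable
      !is.na(x_next) ~ x_next,
      # no usable neighbor: stay NA
      TRUE ~ NA_real_
    )
  )

print(res$x_filled)
# for the sample data: 1 2 3 3 5 5 NA 9 9 10
```

Because x_prev and x_next are taken from the original x rather than from the filled column, an NA surrounded by NAs is never filled from an already-imputed value.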
A method using imap():
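A minimal sketch of what an imap()-based version could look like, under the same assumptions as the question's example: only the adjacent rows count as neighbors, and a tie in year distance goes to the earlier row. The helper fill_one and the result x_filled are hypothetical names introduced for this illustration. It uses purrr's imap_dbl(), which passes each element together with its position when the vector is unnamed.

```r
library(purrr)

year <- c(2000, 2001, 2002, 2003, 2005, 2006, 2007, 2008, 2009, 2010)
x <- c(1, 2, 3, NA, 5, NA, NA, NA, 9, 10)

# hypothetical helper: value is x[i], i is its position
fill_one <- function(value, i) {
  if (!is.na(value)) return(value)
  # adjacent positions that exist and hold a non-NA value in the original x
  nb <- c(i - 1, i + 1)
  nb <- nb[nb >= 1 & nb <= length(x)]
  nb <- nb[!is.na(x[nb])]
  if (length(nb) == 0) return(NA_real_)
  # pick the neighbor whose year is closest (which.min keeps the first on ties)
  x[nb[which.min(abs(year[nb] - year[i]))]]
}

x_filled <- imap_dbl(x, fill_one)
print(x_filled)
# for the sample data: 1 2 3 3 5 5 NA 9 9 10
```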
A data.table approach:
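One way such a data.table version might look, as a sketch only: it uses shift() to get the adjacent rows and then fills by reference with `:=`. The column names x_prev, x_next, d_prev, d_next and x_filled are illustrative, and the tie-breaking toward the previous row is an assumption, since the question does not say what should happen when both neighbors are equally close.

```r
library(data.table)

dt <- data.table(
  year = c(2000, 2001, 2002, 2003, 2005, 2006, 2007, 2008, 2009, 2010),
  x    = c(1, 2, 3, NA, 5, NA, NA, NA, 9, 10)
)

# adjacent values and year distances, taken from the original x
dt[, `:=`(
  x_prev = shift(x, type = "lag"),
  x_next = shift(x, type = "lead"),
  d_prev = year - shift(year, type = "lag"),
  d_next = shift(year, type = "lead") - year
)]

# start from the observed values
dt[, x_filled := x]
# fill from the next row where it is usable
dt[is.na(x) & !is.na(x_next), x_filled := x_next]
# override with the previous row where it is usable and at least as close
dt[is.na(x) & !is.na(x_prev) & (is.na(x_next) | d_prev <= d_next),
   x_filled := x_prev]

print(dt$x_filled)
# for the sample data: 1 2 3 3 5 5 NA 9 9 10
```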