Plotting missing values of a huge data frame in R

Posted on 2025-02-12 16:51:20

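I have a huge data frame (`Fertig`) with 815 variables and about 5000 observations. One of its columns, `date`, contains years as values. I would like to visualize the missing values of the different variables per year. The command `naniar::gg_miss_fct(Fertig, date)` works, but the single plot is too crowded to wade through.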

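So how can I visualize the first 20 variables, then the next 20 variables, and so on? Even better would be to split them by the first 5 letters of the variable name, since those letters group the variables. Thanks.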
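
One way the chunking described above could be sketched: split the column names into consecutive blocks of 20 and call `naniar::gg_miss_fct()` once per block, always keeping the `date` column. The data frame `toy`, its `varNN` columns, and the block size are illustrative stand-ins, not the real `Fertig` data.

```r
# Toy stand-in for Fertig (assumption): a `date` column plus 45 numeric
# columns, each with some values set to NA.
set.seed(1)
toy <- data.frame(date = rep(1830:1834, each = 4))
for (i in 1:45) {
  x <- rnorm(nrow(toy))
  x[sample(nrow(toy), 8)] <- NA
  toy[[sprintf("var%02d", i)]] <- x
}

# All columns except the grouping column, split into consecutive blocks of 20.
vars   <- setdiff(names(toy), "date")
chunks <- split(vars, ceiling(seq_along(vars) / 20))

# One plot per block; each subset keeps `date` so gg_miss_fct() can group by it.
# The plotting step is skipped if naniar is not installed.
if (requireNamespace("naniar", quietly = TRUE)) {
  plots <- lapply(chunks, function(cols) {
    naniar::gg_miss_fct(toy[, c(cols, "date")], fct = date)
  })
  # print(plots[[1]])  # shows the plot for the first 20 variables
}
```

With the real data, replacing `toy` by `Fertig` should be enough, provided the year column is really named `date`.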

Part of my data structure:

    head(structure(Fertig),10)
      1Berlin_Briefkurs Staatsschuldscheine 4%
    1                                       NA
      1Berlin_Geldkurs Staatsschuldscheine 4% 1Berlin_BK Staatsschuldscheine 3,5%
    1                                      NA                                  NA
      1Berlin_GK Staatsschuldscheine 3,5% 1Berlin_BK Pr.-Englische Obligation 1830
    1                                  NA                                       NA
      1Berlin_GK Pr.-Englische Obligation 1830
    1                                       NA
      1Berlin_BK Prämienscheine Seehandlung 1Berlin_GK Prämienscheine Seehandlung
    1                                    NA                                    NA
      1Berlin_BK Kurmärkische Obligation 1Berlin_GK Kurmärkische Obligation
    1                                 NA                                 NA
      1Berlin_BK Neumärkische Interimsscheine
    1                                      NA
      1Berlin_GK Neumärkische Interimsscheine
    1                                      NA
      1Berlin_BK Berliner Stadtobligationen 4%
    1                                       NA
      1Berlin_GK Berliner Stadtobligationen 4%
    1                                       NA
      1Berlin_BK Berliner Stadtobligationen 3,5%

    > dput(head(Fertig[, 1:5]))
    structure(list(`1Berlin_Briefkurs Staatsschuldscheine 4%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_Geldkurs Staatsschuldscheine 4%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_BK Staatsschuldscheine 3,5%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_GK Staatsschuldscheine 3,5%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_BK Pr.-Englische Obligation 1830` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA, 
    6L), class = "data.frame")
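
For the prefix-based split, a minimal sketch using the column names from the `dput()` output above: `substr()` takes the first five characters of each name and `split()` groups the names by that prefix. The sixth name (`2Hamburg_BK Beispiel`) is made up here purely to show a second group; whether five characters is the right cut for the real names is an assumption.

```r
# Column names copied from the dput() output, plus one hypothetical name
# (not in the original data) so that more than one group appears.
cols <- c(
  "1Berlin_Briefkurs Staatsschuldscheine 4%",
  "1Berlin_Geldkurs Staatsschuldscheine 4%",
  "1Berlin_BK Staatsschuldscheine 3,5%",
  "1Berlin_GK Staatsschuldscheine 3,5%",
  "1Berlin_BK Pr.-Englische Obligation 1830",
  "2Hamburg_BK Beispiel"  # hypothetical, for illustration only
)

# Group the names by their first five characters.
groups <- split(cols, substr(cols, 1, 5))
names(groups)

# With the real data (assuming Fertig has a `date` column), each group
# could then be plotted separately:
# plots <- lapply(groups, function(g) {
#   naniar::gg_miss_fct(Fertig[, c(g, "date")], fct = date)
# })
```

If a group still holds too many variables for one plot, it could be split further into blocks of 20 in the same way.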

