R: `mapped_discrete` objects can only be created from numeric vectors

Posted on 2025-01-30 15:09:42

I have the following data in R:

df <- structure(list(t0 = c(3.82, -4.88, NA, -3.83, -3.99, NA, NA, 
NA, 6.35, 2.47, 0.28, 0.3, NA, 8.31, NA, NA, NA, 2.76, NA, 1.38
), t1 = c(NA, NA, NA, NA, NA, NA, -1.23, 2.19, 4.13, 3.49, -0.42, 
NA, 3.78, 2.7, 1.17, NA, NA, NA, NA, NA), t2 = c(-1.85, NA, 1.46, 
0.17, NA, NA, -2.81, 1.75, NA, 2.32, -3.08, -1.39, NA, 7.53, 
1.77, NA, 0.1, NA, NA, -2.61), t3 = c(-2.05, 3.73, -2.04, -0.22, 
-4.29, NA, NA, -0.11, 0.43, NA, -0.78, 3.24, NA, NA, -1.13, 1.09, 
NA, NA, 2.7, NA), t4 = c(1.01, -2.77, NA, -3.05, -2.33, 3.78, 
NA, NA, NA, NA, -2.04, -4.01, -2.32, 4, -0.28, NA, NA, 9.04, 
NA, -4.12), t5 = c(1.56, NA, 4.89, NA, NA, NA, NA, NA, 0.88, 
3.15, NA, NA, 2.59, NA, 2.04, NA, NA, NA, -0.26, NA), t6 = c(0.34, 
-0.99, NA, 1.93, NA, NA, NA, NA, 0.35, NA, -6.46, NA, NA, NA, 
2.57, NA, NA, 4.89, NA, -5.63), t7 = c(0.52, NA, 0.5, 1.85, -6.23, 
NA, NA, 1.59, 7.82, 0.82, NA, NA, -1.77, NA, NA, NA, 2.01, NA, 
0.7, -1.55), t8 = c(NA, NA, 4.9, -3.93, -8.13, 3.14, 0.03, 1.67, 
3.55, NA, -1.55, 2.57, -0.87, NA, 0.71, -0.1, NA, NA, 2.04, NA
), t9 = c(-1.09, NA, -0.52, NA, NA, NA, NA, NA, NA, 2.05, -5.21, 
-0.89, -0.03, NA, 0.66, 3.72, -1.96, NA, NA, NA)), row.names = c(NA, 
20L), class = "data.frame")

Using the following tutorial (https://jenslaufer.com/data/analysis/visualize_missing_values_with_ggplot.html), I am trying to make a visualization that shows the percentage of missing data:

library(dplyr)
library(ggplot2)
library(tidyverse)

row.plot <- df %>%
  mutate(id = row_number()) %>%
  gather(-id, key = "key", value = "val") %>%
  mutate(isna = is.na(val)) %>%
  ggplot(aes(key, id, fill = isna)) +
    geom_raster(alpha=0.8) +
    scale_fill_manual(name = "",
        values = c('steelblue', 'tomato3'),
        labels = c("Present", "Missing")) +
    scale_x_discrete(limits = levels) +
    labs(x = "Variable",
           y = "Row Number", title = "Missing values in rows") +
    coord_flip()

When I try to see the results, this is the error that I get:

row.plot

Error in `new_mapped_discrete()`:
! `mapped_discrete` objects can only be created from numeric vectors
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: In structure(in_domain, pos = match(in_domain, breaks)) :
  Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
  Consider 'structure(list(), *)' instead.
2: In structure(in_domain, pos = match(in_domain, breaks)) :
  Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
  Consider 'structure(list(), *)' instead.
3: Removed 200 rows containing missing values (geom_raster). 

My question: Can someone please show me what I am doing wrong and how I can fix this error? In the end, I would like to get this kind of picture:

[image]

Comments (3)

心凉怎暖 2025-02-06 15:09:42

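The error is caused by scale_x_discrete(limits = levels).
You don't need it here: your code never defines levels, and neither key (a character column) nor id (numeric) has levels as a factor would, so the scale is left without usable limits. Dropping that line is enough: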

df  %>%
  mutate(id = row_number()) %>%
  gather(-id, key = "key", value = "val") %>%
  mutate(isna = is.na(val)) %>%
  ggplot(aes(key, id, fill = isna)) +
  geom_raster(alpha=0.8) +
  scale_fill_manual(name = "",
                    values = c('steelblue', 'tomato3'),
                    labels = c("Present", "Missing")) +
  #scale_x_discrete(limits = levels) 
  labs(x = "Variable",
       y = "Row Number", title = "Missing values in rows") +
  coord_flip()
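
One likely reading of the error, offered as an assumption rather than something the answer states: because the question never assigns a variable called levels, the name resolves to the base R function levels(), and ggplot2's discrete scales accept a function for limits and call it on the automatic values. For a character vector such as key, that function returns NULL, which would match the structure(NULL, *) warnings. A small base R sketch with a hypothetical key vector:

```r
# Hypothetical stand-in for the `key` column created by gather()
key <- c("t0", "t1", "t2")

# With no user-defined `levels`, the name refers to this base function
limits_arg <- base::levels
is_fun <- is.function(limits_arg)

# A character vector has no levels attribute, so the result is NULL
chr_levels <- limits_arg(key)

# A factor does carry levels, so the same call returns them
fct_levels <- limits_arg(factor(key))
```

If this reading is right, either removing the scale or passing an actual character vector of variable names as limits avoids the problem.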

羁〃客ぐ 2025-02-06 15:09:42

It looks like you wanted to produce this plot for the missing data in each row rather than for each variable (though I've provided both here). The main issue is that levels is never defined, so we can create it here and then pass it as a factor to scale_x_discrete.

library(tidyverse)

output <- df %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id, names_to = "key", values_to = "val") %>%
  select(-key) %>%
  group_by(id) %>%
  mutate(isna = is.na(val),
         total = n()) %>%
  group_by(id, total, isna) %>%
  summarise(num.isna = n()) %>%
  mutate(pct = num.isna / total * 100)

levels <- output %>% filter(isna == TRUE) %>% arrange(desc(pct)) %>% pull(id)

row.plot <- output %>% 
  ggplot() +
  geom_bar(aes(
    x = reorder(id, desc(pct)),
    y = pct,
    fill = isna
  ),
  stat = 'identity',
  alpha = 0.8) +
  scale_x_discrete(limits = factor(levels)) +
  scale_fill_manual(
    name = "",
    values = c('steelblue', 'tomato3'),
    labels = c("Present", "Missing")
  ) +
  coord_flip() +
  labs(title = "Percentage of missing values", x =
         'Row Number', y = "% of missing values")
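
As a quick cross-check of the per-row percentages computed by the pipeline above, the same numbers can be obtained in base R from the logical matrix returned by is.na(). This is only a sketch on a hypothetical toy data frame (toy is not the data from the question):

```r
# Hypothetical toy data: 3 rows, 3 variables
toy <- data.frame(t0 = c(1, NA, 3),
                  t1 = c(NA, NA, 2),
                  t2 = c(5, 6, NA))

# TRUE where a value is missing; the mean of a logical row is the share of TRUEs
na_mask <- is.na(toy)
pct_missing_by_row <- rowMeans(na_mask) * 100

# Row ids ordered from most to least missing, analogous to `levels` above
row_order <- order(pct_missing_by_row, decreasing = TRUE)
```

Here row 2 has two of three values missing, so it comes first in row_order.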

Or if you want to do it by variable, then:

output <- df %>%
  pivot_longer(everything(), names_to = "key", values_to = "val") %>%
  group_by(key) %>%
  mutate(isna = is.na(val),
         total = n()) %>%
  group_by(key, total, isna) %>%
  summarise(num.isna = n()) %>%
  mutate(pct = num.isna / total * 100)

levels <- output %>% filter(isna == TRUE) %>% arrange(desc(pct)) %>% pull(key)


row.plot <- output %>% 
  ggplot() +
  geom_bar(aes(
    x = reorder(key, desc(pct)),
    y = pct,
    fill = isna
  ),
  stat = 'identity',
  alpha = 0.8) +
  scale_x_discrete(limits = levels) +
  scale_fill_manual(
    name = "",
    values = c('steelblue', 'tomato3'),
    labels = c("Present", "Missing")
  ) +
  coord_flip() +
  labs(title = "Percentage of missing values", x =
         'Variable', y = "% of missing values")

柏林苍穹下 2025-02-06 15:09:42

When I run the code from your tutorial with your data, there is no error. Maybe you want something like this:

library(tidyverse)
missing.values <- df %>%
  gather(key = "key", value = "val") %>%
  mutate(isna = is.na(val)) %>%
  group_by(key) %>%
  mutate(total = n()) %>%
  group_by(key, total, isna) %>%
  summarise(num.isna = n()) %>%
  mutate(pct = num.isna / total * 100)

levels <- (missing.values %>% filter(isna == TRUE) %>% arrange(desc(pct)))$key

percentage.plot <- missing.values %>%
  ggplot() +
  geom_bar(aes(x = reorder(key, desc(pct)), y = pct, fill=isna), stat = 'identity', alpha=0.8, width = 1) +
  scale_x_discrete(limits = levels) +
  scale_fill_manual(name = "", values = c('goldenrod3', 'firebrick3'), labels = c("Present", "Missing")) +
  coord_flip() +
  labs(title = "Percentage of missing values", x = 'Variable', y = "% of missing values") + 
  theme_bw() +
  theme(panel.grid = element_blank(),
        panel.border = element_blank())
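
One thing to keep in mind with this approach: levels is built only from rows where isna is TRUE, so a variable with no missing values at all would not appear in it and would fall outside the axis limits. Every variable in the question's data has at least one NA, so it does not matter there. A base R sketch, on hypothetical toy data, that orders every variable including complete ones:

```r
# Hypothetical toy data: t2 has no missing values
toy <- data.frame(t0 = c(1, NA, 3),
                  t1 = c(NA, NA, 2),
                  t2 = c(5, 6, 7))

# Percentage of missing values per variable (0 for complete variables)
pct_missing <- colMeans(is.na(toy)) * 100

# All variable names, ordered from most to least missing
ordered_vars <- names(sort(pct_missing, decreasing = TRUE))
```

A vector like ordered_vars could then be passed as limits to scale_x_discrete in place of levels.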
