Histogram in R when using a binary value

Posted on 2024-09-26 14:42:17

I have data of students from several schools. I want to show a histogram of the percentage of all students that passed the test in each school, using R.
My data looks like this (id,school,passed/failed):

432342 school1 passed

454233 school2 failed

543245 school1 failed

etc.

(The point is that I am only interested in the percentage of students who passed; obviously, those who didn't pass have failed. I want one bar for each school, showing the percentage of students in that school who passed.)

Thanks

Comments (3)

梦回梦里 2024-10-03 14:42:17

There are many ways to do that.
One is:

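# Simulated example data: 100 students spread over 3 schools
df <- data.frame(ID = sample(100),
                 school = factor(sample(3, 100, TRUE), labels = c("School1", "School2", "School3")),
                 result = factor(sample(2, 100, TRUE), labels = c("passed", "failed")))

# The mean of a logical vector is the proportion of TRUE values,
# so this gives the share of students who passed in each school
p <- aggregate((result == "passed") ~ school, data = df, FUN = mean)
barplot(p[, 2] * 100, names.arg = p[, 1])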
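
As a rough sketch, the same aggregate() idea can be applied to the three sample rows from the question. The data frame name `students` and its column names are assumptions made for illustration; they are not given in the original post.

```r
# Hypothetical data frame built from the three sample rows in the question;
# the column names are assumptions, not part of the original post.
students <- data.frame(
  id     = c(432342, 454233, 543245),
  school = c("school1", "school2", "school1"),
  result = c("passed", "failed", "failed")
)

# TRUE where the student passed; the mean of a logical vector
# is the proportion of TRUE values.
students$passed <- students$result == "passed"

# One row per school with the share of passing students, scaled to percent
pct <- aggregate(passed ~ school, data = students, FUN = mean)
pct$passed <- pct$passed * 100

# One bar per school, with the y-axis fixed to the 0-100 range
barplot(pct$passed, names.arg = pct$school,
        ylim = c(0, 100), ylab = "Students passed (%)")
```

Note that what is being drawn is a bar chart of one percentage per school rather than a histogram in the strict sense, which is why barplot() is used and not hist().
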
燃情 2024-10-03 14:42:17

My previous answer didn't go all the way. Here's a redo. Example is the one from @eyjo's answer.

students <- 400
schools <- 5

df <- data.frame(
  id = 1:students,
  school = sample(paste("school", 1:schools, sep = ""), size = students, replace = TRUE),
  results = sample(c("passed", "failed"), size = students, replace = TRUE, prob = c(.8, .2)))

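r <- aggregate(results ~ school, FUN = table, data = df)
# "Flatten" the result: aggregate() returns the counts as a matrix column,
# so rebuild a plain data frame with columns school, failed and passed
r <- data.frame(school = r$school, r$results)
r$sum <- r$failed + r$passed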

r$perc.passed <- round(with(r, (passed/sum) * 100), 0)

library(ggplot2)

ggplot(r, aes(x = school, y = perc.passed)) +
  theme_bw() +
  geom_bar(stat = "identity")

风透绣罗衣 2024-10-03 14:42:17

Since you have individual records (id) and want to calculate a value per index (school), I would suggest tapply for this.

students <- 400
schools <- 5

df <- data.frame("id" = 1:students,
    "school" = sample(paste("school", 1:schools, sep = ""),
        size = students, replace = TRUE),
    "results" = sample(c("passed", "failed"),
        size = students, replace = TRUE, prob = c(.8, .2)))

p <- tapply(df$results == "passed", df$school, mean) * 100

barplot(p)
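
A small variation on the tapply approach, offered only as a sketch: fixing the random seed makes the simulated data reproducible, and fixing the y-axis to 0-100 keeps the bars comparable between runs. The seed value and the axis label are arbitrary choices and do not come from the original answer.

```r
set.seed(1)  # arbitrary seed, used only so the simulated data is reproducible

students <- 400
schools <- 5

# Simulated data: each student gets a random school and a result,
# with roughly 80% of students passing
df <- data.frame(
  id = 1:students,
  school = sample(paste0("school", 1:schools), size = students, replace = TRUE),
  results = sample(c("passed", "failed"), size = students, replace = TRUE,
                   prob = c(.8, .2))
)

# The mean of a logical vector is the fraction of TRUE values,
# so tapply() returns the pass rate for each school as a named vector
p <- tapply(df$results == "passed", df$school, mean) * 100

# barplot() takes the bar labels from the names of p
barplot(p, ylim = c(0, 100), ylab = "Students passed (%)")
```
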