Need help plotting the counts of two variables in a scatterplot and then fitting the line in R

Posted on 2025-01-11 06:56:16

I need help with all of these tasks, but specifically with plotting the scatterplot and fitting the linear regression model.

  • Filter out any zip code where the number of emergency visits was less than 20
  • Plot the count of influenza-like illness and/or pneumonia visits against the count of all emergency department visits
  • Plot the line of best fit (linear regression) and the R-squared
  • From the some.zips data set, aggregate the mean of ED visits by zip code

Here is my code, but it is not working. I keep getting "Warning in abline(m) : only using the first two of 135 regression coefficients". Can someone help? The code is below, and here is the dataset:

fromJSON("https://data.cityofnewyork.us/resource/2nwg-uqyg.json")

library(jsonlite)
library(tidyverse)
library(ALSM)
data(package="ALSM")

filtered_data = filter(er, emergency.visits > 20)

plot(ili_pne_visits~total_ed_visits,data=filtered_data,xlab="Total ER Visits",ylab="Influenza Visits")

m <-lm(ili_pne_visits~total_ed_visits,data=filtered_data)

abline(m)

Comments (1)

无畏 2025-01-18 06:56:16

Code-wise, this will do the job:

library(jsonlite)   ## for fromJSON()
library(tidyverse)  ## for the dplyr verbs and ggplot2

df <- fromJSON("https://data.cityofnewyork.us/resource/2nwg-uqyg.json")
    
df %>%
    ## convert variables from character to numeric where appropriate:
    mutate(across(mod_zcta:ili_pne_admissions, ~ as.integer(.x))) %>%
    filter(total_ed_visits > 20) %>%
    ## the question asks for ILI/pneumonia visits (not admissions) on the y axis
    ggplot(aes(x = total_ed_visits, y = ili_pne_visits)) +
    geom_point() +
    ## add regression line and confidence band
    geom_smooth(method = 'lm')
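
As for the warning in the question: it is consistent with the predictor still being stored as text, in which case lm() treats it as a factor and estimates one coefficient per distinct value instead of a single slope. The sketch below reproduces this on a small made-up data frame (the values are invented and only stand in for the downloaded data), then refits on numeric columns, draws the line with abline(), and reads off the R-squared with summary().

```r
# Toy data standing in for the JSON download, where columns arrive as text.
# The values are made up for illustration only.
toy <- data.frame(
  total_ed_visits = c("25", "40", "61", "83", "97", "120"),
  ili_pne_visits  = c("2", "3", "6", "7", "9", "11"),
  stringsAsFactors = FALSE
)

# Numeric response, but the predictor is left as character on purpose.
toy$y <- as.numeric(toy$ili_pne_visits)

# lm() turns the character predictor into a factor: an intercept plus one
# coefficient for each additional distinct value.
m_bad <- lm(y ~ total_ed_visits, data = toy)
length(coef(m_bad))   # 6 coefficients, so abline(m_bad) would warn

# Convert the predictor to numeric and refit: intercept and slope only.
toy$x <- as.numeric(toy$total_ed_visits)
m_ok <- lm(y ~ x, data = toy)
length(coef(m_ok))    # 2 coefficients

# Scatterplot, line of best fit, and R-squared shown in the title.
r2 <- summary(m_ok)$r.squared
plot(y ~ x, data = toy,
     xlab = "Total ER visits", ylab = "ILI/pneumonia visits",
     main = sprintf("R-squared = %.3f", r2))
abline(m_ok)
```

On the real data, the same conversion is what the mutate(across(...)) step above performs before plotting.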

However, pouring all the data indiscriminately into one scatterplot and linear model hides interesting patterns such as seasonality. Plot the share of ili_pne visits in total visits against time, and voilà:

library(lubridate) ## for easy date-time-manipulation

df %>%
    ## convert variables from character to numeric where appropriate:
    mutate(
        across(mod_zcta:ili_pne_admissions, ~ as.integer(.x)),
        date = lubridate::as_datetime(date),
        ili_pne_share = ili_pne_visits / total_ed_visits
        ) %>% 
    filter(total_ed_visits > 20) %>%
    arrange(date) %>%
    ggplot(aes(x = date, y = ili_pne_share)) + 
    geom_line() +
    geom_smooth(span = .1)
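
The last task in the question, aggregating the mean of ED visits by zip code, is not covered above. A minimal sketch with base R's aggregate() follows. The some.zips data set is not shown in the post, so the data frame here is a hypothetical stand-in with invented values, and the column names mod_zcta and total_ed_visits are assumed to match the NYC data.

```r
# Hypothetical stand-in for some.zips; the real data set is not shown in the post.
some.zips <- data.frame(
  mod_zcta        = c("10001", "10001", "10002", "10002", "10003"),
  total_ed_visits = c(30, 50, 20, 40, 70)
)

# Mean of ED visits per zip code: one row per distinct mod_zcta.
mean_by_zip <- aggregate(total_ed_visits ~ mod_zcta, data = some.zips, FUN = mean)
print(mean_by_zip)
```

With the tidyverse loaded, grouping with group_by(mod_zcta) and then calling summarise() with mean(total_ed_visits) expresses the same aggregation.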