How to find the non-NA values (sample size) of a large data frame?

Posted 2025-01-17 21:06:52 · 585 words · 4 views · 0 comments

I have a large data frame that contains lots of NAs. The rows are soil samples from different plots, and the columns are chemical variables. I wanted to create a column or data frame with the sample size of each variable to identify which variables may be undersampled.

When I searched online, the answers I found were either specific to correlation tests or focused on counting occurrences of a specific value, rather than simply counting the non-NA entries, so they did not help me.

I can brute-force the issue by counting NAs in each column and subtracting those from the number of samples, but I have 400 columns and don't know how to write a function to do that.

Sample ID   C:N   %Fe
Plot1       46    3
Plot2       NA    5

If this were the table, I'd want a column or data frame with "C:N sample size" = 1 and "%Fe sample size" = 2. The odd part is that there would be only one row for each column variable, so I guess I'd want the result as a new data frame or table.

If there are any links to good guides for making reprexes of data frames in R, I'd also appreciate that; this is my first question.

Thank you!

Comments (1)

生寂 2025-01-24 21:06:52

This will give you the number of NAs per column in your_dataframe:

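library(dplyr)  # provides the %>% pipe
library(purrr)  # provides map_df()

# For each column, count the NA entries; returns a one-row tibble
# with one column per variable
your_dataframe %>% 
  map_df(~sum(is.na(.)))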
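
Since the question asks for the non-NA count (the sample size) rather than the NA count, here is a minimal base R sketch of that variant. The data frame `soil`, the ID column name `sample_id`, and the values in it are made up for illustration, loosely following the two-plot example in the question; swap in your own data frame and ID column.

```r
# Toy data, loosely following the two-plot example in the question.
# The names `soil` and `sample_id` and the values are assumptions for illustration.
soil <- data.frame(
  sample_id = c("Plot1", "Plot2"),
  `C:N` = c(46, NA),
  `%Fe` = c(3, 5),
  check.names = FALSE  # keep the column names "C:N" and "%Fe" as written
)

# Keep only the measurement columns (drop the ID column).
vars <- soil[, setdiff(names(soil), "sample_id"), drop = FALSE]

# is.na() on a data frame returns a logical matrix, so colSums() of its
# negation counts the non-NA values in every column at once.
n_obs <- colSums(!is.na(vars))

# The "brute-force" idea from the question gives the same numbers:
# number of rows minus the number of NAs per column.
n_obs_alt <- nrow(vars) - colSums(is.na(vars))

# One row per variable, as a new data frame.
sample_sizes <- data.frame(variable = names(n_obs), n = unname(n_obs))
print(sample_sizes)
```

This works the same way for 400 columns as for two, with no need to write a loop or a custom function. If you prefer to stay with the purrr approach above, negating the test should do the same job: `your_dataframe %>% map_df(~sum(!is.na(.)))`.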