How do I find the number of non-NA values (sample size) in a large data frame?
I have a large data frame that contains lots of NAs. The rows are soil samples from different plots, and the columns are chemical variables. I wanted to create a column or data frame with the sample size of each variable to identify which variables may be undersampled.
When I tried looking online, the answers I found were either specific to correlation tests or focused on counting occurrences of specific values, not just the presence of non-NA values in a vector, so that did not help me.
I can brute-force the issue by counting the NAs in each column and subtracting those from the number of samples, but I have 400 columns and don't know how to write a function to do that.
Sample ID | C:N | %Fe |
---|---|---|
Plot1 | 46 | 3 |
Plot2 | NA | 5 |
If this were the table, I'd want a column or data frame of "C:N sample size" = 1, %Fe = 2. This is where it's odd, because there would only be 1 row for each column variable, so I guess I'd want to make it as a new data frame or table.
If there are any links to good guides for making reprexes for data frames in R, I'd also appreciate that. This is my first question.
Thank you!
Comments (1)
This will give you the `NA`s per column in `your_dataframe`.
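As a minimal sketch of one way to get these counts in base R (an assumption, not necessarily the exact code the answer used): `colSums(is.na(...))` counts the NAs in every column, and `colSums(!is.na(...))` gives the non-NA sample size the question asks for. The toy data frame below is made up to mirror the question's example table, and the names `na_counts`, `sample_sizes`, and `sample_size_df` are illustrative.

```r
# Toy data mirroring the question's example table (illustrative values only)
your_dataframe <- data.frame(
  SampleID = c("Plot1", "Plot2"),
  `C:N`    = c(46, NA),
  `%Fe`    = c(3, 5),
  check.names = FALSE  # keep "C:N" and "%Fe" as column names
)

# Number of NAs in each column
na_counts <- colSums(is.na(your_dataframe))

# Number of non-NA values in each column, i.e. the sample size per variable
sample_sizes <- colSums(!is.na(your_dataframe))

# Optional: reshape into a data frame with one row per variable
sample_size_df <- data.frame(
  variable = names(sample_sizes),
  n        = unname(sample_sizes)
)

print(sample_size_df)
```

Because `colSums()` works on all columns at once, the same two lines should apply unchanged to a data frame with 400 columns; no explicit loop or custom function is needed.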