dplyr relative frequencies within groups
(hopefully) simplified
I asked farmers of two farm types (organic and conventional) to report whether species (A, B) occur (0/1) on their land.
So, I have
df <- data.frame(id = 1:10,
                 farmtype = c(rep("org", 4), rep("conv", 6)),
                 spA = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
                 spB = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
)
And my question is pretty simple... In what percentage of organic or conventional farms do the species occur?
Desired result:
sp A occurs in 25% of org farms and 100% of conv farms
sp B occurs in 75% of org farms and 0% of conv farms
None of the solutions outlined below achieve that.
**Additional question**
All I want is a simple ggplot with the species on the x-axis and the percentage of detection on the y-axis (once for org and once for conv).
ggplot(df.melt) +
  geom_bar(aes(x = species, fill = farmtype))
### but of course this should reflect the species detections, not just the farm types
janitor's tabyl is your friend. What you are calculating are "row" percentages, but what you want are "col" percentages. You could also use your own approach: group by farmtype in the second group_by and remember to save the data frame. The result is easier to use with ggplot2 because it is already in long format.
Update: A variant of the second option was also suggested by Isaac Bravo.
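The code that accompanied this answer is not shown here. As a minimal sketch of the second idea (grouping by farmtype and keeping the result in long format), assuming the `df` from the question and the dplyr, tidyr and ggplot2 packages, something like the following could work. The names `df_long`, `pct`, `percent` and `p` are illustrative, not taken from the original answer.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data from the question
df <- data.frame(
  id = 1:10,
  farmtype = c(rep("org", 4), rep("conv", 6)),
  spA = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  spB = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
)

# Long format: one row per farm and species
df_long <- df %>%
  pivot_longer(c(spA, spB), names_to = "species", values_to = "occur")

# Percentage of farms within each farm type on which each species occurs;
# the mean of a 0/1 column is the share of ones
pct <- df_long %>%
  group_by(farmtype, species) %>%
  summarise(percent = 100 * mean(occur), .groups = "drop")

# Species on the x-axis, detection percentage on the y-axis,
# one bar per farm type
p <- ggplot(pct, aes(x = species, y = percent, fill = farmtype)) +
  geom_col(position = "dodge") +
  labs(y = "Farms with detection (%)")
```

For the example data, `pct` holds 25 and 75 for spA and spB on org farms, and 100 and 0 on conv farms. Printing `p` draws the chart; `facet_wrap(~ farmtype)` would be an alternative to dodged bars if separate panels are preferred.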
Here is another option using your approach:
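The code and output of this answer are not preserved in the thread as shown. Purely as an assumption about what a dplyr-based variant might look like, the sketch below summarises both species columns at once with `across()`, using the `df` from the question. The names `pct_wide` and `n_farms` are hypothetical.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  id = 1:10,
  farmtype = c(rep("org", 4), rep("conv", 6)),
  spA = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  spB = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
)

# One row per farm type: percentage of farms with each species,
# plus the number of farms the percentage is based on
pct_wide <- df %>%
  group_by(farmtype) %>%
  summarise(across(c(spA, spB), ~ 100 * mean(.x)), n_farms = n())
```

With the example data this gives spA = 100 and spB = 0 for the 6 conv farms, and spA = 25 and spB = 75 for the 4 org farms. The wide layout is convenient for reading off the numbers, while the long layout is usually the better input for ggplot2.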
If I understand the poster's first question correctly, the poster seeks the proportion of organic versus conventional farm types among farms that grew a given species. This can also be accomplished using the data.table package as follows.
First, the example data set is recreated by setting the seed.
Next, the "no" answers are filtered out because we are only interested in farms that reported growing the species in the "occur" column. We then count the occurrences of the species for each farm type. The column "N" gives the count.
The total occurrences of each species for either farm type are then counted. As a check for this result, each row for a given species should give the same species total.
Finally, the columns are combined to calculate the proportion of organic or conventional farms for each species that was reported. As a check against the result, the proportion of organic and the proportion of conventional for each species should sum to 1 because there are only two farm types.
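The seeded data set this answer refers to is not available here, so the steps above can only be sketched on the `df` from the current version of the question, where an `occur` value of 0 plays the role of a "no" answer. This is an assumption-laden illustration using data.table; the names `long`, `counts`, `species_total` and `prop` are made up for the example.

```r
library(data.table)

# Example data from the question
dt <- data.table(
  id = 1:10,
  farmtype = c(rep("org", 4), rep("conv", 6)),
  spA = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  spB = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
)

# Long format: one row per farm and species
long <- melt(dt, id.vars = c("id", "farmtype"),
             variable.name = "species", value.name = "occur")

# Drop the non-occurrences, then count occurrences per species and farm type;
# the count ends up in column N
counts <- long[occur == 1, .N, by = .(species, farmtype)]

# Total occurrences per species, repeated on every row of that species
counts[, species_total := sum(N), by = species]

# Proportion of each farm type among the farms reporting the species
counts[, prop := N / species_total]

# Check: the proportions for each species should sum to 1
check <- counts[, .(total_prop = sum(prop)), by = species]
```

For this data, spA is reported by 1 org and 6 conv farms (proportions 1/7 and 6/7) and spB by 3 org farms only. Combinations with no occurrences, such as spB on conv farms, produce no row at all rather than a row with 0.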
If, on the other hand, you wanted to calculate each species' share of all species occurrences reported for organic or conventional farms, you could use this code:
This result means that, for example, conventional farmers reported species "a" about 24.2% of the times that they reported any species. The result can be verified by selecting a species and farmtype and calculating manually as a spot check.
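The 24.2% figure comes from the answer's own seeded data, which is not reproduced here. As a hedged sketch of the same calculation on the `df` from the question (again with data.table, and with illustrative names `long`, `counts`, `farmtype_total` and `frac`), the grouping simply switches from species to farm type:

```r
library(data.table)

# Example data from the question
dt <- data.table(
  id = 1:10,
  farmtype = c(rep("org", 4), rep("conv", 6)),
  spA = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  spB = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
)

# Long format: one row per farm and species
long <- melt(dt, id.vars = c("id", "farmtype"),
             variable.name = "species", value.name = "occur")

# Occurrences per farm type and species
counts <- long[occur == 1, .N, by = .(farmtype, species)]

# All occurrences reported within each farm type
counts[, farmtype_total := sum(N), by = farmtype]

# Share of each species among the occurrences of its farm type
counts[, frac := N / farmtype_total]
```

Here the org farms give 0.25 for spA and 0.75 for spB, and the conv farms give 1 for spA. These values coincide with the percentages the question asks for only because every farm in the example reports exactly one species; with farms reporting several species, a share of occurrences and a share of farms would differ.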