Transform community data collected by two sampling methods into a matrix for the vegan package
I have community data collected by two sampling methods. I want to transform it into a matrix (or two? I'm not sure which is the correct input) for downstream analysis with the vegan package, to compare how well each method performs at detecting community dissimilarity (using Bray-Curtis and ANOSIM).
Here are two example dataframes:
method1 <- data.frame(
  site = c('site1','site1','site1','site1','site2','site2','site2','site2','site3','site3','site3','site3'),
  sampleID = c("site1.net1.2018", "site1.net2.2018", "site1.net1.2019", "site1.net2.2019",
               "site2.net1.2018", "site2.net2.2018", "site2.net1.2019", "site2.net2.2019",
               "site3.net1.2018", "site3.net2.2018", "site3.net1.2019", "site3.net2.2019"),
  year = c("2018", "2018", "2019", "2019", "2018", "2018", "2019", "2019", "2018", "2018", "2019", "2019"),
  species = c('Sp1','Sp2','Sp1','Sp3','Sp4','Sp2','Sp1','Sp2','Sp1','Sp3','Sp4','Sp2'),
  abundance = c(1,7,1,6,2,5,2,1,6,3,2,1),
  method = rep("method1", 12))
method2 <- data.frame(
  site = c('site1','site1','site1','site1','site2','site2','site2','site2','site3','site3','site3','site3'),
  sampleID = c("site1.net1.2018", "site1.net2.2018", "site1.net1.2019", "site1.net2.2019",
               "site2.net1.2018", "site2.net2.2018", "site2.net1.2019", "site2.net2.2019",
               "site3.net1.2018", "site3.net2.2018", "site3.net1.2019", "site3.net2.2019"),
  year = c("2018", "2018", "2019", "2019", "2018", "2018", "2019", "2019", "2018", "2018", "2019", "2019"),
  species = c('Sp2','Sp4','Sp5','Sp1','Sp3','Sp1','Sp6','Sp1','Sp3','Sp4','Sp1','Sp5'),
  abundance = c(2,1,3,3,5,2,10,6,4,2,1,1),
  method = rep("method2", 12))
> head(method1)
   site        sampleID year species abundance  method
1 site1 site1.net1.2018 2018     Sp1         1 method1
2 site1 site1.net2.2018 2018     Sp2         7 method1
3 site1 site1.net1.2019 2019     Sp1         1 method1
4 site1 site1.net2.2019 2019     Sp3         6 method1
5 site2 site2.net1.2018 2018     Sp4         2 method1
6 site2 site2.net2.2018 2018     Sp2         5 method1
It's unclear to me how the data should be formatted in matrix form as input into the vegan package, especially since there are multiple years, samples, and methods. For example, the documentation for vegan shows the following that indicates a separate df is to be used for categorical/environmental variables:
data(dune)
data(dune.env)
dune.dist <- vegdist(dune)
attach(dune.env)
dune.ano <- anosim(dune.dist, Management)
This example has one community matrix covering multiple management types, but it's unclear to me whether I need one matrix for both sampling methods or a separate matrix for each, and how to coalesce the data into a binary presence/absence matrix organized by method, year, and sampleID.
The dune and dune.env data.frames do a pretty good job of illustrating the data structure that you need.

You want a community matrix, equivalent to dune, with sites in rows and species in columns. So something like this: [example table not preserved]

You would want separate rows for each site/year/method. (I can't really understand what the sampleID column in your data means, so the above table may be wrong.)

You also want a separate data.frame of independent variables, equivalent to dune.env, explaining the characteristics of each row in your community matrix (note that dune and dune.env have the same number of rows). So something like this: [example table not preserved] etc...
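A minimal sketch of the two structures described above, built from the question's example data with base R only. It assumes that each sampleID within a method is one sampling unit (the meaning of sampleID is not confirmed in the thread), and the names long, row_id, comm, env and comm_pa are made up for illustration.

```r
# Rebuild the question's example data in long format (24 rows, both methods).
long <- data.frame(
  site      = rep(rep(c("site1", "site2", "site3"), each = 4), times = 2),
  net       = rep(c("net1", "net2"), times = 12),
  year      = rep(rep(c("2018", "2019"), each = 2), times = 6),
  species   = c("Sp1","Sp2","Sp1","Sp3","Sp4","Sp2","Sp1","Sp2","Sp1","Sp3","Sp4","Sp2",
                "Sp2","Sp4","Sp5","Sp1","Sp3","Sp1","Sp6","Sp1","Sp3","Sp4","Sp1","Sp5"),
  abundance = c(1,7,1,6,2,5,2,1,6,3,2,1,
                2,1,3,3,5,2,10,6,4,2,1,1),
  method    = rep(c("method1", "method2"), each = 12)
)
long$sampleID <- paste(long$site, long$net, long$year, sep = ".")

# Assumption: one row of the community matrix per method/sampleID combination.
long$row_id <- paste(long$method, long$sampleID, sep = "_")

# Community matrix (like dune): rows = sampling units, columns = species.
comm <- as.data.frame.matrix(xtabs(abundance ~ row_id + species, data = long))

# Matching table of independent variables (like dune.env), in the same row order.
env <- unique(long[, c("row_id", "site", "year", "method")])
env <- env[match(rownames(comm), env$row_id), ]
rownames(env) <- env$row_id

# Binary presence/absence version of the community matrix.
comm_pa <- (comm > 0) * 1

head(comm)
head(env)
```

Keeping both methods in a single matrix, with method as a column of env, is what lets method be used as a grouping variable later. Splitting into two matrices would only make sense if each method is analysed entirely on its own.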
You can then plan your analysis. You could easily use a function like adonis to test whether there are differences between the communities detected using method1 and method2, while accounting for Site and Year. However, you say you want to investigate "how well each method performs at detecting community dissimilarity". Do you have known communities that you're trying to detect? The exact analysis will depend on your aim.
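As a rough sketch of the kind of test suggested above, the following runs a PERMANOVA and an ANOSIM on the question's example data. It uses adonis2, which recent vegan releases provide in place of adonis as far as I know. The row definition (one row per method/sampleID), the model formula, and the names long, row_id, comm, env, bc, fit and ano are assumptions for illustration, not something the thread confirms. With this toy data every sample holds a single species, so the results only show the mechanics.

```r
library(vegan)

# Rebuild the question's example data in long format (24 rows, both methods).
long <- data.frame(
  site      = rep(rep(c("site1", "site2", "site3"), each = 4), times = 2),
  net       = rep(c("net1", "net2"), times = 12),
  year      = rep(rep(c("2018", "2019"), each = 2), times = 6),
  species   = c("Sp1","Sp2","Sp1","Sp3","Sp4","Sp2","Sp1","Sp2","Sp1","Sp3","Sp4","Sp2",
                "Sp2","Sp4","Sp5","Sp1","Sp3","Sp1","Sp6","Sp1","Sp3","Sp4","Sp1","Sp5"),
  abundance = c(1,7,1,6,2,5,2,1,6,3,2,1,
                2,1,3,3,5,2,10,6,4,2,1,1),
  method    = rep(c("method1", "method2"), each = 12)
)
long$row_id <- paste(long$method, long$site, long$net, long$year, sep = "_")

# Community matrix and matching table of independent variables.
comm <- as.data.frame.matrix(xtabs(abundance ~ row_id + species, data = long))
env  <- unique(long[, c("row_id", "site", "year", "method")])
env  <- env[match(rownames(comm), env$row_id), ]
env$site   <- factor(env$site)
env$year   <- factor(env$year)
env$method <- factor(env$method)

# Bray-Curtis dissimilarities between all sampling units.
bc <- vegdist(comm, method = "bray")

set.seed(1)  # permutation tests are random; fix the seed for repeatability

# PERMANOVA: effect of method after site and year (terms tested sequentially).
fit <- adonis2(bc ~ site + year + method, data = env,
               permutations = 999, by = "terms")

# ANOSIM with method as the grouping variable.
ano <- anosim(bc, env$method, permutations = 999)

print(fit)
print(ano)
```

Because by = "terms" tests terms in order, placing method last asks whether it explains variation beyond site and year. Whether that matches "how well each method performs" depends on the aim, as the answer notes.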