Row names referred to as numbers in analysis (geiger package)
I'm trying to run the tip.disparity function from the geiger package in R.
My data:
Family    Length   Wing     Tail
Alced     2.21416  1.88129  1.66744
Brachypt  2.36734  2.02373  2.03335
Bucco     2.23563  1.91364  1.80675
When I use the function name.check to check that the names in my data match those on my tree, it returns
$data.not.tree
[1] "1" "10" "11" "12" "2" etc
showing that it is referring to the names by number. I've tried converting to a character vector etc., and I've tried running it with data.names=NULL.
I'm simply looking to edit my data frame so that the package matches the names to those in my tree (the tree is in Newick format).
Hope this is clearer.
Thanks
Comments (1)
I believe the clue is in the documentation (?name.check): if you want the program to return the names of the taxa that are included in the data frame but not present in the tree, you either need to assign the corresponding names as row names of your data frame, or specify them separately in the data.names argument. Note that the default row names of a data frame are the character equivalents of the row numbers, which is exactly what you're seeing above. (The rest of this answer was added after the question was edited with more information.)
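The point about default row names can be checked in base R without geiger. The sketch below rebuilds the three rows shown in the question (the object name `traits` is made up for illustration) and then shows why a longer table would produce names like "1" "10" "11" "12" "2": numbers stored as character strings sort alphabetically, which is consistent with the output in the question.

```r
# Hypothetical reconstruction of the data shown in the question;
# the object name `traits` is an assumption for illustration only.
traits <- data.frame(
  Family = c("Alced", "Brachypt", "Bucco"),
  Length = c(2.21416, 2.36734, 2.23563),
  Wing   = c(1.88129, 2.02373, 1.91364),
  Tail   = c(1.66744, 2.03335, 1.80675),
  stringsAsFactors = FALSE
)

# With no row names assigned, R falls back to the row numbers as characters.
rownames(traits)
# [1] "1" "2" "3"

# Row numbers held as character strings sort alphabetically, not numerically.
sorted_names <- sort(as.character(1:12))
head(sorted_names, 5)
# [1] "1"  "10" "11" "12" "2"
```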
R can't guess (or doesn't want to) that the names are contained in the Family element of your data frame. Try setting the row names of the data frame from that column. It's probably better in the long run to set the row names and then also drop Family from the data frame, because you don't want to have Family included in your data set of traits: it's an identifier, not a trait. If you look at the structure of the example data given in the package, you can see that the taxon names are included as row names, not as a column in the data frame itself.
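A minimal sketch of the two options described above, using base R only. The object names (`traits`, `named`, `trait_data`) and the tree file name in the final comment are assumptions for illustration, and the data are just the three rows from the question.

```r
# Hypothetical reconstruction of the data shown in the question.
traits <- data.frame(
  Family = c("Alced", "Brachypt", "Bucco"),
  Length = c(2.21416, 2.36734, 2.23563),
  Wing   = c(1.88129, 2.02373, 1.91364),
  Tail   = c(1.66744, 2.03335, 1.80675),
  stringsAsFactors = FALSE
)

# Option 1: copy the identifiers into the row names and keep the column.
named <- traits
rownames(named) <- as.character(named$Family)

# Option 2: use the identifiers as row names and keep only the numeric
# traits, so that Family is no longer treated as a trait.
trait_data <- traits[, c("Length", "Wing", "Tail")]
rownames(trait_data) <- as.character(traits$Family)

# The taxon names now live in the row names, not in a column.
str(trait_data)
rownames(trait_data)

# Assuming a tree read from a Newick file (file name is hypothetical):
# tree <- ape::read.tree("mytree.nwk")
# geiger::name.check(tree, trait_data)
```

Option 2 leaves a data frame that contains only numeric columns, which matches the layout the answer describes for the package's example data.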
PS: it's not as nice an interface as StackOverflow, but there's a very friendly and active R-for-phylogeny mailing list at [email protected]