Common column between combinations of one column
I have a dataset from my analysis. To interpret the result, I am trying to build a data frame.
The result should look like this:
gene_name | Motif_id_1 | Motif_id_2 | Occurrence | Matched_sequence
Some motif_id values may share a gene_name, and each row of the result should hold a pair of motif_id values (overlap allowed).
I have tried the following code, but the result does not give the combinations within motif_id.
merge_practice <- reshape2::dcast(group_geneid_CT,
                                  motif_id + motif_id ~ gene_name,
                                  value.var = "matched_sequence",
                                  drop = TRUE, fill = 0,
                                  fun.aggregate = length)
If possible, I would like it to be memory- and time-efficient and to depend on fewer packages. Can anyone give me another perspective?
Comments (1)
Created on 2022-03-01 by the reprex package (v2.0.0)
co_occurrent should be 2 if it was found in both motifs, or 1 if it was found in only one motif.
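One way to get a table of this shape with base R only is a self-merge on gene_name. The following is a minimal sketch, not necessarily the approach this answer's reprex used. It assumes group_geneid_CT has the columns gene_name, motif_id and matched_sequence. The toy data, the helper names per_motif and pairs, the ";" separator, and the reading of Occurrence as the total number of matched sequences across the two motifs are all assumptions made for illustration.

```r
# Hypothetical toy input; the real group_geneid_CT is assumed to have
# at least these three columns.
group_geneid_CT <- data.frame(
  gene_name        = c("g1", "g1", "g1", "g2", "g3", "g3"),
  motif_id         = c("A", "A", "B", "A", "B", "C"),
  matched_sequence = c("ACGT", "ACGA", "TTGA", "ACGT", "TTGC", "GGCA"),
  stringsAsFactors = FALSE
)

# 1. One row per (gene_name, motif_id): collapse the sequences ...
per_motif <- aggregate(
  matched_sequence ~ gene_name + motif_id,
  data = group_geneid_CT,
  FUN  = function(s) paste(s, collapse = ";")
)
# ... and count the hits (same formula and data, so same row order).
per_motif$hits <- aggregate(
  matched_sequence ~ gene_name + motif_id,
  data = group_geneid_CT,
  FUN  = length
)$matched_sequence

# 2. Self-merge on gene_name: every motif pair that shares a gene.
pairs <- merge(per_motif, per_motif, by = "gene_name",
               suffixes = c("_1", "_2"))

# 3. Keep each unordered pair once; a motif paired with itself is kept
#    (assumed meaning of "overlap allowed").
m1 <- as.character(pairs$motif_id_1)
m2 <- as.character(pairs$motif_id_2)
pairs <- pairs[m1 <= m2, ]
same  <- as.character(pairs$motif_id_1) == as.character(pairs$motif_id_2)

# 4. Assemble the requested columns.
result <- data.frame(
  gene_name        = pairs$gene_name,
  Motif_id_1       = pairs$motif_id_1,
  Motif_id_2       = pairs$motif_id_2,
  co_occurrent     = ifelse(same, 1L, 2L),
  Occurrence       = ifelse(same, pairs$hits_1,
                            pairs$hits_1 + pairs$hits_2),
  Matched_sequence = ifelse(same, pairs$matched_sequence_1,
                            paste(pairs$matched_sequence_1,
                                  pairs$matched_sequence_2, sep = ";")),
  stringsAsFactors = FALSE
)
print(result)
```

The self-merge grows with the square of the number of motifs per gene, so collapsing to one row per (gene_name, motif_id) before merging keeps the intermediate table as small as possible. If the data are large, the same join can be written with data.table, at the cost of one extra package.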