R: paste a value from one data frame into a new column of another when a key matches
I have two dataframes in R, recurrent and L1HS. I am trying to find a way to do this:
If a sequence in recurrent matches a sequence in L1HS, paste a value from a column in recurrent into a new column in L1HS.
The recurrent dataframe looks like this:
> head(recurrent)
    chr     start       end X  Y level                   unique
1: chr4  56707846  56708347 0 38    03   chr4_56707846_56708347
2: chr1  20252181  20252682 0 37    03   chr1_20252181_20252682
3: chr2 224560903 224561404 0 37    03 chr2_224560903_224561404
4: chr5 131849595 131850096 0 36    03 chr5_131849595_131850096
5: chr7  46361610  46362111 0 36    03   chr7_46361610_46362111
6: chr1  20251169  20251670 0 36    03   chr1_20251169_20251670
The L1HS dataset contains many columns containing genetic sequence basepairs and a column "Sequence" that should hopefully have some matches with "unique" in the recurrent data frame, like so:
> head(L1HS$Sequence)
"chr1_35031657_35037706"
"chr1_67544575_67550598"
"chr1_81404889_81410942"
"chr1_84518073_84524089"
"chr1_87144764_87150794"
I know how to search for matches using
test <- recurrent$unique %in% L1HS$Sequence
to get the Booleans:
> head(test)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
But I have a couple of problems from here. If the sequence is found, I want to copy the "level" value from the recurrent dataset to the L1HS dataset in a new column. For example, if the sequence "chr4_56707846_56708347" from the recurrent data was found in the full-length data, I'd like the full-length data frame to look like:
Sequence level other_columns
chr4_56707846_56708347 03 gggtttcatgaccc....
I was thinking of trying something like:
for (i in L1HS){
if (recurrent$unique %in% L1HS$Sequence{
L1HS$level <- paste(recurrent$level[i])}
}
but of course this isn't working and I can't figure it out.
I am wondering what the best approach is here! I'm wondering if merge/intersect/apply might be easier/better, or just what best practice might look like for a somewhat simple question like this. I've found some similar examples for Python/pandas, but am stuck here.
Thanks in advance!
Comments (1)
You can do a simple left_join with dplyr to add level to L1HS, or use merge from base R.
*Note: Either approach will still retain all the columns in L1HS. I just didn't create any additional columns in my example data.
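The code blocks from this answer did not survive on this page, so here is a minimal sketch of the join it describes rather than the answerer's original code. The two toy data frames are made up for illustration (in particular the other_columns values are placeholders, not real sequence data); only the column names unique, level, and Sequence come from the question.

```r
# Toy data modelled on the question; all values are illustrative assumptions.
recurrent <- data.frame(
  unique = c("chr4_56707846_56708347", "chr1_20252181_20252682"),
  level  = c("03", "03"),
  stringsAsFactors = FALSE
)
L1HS <- data.frame(
  Sequence      = c("chr4_56707846_56708347", "chr1_35031657_35037706"),
  other_columns = c("gggtttcatgaccc", "acgtacgt"),  # placeholder strings
  stringsAsFactors = FALSE
)

# Base R merge: all.x = TRUE keeps every row of L1HS (a left join).
# Rows with no match in recurrent get NA in level.
# Note that merge() sorts the result by the key column by default.
merged <- merge(L1HS, recurrent[, c("unique", "level")],
                by.x = "Sequence", by.y = "unique", all.x = TRUE)

# dplyr equivalent, run only if dplyr is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  joined <- dplyr::left_join(L1HS, recurrent[, c("unique", "level")],
                             by = c("Sequence" = "unique"))
}

# Lookup without a join: match() returns, for each L1HS$Sequence, the row
# position of its first occurrence in recurrent$unique (NA when absent).
# This keeps L1HS in its original row order.
L1HS$level <- recurrent$level[match(L1HS$Sequence, recurrent$unique)]
```

One thing to check on real data: if recurrent$unique contains duplicates, merge and left_join return one row per match (so L1HS rows can multiply), whereas match() silently uses only the first match.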