Getting DNA substrings in their original order
I'd like to extract certain substrings from long DNA sequences.
For example, given:
1/ATXGAAATTXXGGAAGGGGTGG
2/AATXGAAGGAAGGAAGGGGATATTX
3/AAAAAATTXXGGAAGGGGXTTTA
4/AAAATTXXATAXXGGAAGGGGXTXG
5/ATTATTGTTXAXTATTT
the output should be:
1/TXG - TTXX
2/TXG -
3/ - TTXX
4/TTXX - TXG
5/ -
I tried the following regex pattern:
(TXG|TTXX)
It works, and the results are put in a list, but I don't know how to retrieve the order in which each result appeared in the original sequence. That is, whether TTXX and TXG appear first and second respectively, as in sequence 4, or second and first, as in sequence 1. The 2nd and 3rd results are harder, because the match-xx function call doesn't return the index of the substring within the sequence it was taken from. Thank you for your insights.
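To make the problem concrete, here is a minimal sketch in Python. Python is an assumption, since the question does not name a language, and `motifs_in_order` is a hypothetical helper name. It uses `re.finditer`, which yields match objects that carry their start position, so the order of appearance comes for free.

```python
import re

# Same alternation as in the question; the capture group is not needed here.
PATTERN = re.compile(r"TXG|TTXX")

def motifs_in_order(sequence):
    """Return (start_index, motif) pairs in left-to-right order of appearance."""
    return [(m.start(), m.group()) for m in PATTERN.finditer(sequence)]

sequences = {
    1: "ATXGAAATTXXGGAAGGGGTGG",
    2: "AATXGAAGGAAGGAAGGGGATATTX",
    3: "AAAAAATTXXGGAAGGGGXTTTA",
    4: "AAAATTXXATAXXGGAAGGGGXTXG",
    5: "ATTATTGTTXAXTATTT",
}

for label, seq in sequences.items():
    # Each hit keeps its index, so the original order is never lost.
    print(label, motifs_in_order(seq))
```

For sequence 4 this gives `[(4, 'TTXX'), (22, 'TXG')]`, and for sequence 1 it gives `[(1, 'TXG'), (7, 'TTXX')]`, so the two orderings can be told apart by position.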
How about:
output:
What if you put 2 matching functions?
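One possible reading of this suggestion, sketched in Python as an assumption: search for each motif separately, keep the index each search returns, and sort by that index. `order_by_position` is a hypothetical name, and `str.find` only reports the first occurrence of each motif.

```python
def order_by_position(sequence, motifs=("TXG", "TTXX")):
    """Run one search per motif and return the motifs sorted by first position."""
    found = []
    for motif in motifs:
        index = sequence.find(motif)  # str.find returns -1 when the motif is absent
        if index != -1:
            found.append((index, motif))
    # Sorting the (index, motif) pairs restores the order of appearance.
    return [motif for _, motif in sorted(found)]

print(order_by_position("AAAATTXXATAXXGGAAGGGGXTXG"))
print(order_by_position("ATXGAAATTXXGGAAGGGGTGG"))
```

This keeps the two searches independent, at the cost of scanning the sequence once per motif.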