Getting DNA substrings in their original order
I'd like to extract specific substrings from long DNA sequences.
For example, given:
1/ATXGAAATTXXGGAAGGGGTGG
2/AATXGAAGGAAGGAAGGGGATATTX
3/AAAAAATTXXGGAAGGGGXTTTA
4/AAAATTXXATAXXGGAAGGGGXTXG
5/ATTATTGTTXAXTATTT
the output should be:
1/TXG - TTXX
2/TXG -
3/ - TTXX
4/TTXX - TXG
5/ -
I tried the following regex pattern:
(TXG|TTXX)
It works, and the results are put in a list, but I don't know how to recover the order in which each match appeared in the original sequence. That is, whether TTXX and TXG appear first and second respectively, as in sequence 4, or second and first, as in sequence 1. Results 2 and 3 are harder, because the match-xx function call doesn't return the index of the substring within the sequence it was taken from. Thank you for your insights.
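To make the question concrete, here is a minimal sketch in Python (the language is an assumption, since the post does not name one). It uses `re.finditer`, which yields match objects in left-to-right order and exposes each match's start index. The helper name `ordered_hits` and the `sequences` dictionary are hypothetical, introduced only for illustration.

```python
import re

# The pattern from the question; the capturing group is not needed here.
MOTIF = re.compile(r"TXG|TTXX")

def ordered_hits(seq):
    """Return (start_index, motif) pairs in the order they occur in seq."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(seq)]

# Sample data copied from the question; the labels are the sequence numbers.
sequences = {
    1: "ATXGAAATTXXGGAAGGGGTGG",
    2: "AATXGAAGGAAGGAAGGGGATATTX",
    3: "AAAAAATTXXGGAAGGGGXTTTA",
    4: "AAAATTXXATAXXGGAAGGGGXTXG",
    5: "ATTATTGTTXAXTATTT",
}

for label, seq in sequences.items():
    hits = ordered_hits(seq)
    # Join the motifs in order of appearance; the positions stay available in hits.
    print(f"{label}/" + " - ".join(motif for _, motif in hits))
```

This prints the motifs in order of appearance only. It does not reproduce the blank left or right slot shown for sequences 2 and 3, because the rule for placing a single match is not stated in the question.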
Comments (3)
How about:
output:
What if you use two matching functions?
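One possible reading of this suggestion, sketched in Python (an assumption, since the thread does not name a language): search for each motif separately and compare the positions found. The helper name `order_by_find` is hypothetical. Note that `str.find` reports only the first occurrence of each motif and returns -1 when the motif is absent.

```python
def order_by_find(seq, first="TXG", second="TTXX"):
    """Search for each motif on its own, then order the ones found by position."""
    # One independent search per motif; -1 means the motif is not present.
    positions = {motif: seq.find(motif) for motif in (first, second)}
    found = [motif for motif, pos in positions.items() if pos != -1]
    # Sort the motifs that were found by where they start in the sequence.
    return sorted(found, key=positions.get)

print(order_by_find("ATXGAAATTXXGGAAGGGGTGG"))     # sequence 1 from the question
print(order_by_find("AAAATTXXATAXXGGAAGGGGXTXG"))  # sequence 4 from the question
```

Compared with a single alternation pattern, this keeps the two searches independent, at the cost of ignoring repeated occurrences of the same motif.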