How to extract several marked strings from a line (Python)
My Friends,
I have spent quite some time on this one, but I cannot yet figure out a better way to do it. I am coding in Python, by the way.
So, here is a line of text in a file I am working with, for example:
">ref|ZP_01631227.1| 3-dehydroquinate synthase [Nodularia spumigena CCY9414]..."
How can I extract the two strings "ZP_01631227.1" and "Nodularia spumigena CCY9414" from the line?
The pairs of "| |" and brackets are like markers so we know we want to get the strings in between the two...
I guess I could loop over all the characters in the line and do it the hard way, but that takes a lot of time. Is there a Python library or some other smart way to do this nicely?
Thanks to all!
One concise alternative is a regular expression (for some reason they have a bad rep in the Python community, but they do provide conciseness and power for simple text handling):
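The answer's own code is not preserved here, so the following is only a minimal sketch of what such a regular expression could look like. It assumes every line has the same shape as the example: the ID sits between the first pair of `|` characters and the organism name sits inside the first `[...]` that follows. The names `pattern` and `extract_fields` are illustrative, not from the original answer.

```python
import re

line = ">ref|ZP_01631227.1| 3-dehydroquinate synthase [Nodularia spumigena CCY9414]..."

# Group 1: text between the first pair of pipes.
# Group 2: text inside the first square brackets after that.
pattern = re.compile(r"\|([^|]+)\|.*?\[([^\]]+)\]")


def extract_fields(text):
    """Return (id, organism) if both markers are found, else None."""
    match = pattern.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(extract_fields(line))
```

Using `[^|]+` and `[^\]]+` instead of `.+` keeps each group from running past its closing marker. If the real file has lines with more pipe-delimited fields or nested brackets, the pattern would need adjusting.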