Building a tree with BioPython Phylo

Posted on 2024-09-29 22:36:43

I'm trying to build a tree with BioPython's Phylo module.
What I have so far is shown in this image: alt text

Each name is a four-digit number followed by - and a second number; the second number is how many times that sequence is represented. For example, for 1578 - 22, the node should represent 22 sequences.
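
To make the naming convention concrete, here is a minimal sketch of splitting such a label into its name and its count. It assumes labels look like `1578-22` or `1578 - 22`; `parse_label` is a hypothetical helper introduced only for illustration and is not part of BioPython.

```python
import re

# Hypothetical helper, not part of BioPython: split "name-count" labels.
# Assumes both parts are plain digits, optionally padded with spaces.
_LABEL_RE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_label(label):
    """Return (name, count) for a label such as '1578-22' or '1578 - 22'."""
    match = _LABEL_RE.match(label)
    if match is None:
        raise ValueError("unexpected label format: %r" % label)
    return match.group(1), int(match.group(2))


name, count = parse_label("1578 - 22")
print(name, count)
```

Rejecting labels that do not match keeps a malformed name from silently producing a wrong node size.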

The file with the aligned sequences: file
The file with the distances used to build the tree: file

So far I know how to change the node sizes. Each node has a different size, and it is easy to build an array of the different values:

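    from numpy import array

    # Map each label found on a '>' header line to its trailing count.
    list_size = {}
    with open(MEDIA_ROOT + "groupsnp.txt") as fh:  # MEDIA_ROOT is defined elsewhere
        for line in fh:
            if '>' in line:
                labels = line.split('>')
                # Whitespace-separated tokens after the last '>'
                label = labels[-1].split()
                # Whitespace-separated tokens after the last '-'
                size = line.split('-')[-1].split()
                for lab in label:
                    for number in size:
                        list_size[lab] = int(number)

    # list() is needed on Python 3, where values() returns a view
    a = array(list(list_size.values()))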

But the order of the array is arbitrary. I would like to assign the correct size to each node, so I tried this:

    for elem in list_size.keys():
        if labels == elem:
            Phylo.draw_graphviz(tree_xml, prog="neato", node_size=a)

But nothing appears when I use the if statement.

Is there any way of doing this?

I would really appreciate any help!

Thanks, everybody.

Comments (1)

偷得浮生 2024-10-06 22:36:43

I finally got this working. The basic premise is to use the labels/nodelist to build node_sizes, so that the sizes and the nodes correlate properly. I'm sure I'm missing some important options to make the tree look 100% right, but the node sizes appear to be showing up properly.

    # Basically a stripped-down rewrite of Phylo.draw_graphviz.
    # Written against an older networkx API: newer releases replace
    # discard_old_labels with label_attribute and move pygraphviz_layout
    # under networkx.nx_agraph.
    import networkx
    import pylab
    from Bio import Phylo


    # Taken from draw_graphviz: yield (node, label) for every named node.
    def get_label_mapping(G, selection):
        for node in G.nodes():
            if (selection is None) or (node in selection):
                try:
                    label = str(node)
                    if label not in (None, node.__class__.__name__):
                        yield (node, label)
                except (LookupError, AttributeError, ValueError):
                    pass


    kwargs = {}
    tree = Phylo.read('tree.dnd', 'newick')
    G = Phylo.to_networkx(tree)
    Gi = networkx.convert_node_labels_to_integers(G, discard_old_labels=False)

    node_sizes = []
    labels = dict(get_label_mapping(G, None))
    kwargs['nodelist'] = list(labels.keys())

    # Create the node sizes from the labels, because the labels are also used
    # for the nodelist; this way the sizes should line up with the nodes.
    for label in labels.keys():
        if str(label) != "Clade":
            num = label.name.split('-')
            # The times 50 is just a guess at what would look best.
            size = int(num[-1]) * 50
            node_sizes.append(size)

    kwargs['node_size'] = node_sizes
    posi = networkx.pygraphviz_layout(Gi, 'neato', args='')
    posn = dict((n, posi[Gi.node_labels[n]]) for n in G)

    networkx.draw(G, posn, labels=labels, node_color='#c0deff', **kwargs)

    pylab.show()

Resulting Tree
alt text
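
The idea of deriving the sizes from the node list can be sketched without any plotting library. The following is only an illustration under stated assumptions: node names have the form `1578-22`, unnamed internal nodes are represented by `None`, and `sizes_for_nodes` with its `scale` and `default` parameters is a hypothetical helper, not part of BioPython or networkx.

```python
def sizes_for_nodes(node_names, scale=50, default=0):
    """Build one size per node, in the same order as node_names.

    node_names: names such as '1578-22', or None for unnamed
    internal nodes (an assumption made for this sketch).
    scale: multiplier applied to the count; 50 mirrors the guess
    used in the answer's code.
    """
    sizes = []
    for name in node_names:
        if name is None:
            # Keep a placeholder so sizes and nodes stay index-aligned.
            sizes.append(default)
            continue
        count = int(name.split('-')[-1])
        sizes.append(count * scale)
    return sizes


nodelist = ["1578-22", None, "0042-3"]
node_sizes = sizes_for_nodes(nodelist)
print(node_sizes)  # [1100, 0, 150]
```

Because each size is appended in the same iteration that visits its node, the two lists cannot drift out of order, which is the property the `nodelist` and `node_size` arguments rely on.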
