How to visualize gene networks and cluster groups of genes?
I'm working with biological data - namely groups of genes. For example:
group 1: geneA geneB geneC
group 2: geneD geneE
group 3: geneF geneG geneH
For each pair of genes, geneX and geneY, I have a score telling how similar the two genes are. (Actually, I have two scores, since I used BLAST, which is 'directional': I first searched geneX against all the other genes, then geneY against all the other genes, so I have two geneX--geneY scores, but I guess I can take the lower score of the two, or the average.)
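To illustrate the "lower score or average" idea, here is a minimal sketch in Python. It assumes the directional hits are available as (query, hit, score) triples; the function names and the example values are hypothetical and not real BLAST output. A pair reported in only one direction simply keeps its single score here, which is an assumption you may want to change.

```python
from collections import defaultdict

def symmetrize(directed_scores, combine=min):
    """Collapse directional (query, hit, score) triples into one score per unordered pair."""
    by_pair = defaultdict(list)
    for query, hit, score in directed_scores:
        if query == hit:
            continue  # ignore self-hits
        by_pair[frozenset((query, hit))].append(score)
    # Key each pair by its sorted gene names so lookups are order-independent.
    return {tuple(sorted(pair)): combine(scores) for pair, scores in by_pair.items()}

def mean(values):
    return sum(values) / len(values)

# Hypothetical example values, not real BLAST output.
directed = [
    ("geneA", "geneB", 120.0),
    ("geneB", "geneA", 100.0),
    ("geneA", "geneC", 50.0),  # only one direction was reported
]

lowest = symmetrize(directed, combine=min)
average = symmetrize(directed, combine=mean)
```

Taking the minimum is the more conservative choice, since a pair only scores high if both directions agree.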
So, let's suppose I have only one score for each pair of genes. My data can be viewed as an undirected graph in which each edge has a score attached to it.
Now, what I would like to do is:
1. Visualize my data interactively: being able to click on gene nodes and open a link attached to them, show only edges above/below some threshold, control how the network is "spread", etc.
2. Cluster together groups which are similar, i.e. groups that have similar genes.
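The two goals above can be sketched roughly as follows, using only the Python standard library. This assumes one score per gene pair, keyed by a sorted tuple of gene names. Defining group similarity as the average cross-group score and merging groups by single linkage are illustrative choices, not the only reasonable ones, and all names and values are hypothetical.

```python
from itertools import combinations

def filter_edges(scores, threshold):
    """Keep only gene pairs whose score is at or above the threshold."""
    return {pair: s for pair, s in scores.items() if s >= threshold}

def group_similarity(group_a, group_b, scores):
    """Average score over all cross-group gene pairs; missing pairs count as 0."""
    total = 0.0
    for a in group_a:
        for b in group_b:
            total += scores.get(tuple(sorted((a, b))), 0.0)
    return total / (len(group_a) * len(group_b))

def cluster_groups(groups, scores, threshold):
    """Single-linkage merge: groups linked by a similarity >= threshold share a cluster."""
    parent = {name: name for name in groups}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for g1, g2 in combinations(groups, 2):
        if group_similarity(groups[g1], groups[g2], scores) >= threshold:
            parent[find(g1)] = find(g2)

    clusters = {}
    for name in groups:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values())

groups = {
    "group 1": ["geneA", "geneB", "geneC"],
    "group 2": ["geneD", "geneE"],
    "group 3": ["geneF", "geneG", "geneH"],
}
# Hypothetical pair scores, keyed by sorted gene names.
scores = {
    ("geneA", "geneD"): 90.0,
    ("geneB", "geneE"): 80.0,
    ("geneC", "geneD"): 70.0,
    ("geneA", "geneF"): 5.0,
}

strong_edges = filter_edges(scores, 75.0)
clusters = cluster_groups(groups, scores, 30.0)
```

The filtered edge set is what an interactive viewer would draw for a given threshold; the clustering step works on whole groups rather than on individual genes.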
Any ideas on how I can do that? I guess it's basic clustering, and I would appreciate any hints on packages/software that could help here.
Thank you.
Comments (2)
You'll probably get better responses if you ask this over at BioStar, the bioinformatics stackexchange.
Specifically, many of the answers in this thread might be relevant:
Which is the best software to represent biological pathways in a directed graph (network) ?
You can try CLUTO. You will have to transform your triples (gene_1, gene_2, similarity) into a similarity matrix and cluster it with 'scluster'.
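The triples-to-matrix step might look something like the sketch below. It only builds a dense symmetric matrix in memory; the function name, the self-similarity value on the diagonal, and the example numbers are assumptions for illustration. It does not write CLUTO's input file, so check the CLUTO manual for the exact layout 'scluster' expects.

```python
def triples_to_matrix(triples, self_similarity=1.0):
    """Turn (gene_1, gene_2, similarity) triples into a dense symmetric matrix."""
    genes = sorted({g for a, b, _ in triples for g in (a, b)})
    index = {g: i for i, g in enumerate(genes)}
    n = len(genes)
    # Pairs that never appear in the triples keep a similarity of 0.
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = self_similarity  # assumed value for a gene against itself
    for a, b, sim in triples:
        i, j = index[a], index[b]
        matrix[i][j] = sim
        matrix[j][i] = sim  # mirror the entry so the matrix stays symmetric
    return genes, matrix

# Hypothetical similarities for illustration only.
triples = [
    ("geneA", "geneB", 0.9),
    ("geneA", "geneC", 0.4),
    ("geneB", "geneC", 0.7),
]
genes, matrix = triples_to_matrix(triples)
```

Row and column i of the matrix correspond to genes[i], so keep the gene list around to map cluster assignments back to gene names.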