Assessing the significance of BLASTn scores?

Posted on 2024-08-11 23:26:32, 203 characters, 4 views, 0 comments

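I am running standalone command-line BLAST to align many query sequences against a large nucleotide database. I can adjust the blastn command-line options to change various parameters, such as the match/mismatch scores.

What I would like to know: for the "bit score" that blastn outputs, is it meaningful to compare the bit scores of alignments that have the same query and database sequences but were produced with different match/mismatch parameters? I am trying to assess how well BLAST performs under various parameter values, and I want to make sure everything is compared on an equal footing. Thanks.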

醉生梦死 2024-08-18 23:26:32

It's not clear to me why you think that comparing bit scores will give you an insight as to how well BLAST is performing. The usual method for doing

Unfortunately, much of the theoretical work on BLAST and other alignment programs is based on local, ungapped alignments, and that theory has only been extended to gapped alignments empirically. In particular, the bit scores are calculated like this:

S' = ( lambda * S - ln(K) ) / ln(2)

In the formula above, K and lambda are constants for your substitution matrix, S is the score (sum of substitution and gap scores), and S' is the bit score. This means that your bit scores will certainly change as a result of varying the gap open/gap extend parameters, which means that your comparison is invalid. This is an unfortunate result of the fact that there is little theory about gapped alignments, so the optimal gap scores for a given system have to be measured empirically.
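To make the formula concrete, here is a minimal sketch in Python. The function name `bit_score` and the two (lambda, K) pairs are assumptions chosen only for illustration; the values that apply to a real search are the ones BLAST prints in the Lambda/K statistics section of its report for the scoring parameters you actually used.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S into a bit score S'.

    Implements S' = (lambda * S - ln(K)) / ln(2).
    """
    return (lam * raw_score - math.log(k)) / math.log(2)

# Illustrative (lambda, K) pairs, NOT taken from any particular BLAST run.
# Read the real values from your own BLAST report.
scoring_systems = {
    "parameter set A": (1.28, 0.46),
    "parameter set B": (0.625, 0.41),
}

raw = 100  # the same raw score S under both parameter sets
for name, (lam, k) in scoring_systems.items():
    print(f"{name}: raw={raw} bits={bit_score(raw, lam, k):.1f}")
```

The same raw score maps to different bit scores once lambda and K differ, and the raw score itself also changes when the match/mismatch or gap scores change, so both inputs to the formula move at once.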

Because bit scores aren't comparable across parameter sets, I suggest you do your assessment based on an alternate set of data that doesn't involve the alignment scores. For example, if I'm interested in the optimal gap opening/gap extension parameters for comparing protein sequences, I can look at proteins of known structure and assess each parameter set based on its ability to make alignments that make structural sense. This avoids comparing the alignment scores entirely, which is good because comparing bit scores on their own isn't obviously useful.
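For a nucleotide search, one score-independent assessment along these lines is to check how well each parameter set recovers a trusted set of query/subject pairs. The sketch below is only an assumption about how that could look: the reference pairs, the parameter-set labels, and the hit sets are hypothetical placeholders, and in practice the hits would be parsed from your own blastn output.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of predicted pairs against trusted pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical reference set of (query, subject) pairs known to be related.
truth = {("q1", "s1"), ("q2", "s2"), ("q3", "s3")}

# Hypothetical hits reported under two parameter sets (placeholder data).
hits_by_params = {
    "parameter set A": {("q1", "s1"), ("q2", "s2"), ("q2", "s9")},
    "parameter set B": {("q1", "s1"), ("q2", "s2"), ("q3", "s3"), ("q3", "s7")},
}

for label, hits in hits_by_params.items():
    p, r = precision_recall(hits, truth)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Because the comparison uses only which pairs were found, not their scores, it stays valid when the scoring parameters differ between runs.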
