Need help with peak signal detection in Perl
Hi everyone, I have some intensity values from images of yeast colony plates. I need to be able to find the peak values among them. Below is an example image showing how the values look when graphed.
Example of some of the values
5.7 5.3 8.2 16.5 34.2 58.8 **75.4** 75 65.9 62.6 58.6 66.4 71.4 53.5 40.5 26.8 14.2 8.6 5.9 7.7 14.9 30.5 49.9 69.1 **75.3** 69.8 58.8 57.2 56.3 67.1 69 45.1 27.6 13.4 8 5
These values show two peaks, at 75.4 and 75.3; you can see that the values increase and then decrease. The change is not always the same.
Graph of intensity values
http://lh4.ggpht.com/_aEDyS6ECO8s/THKTLgDPhaI/AAAAAAAAAio/HQW7Ut-HBhA/s400/peaks.png
From research, one of the things I am thinking of doing is to store each of the groups (i.e. mountains) in a hash and then look for the largest value in each group. One of the issues I am seeing, though, is how to determine the boundaries of each group.
Here is a link to the code that I have so far:
http://paste-it.net/public/y485822/
Here is a link to a complete data set:
http://paste-it.net/public/ub121b4/
I am writing my code in Perl. Any help would be greatly appreciated. Thank you.
Comments (3)
You need to decide how local you want the peaks to be. The approach here finds peaks and troughs within broad regions of the data.
}
Output:
Note that this considers the first value a local maximum, and also the values 71.4 and 69. I'm not sure how you are distinguishing which ones you want included.
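To illustrate the "how local" choice, here is a minimal sketch. It is not the code from the answer above; `find_peaks` and its `$window` parameter are hypothetical names made up for this example. A value counts as a peak when nothing within `$window` positions on either side is larger, with ties going to the earlier index.

```perl
use strict;
use warnings;

# Return the indices of values that are the largest within $window
# positions on either side. Ties go to the earliest index.
sub find_peaks {
    my ($values, $window) = @_;
    my @peaks;
    for my $i (0 .. $#$values) {
        # Clamp the window to the ends of the array.
        my $lo = $i - $window < 0          ? 0          : $i - $window;
        my $hi = $i + $window > $#$values ? $#$values : $i + $window;
        my $is_peak = 1;
        for my $j ($lo .. $hi) {
            next if $j == $i;
            # A larger neighbour, or an equal one earlier in the window,
            # disqualifies position $i.
            if ($values->[$j] > $values->[$i]
                || ($j < $i && $values->[$j] == $values->[$i])) {
                $is_peak = 0;
                last;
            }
        }
        push @peaks, $i if $is_peak;
    }
    return @peaks;
}

# Sample values from the question.
my @data = qw(5.7 5.3 8.2 16.5 34.2 58.8 75.4 75 65.9 62.6 58.6 66.4
              71.4 53.5 40.5 26.8 14.2 8.6 5.9 7.7 14.9 30.5 49.9 69.1
              75.3 69.8 58.8 57.2 56.3 67.1 69 45.1 27.6 13.4 8 5);

for my $window (1, 6) {
    my @idx = find_peaks(\@data, $window);
    print "window $window: ", join(' ', map { $data[$_] } @idx), "\n";
}
```

On the sample values, a window of 1 reports 5.7, 75.4, 71.4, 75.3 and 69, while a window of 6 keeps only 75.4 and 75.3. The right window depends on how far apart the colonies are in your data.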
Do you have a control data set? If so, I'd recommend normalizing your data using, say, a simple log ratio between yeast intensities and control images.
You could then use the Perl port of ChiPOTle to grab the significant peaks, which sounds far more robust than searching for local/global maxima, etc.
ChiPOTle "is a peak-finding algorithm used to analyze ChIP-chip microarray data", but I've used it successfully in many other applications (like ChIP-seq, which admittedly is closer to its original purpose than your case is).
The resulting log(yeast/control) negative values would be used to build a Gaussian background model for significance estimation. The algorithm then uses the false-discovery rate for multiple testing correction.
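A rough sketch of the normalization and background idea follows. This is not ChiPOTle itself, and it stops short of the false-discovery-rate step. The `@yeast` and `@control` values are invented toy numbers, the `$cutoff` of 3 is an arbitrary threshold, and treating the negative ratios as zero-centred noise is an assumption of this sketch.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Toy values invented for illustration only; not from the question's data set.
my @yeast   = (6, 5, 9, 40, 76, 38, 8, 6);
my @control = (7, 6, 8, 8, 7, 8, 9, 7);

# log2(yeast/control) for each position.
my @ratio = map { log($yeast[$_] / $control[$_]) / log(2) } 0 .. $#yeast;

# Assumption: negative ratios are pure noise, symmetric about zero,
# so their root mean square estimates the background standard deviation.
my @neg = grep { $_ < 0 } @ratio;
die "no negative ratios to estimate the background from\n" unless @neg;
my $sd = sqrt(sum(map { $_ ** 2 } @neg) / @neg);

# Score each position against the zero-centred background.
my @z = map { $_ / $sd } @ratio;

my $cutoff = 3;    # hypothetical threshold, tune for real data
my @hits = grep { $z[$_] > $cutoff } 0 .. $#z;

printf "position %d: log2 ratio %.2f, z = %.1f\n", $_, $ratio[$_], $z[$_]
    for @hits;
```

With these toy numbers, positions 3 to 5 stand out from the background. On real data you would convert the scores to p-values and apply a multiple-testing correction, as the answer describes, instead of using a fixed cutoff.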
Here's the original paper.