A question about volcano plots
I was trying to make a volcano plot from some real data, plotting log2(ratio) against Z-value significance. However, the points show far less scatter than in a 'normal' volcano plot, and I'm getting a sharp 'V'-shaped plot.
I understand that scatter occurs when the same X value has different Y values. But what am I missing here?
The plot looks strange: http://img402.imageshack.us/i/volcanoi.jpg/
The data (ratio) is available from pastebin or the attached file:
http://pastebin.com/m2Jss3qF
The R code (am I doing something wrong here?):
data <- read.table("data.txt",header=FALSE)
ratio <- data$V1
ratio.mean <- mean(ratio)
ratio.sd <- sd(ratio)
ratio.log <- log2(ratio)
z <- (ratio-ratio.mean)/(ratio.sd)
z.sig <- 2*pnorm(-abs(z))
z.tsig <- 2*pt(-abs(z),df=length(ratio)-1) ## sig from t-dist
op <- par(mfrow=c(1,4))
plot(ratio.log,-log10(z.sig))
plot(ratio.log, -log10(z.tsig))
plot(ratio.log,z.sig)
plot(ratio,z)
par(op)
Comments (1)
I'm a bit confused about what your data means and why you are generating the p-values this way.
Anyway, a volcano plot usually has the fold difference on the x-axis and the p-value on the y-axis. You are getting a strange shape because you are essentially generating the p-value for each data point from how far that point is from the mean of the data (which is a bit odd).
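For contrast, here is a minimal sketch of how the p-values in a conventional volcano plot are often obtained: one test per feature across replicate measurements. The simulated data, the group sizes, and the choice of Welch's t-test are all assumptions made for illustration; none of this comes from your data.

```r
# Hypothetical simulated data: 500 features, 3 replicates per group,
# values already on the log2 scale. This is NOT the poster's data.
set.seed(1)
n.features <- 500
n.rep <- 3
control <- matrix(rnorm(n.features * n.rep, mean = 10, sd = 0.5),
                  nrow = n.features)
shift <- rnorm(n.features, mean = 0, sd = 1)  # true log2 fold change per feature
# Adding a length-n.features vector to the matrix shifts each row by its own value
treated <- matrix(rnorm(n.features * n.rep, mean = 10, sd = 0.5),
                  nrow = n.features) + shift

# x-axis: observed log2 fold change (difference of group means on the log2 scale)
log2.fc <- rowMeans(treated) - rowMeans(control)

# y-axis: one p-value per feature from a Welch t-test across the replicates
p.value <- sapply(seq_len(n.features), function(i) {
  t.test(treated[i, ], control[i, ])$p.value
})

plot(log2.fc, -log10(p.value),
     xlab = "log2(fold change)", ylab = "-log10(p-value)")
```

Here the p-value depends on the variability within each feature as well as on the fold change, so features with similar fold changes end up with different p-values. That is where the scatter in a typical volcano plot comes from.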
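Consider the data points above the mean. As a point gets closer to the mean, its p-value monotonically increases, while its fold change monotonically decreases. Both axes are therefore functions of the same single ratio, so each x value has exactly one y value and the points trace a curve instead of scattering.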
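This can be checked directly. The sketch below applies the same calculation as in your code to simulated ratios (the log-normal draw is an assumption standing in for data.txt) and tests whether -log10(p) is monotonic in log2(ratio) on each side of the mean.

```r
# Simulated stand-in for data.txt (assumption: positive, roughly log-normal ratios)
set.seed(42)
ratio <- rlnorm(1000, meanlog = 0, sdlog = 0.5)

# Same calculation as in the question
z <- (ratio - mean(ratio)) / sd(ratio)
z.sig <- 2 * pnorm(-abs(z))
ratio.log <- log2(ratio)
y <- -log10(z.sig)

# Sort by x, then check that y never decreases above the mean and never
# increases below it (a tiny tolerance absorbs floating-point noise).
ord <- order(ratio.log)
above <- ratio[ord] > mean(ratio)
y.sorted <- y[ord]
tol <- 1e-12
monotone.above <- all(diff(y.sorted[above]) >= -tol)
monotone.below <- all(diff(y.sorted[!above]) <= tol)
print(c(above = monotone.above, below = monotone.below))

plot(ratio.log, y, xlab = "log2(ratio)", ylab = "-log10(p)")
```

If both checks hold, every x value maps to a single y value on each arm, which is exactly the sharp 'V' in your plot.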