Histogram with logarithmic scale and custom breaks

Posted 2024-07-30 00:06:49 · 454 words · 6 views · 0 comments

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:

hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))

This gives me a histogram, but the count between 0 and 1 is so large (about a million values more than the other bins) that you can barely make out any of the other bars.

Then I've tried doing:

mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")

It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.

Comments (7)

初与友歌 2024-08-06 00:06:49

It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
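
A minimal sketch of the frequency-polygon workaround, in base R. The vector `v3` is simulated stand-in data (an assumption, since the real `mydata$V3` is not shown); the breaks are the ones from the question. Empty bins are dropped because a zero count has no position on a log axis.

```r
# Simulated stand-in for mydata$V3 (assumption: the real data is not available)
set.seed(1)
v3 <- c(runif(100000, 0, 1), runif(500, 1, 5), runif(20, 5, 25))

brks <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(v3, breaks = brks, plot = FALSE)  # compute the bins without drawing

# Keep only non-empty bins: log(0) is -Inf and cannot be plotted
keep <- h$counts > 0

# Frequency polygon: counts at the bin midpoints, joined by lines, log y-axis
plot(h$mids[keep], h$counts[keep], log = "y", type = "b", pch = 20,
     xlab = "V3", ylab = "Frequency (log scale)")
```

Because points and lines are not anchored at zero, the log axis causes no trouble here, at the cost of losing the bar appearance.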

酸甜透明夹心 2024-08-06 00:06:49

Another option would be to use the package ggplot2.

library(ggplot2)
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()

倾城花音 2024-08-06 00:06:49

A histogram is a poor man's density estimate. Note that when you call hist() with default arguments, you get frequencies, not probabilities -- add prob=TRUE to the call if you want probabilities.

As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:

plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)

gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.

Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
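
As a rough illustration of that last suggestion, here is a sketch using simulated log-normal data (an assumption; any strictly positive vector would do). Zeros would have to be removed or offset first, since log10(0) is -Inf. `freq = FALSE` has the same effect as `prob = TRUE`.

```r
# Simulated, strictly positive, heavily skewed data (stand-in for the real values)
set.seed(1)
x <- rlnorm(10000, meanlog = 0, sdlog = 2)

# Histogram of the log-transformed values; the x-axis is in log10 units
hl <- hist(log10(x), breaks = 30, plot = FALSE)
plot(hl, main = "Histogram of log10(x)", xlab = "log10(x)")

# Same bins, but drawn as densities (bar areas sum to 1) instead of counts
plot(hl, freq = FALSE, main = "Density of log10(x)", xlab = "log10(x)")
```

Note that this transforms the x-axis, not the y-axis, so it addresses skew in the data rather than a single dominant bar.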

挽袖吟 2024-08-06 00:06:49

Run the hist() function without making a graph, log-transform the counts, and then draw the figure.

hist.data <- hist(my.data, plot=FALSE)
# log2 of the counts; note that empty bins become -Inf
hist.data$counts <- log(hist.data$counts, 2)
plot(hist.data)

It should look just like the regular histogram, but the y-axis will be log2 Frequency.
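
One possible variation on this approach, sketched under assumptions: the data is simulated, and `log10(count + 1)` is used instead of a plain log so that empty bins stay finite (that offset is my choice, not part of the answer above). The y-axis is then relabelled with the original counts.

```r
# Simulated stand-in data (assumption)
set.seed(1)
x <- rexp(100000, rate = 2)

h <- hist(x, breaks = 20, plot = FALSE)
counts_raw <- h$counts                 # keep the untransformed counts

# Transform bar heights; the +1 keeps empty bins at height 0 instead of -Inf
h$counts <- log10(counts_raw + 1)

# Draw the bars without axes, then add axes by hand
plot(h, axes = FALSE, ylab = "Frequency",
     main = "Histogram with log10(count + 1) bar heights")
axis(1)
raw <- c(0, 10^(0:5))                  # tick labels in original count units
axis(2, at = log10(raw + 1),
     labels = format(raw, scientific = FALSE, trim = TRUE), las = 1)
```

The bars keep the usual histogram look, while the tick labels read as plain counts rather than log values.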

霊感 2024-08-06 00:06:49

Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:

buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
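bin_labels <- paste(head(buckets, -1), tail(buckets, -1), sep="-")  # one label per bin, not per break
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=bin_labels)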
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)

The last line is optional; it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.

I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
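
A sketch of how the bottom axis could be labelled at the bin edges rather than under each bar, which is closer to what hist draws. The data is simulated (an assumption), and `space = 0` is used so that bar `i` spans `i - 1` to `i` on the barplot's own x scale. It assumes every bin is non-empty, since barplot refuses zero heights on a log axis.

```r
# Simulated stand-in for mydata$V3 (assumption)
set.seed(1)
v3 <- c(runif(100000, 0, 1), runif(500, 1, 5), runif(20, 5, 25))

buckets <- c(0, 1, 2, 3, 4, 5, 25)
h <- hist(v3, breaks = buckets, plot = FALSE)

# space = 0 makes the bars touch, so bar edges sit at 0, 1, ..., 6
bp <- barplot(h$counts, log = "y", col = "white", space = 0,
              main = "Histogram of V3", xlab = "V3",
              ylab = "Frequency (log scale)")

# Label the bar edges with the actual break values
axis(1, at = 0:length(h$counts), labels = buckets)
```

All bars are drawn with equal width here, so the last bin (5 to 25) looks as wide as the others; the axis labels are the only indication of its true range.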

懒猫 2024-08-06 00:06:49

I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.

The original problem would be solved with:

myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")

The function:

myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}

Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.

单身情人 2024-08-06 00:06:49

Here's a pretty ggplot2 solution:

library(ggplot2)
library(scales)  # makes pretty labels on the x-axis

breaks=c(0,1,2,3,4,5,25)

ggplot(mydata,aes(x = V3)) + 
  geom_histogram(breaks = log10(breaks)) + 
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )

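Note that to set the breaks in geom_histogram, they had to be log10-transformed to work with scale_x_log10. Keep in mind that log10(0) is -Inf, so the break at 0 has no finite position on a log x-axis.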
