Getting frequency values from a histogram in R
I know how to draw histograms and other frequency/percentage tables, but now I want to know how to get those frequency values into a table for later use.
I have a massive dataset and have drawn a histogram with a set binwidth. I want to extract the frequency value (i.e. the value on the y-axis) that corresponds to each bin and save it somewhere.
Can someone please help me with this?
Thank you!
The hist function has a return value (an object of class histogram).
From ?hist:
Value
an object of class "histogram" which is a list with components:
breaks: the n+1 cell boundaries (= breaks if that was a vector). These are the nominal breaks, not with the boundary fuzz.
counts: n integers; for each cell, the number of x[] inside.
density: values f^(x[i]), as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy sum[i; f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = breaks[i].
intensities: same as density. Deprecated, but retained for compatibility.
mids: the n cell midpoints.
xname: a character string with the actual x argument name.
equidist: logical, indicating if the distances between breaks are all the same.
breaks and density provide just about all you need.
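As a minimal sketch of the base R approach, the components returned by hist can be collected into a data frame and written to disk. The simulated data, the bin width of 0.5, the names x, h, freq and out_file, and the output location are all assumptions made for illustration; they are not from the original answer.

```r
# Simulated data stands in for the real dataset (assumption for illustration).
set.seed(1)
x <- rnorm(1000)

# Bin edges with a fixed width of 0.5 that cover the full range of x.
edges <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)

# plot = FALSE computes the histogram object without drawing it.
h <- hist(x, breaks = edges, plot = FALSE)

# One row per bin: lower edge, upper edge, midpoint, count and density.
freq <- data.frame(
  lower   = head(h$breaks, -1),
  upper   = tail(h$breaks, -1),
  mid     = h$mids,
  count   = h$counts,
  density = h$density
)

# Save the table for later use (the file location is arbitrary).
out_file <- file.path(tempdir(), "hist_counts.csv")
write.csv(freq, out_file, row.names = FALSE)
```

The count column holds the frequencies shown on the y-axis of a default histogram; density is the same information rescaled so that the bar areas sum to 1.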
Just in case someone hits this question with ggplot's geom_histogram in mind, note that there is a way to extract the data from a ggplot object.
A convenience function can output a data frame with the lower limit of each bin (xmin), the upper limit of each bin (xmax), the mid-point of each bin (x), as well as the frequency value (y).
A related question I answered here (Cumulative histogram with ggplot2).
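The answer's own code did not survive on this page, so the following is only a sketch of what such a convenience function might look like, assuming ggplot2 is installed. The name get_hist, the simulated data and the bin width are hypothetical, and the sketch uses ggplot2's layer_data to read the computed bins from the first layer, which may differ from what the original answer did.

```r
library(ggplot2)

# Hypothetical helper: returns the bin limits, midpoints and counts
# computed for the first layer of a ggplot histogram.
get_hist <- function(p) {
  d <- layer_data(p, 1)
  data.frame(xmin = d$xmin, xmax = d$xmax, x = d$x, y = d$y)
}

# Illustration on simulated data (assumption for illustration).
set.seed(1)
df <- data.frame(value = rnorm(1000))
p <- ggplot(df, aes(x = value)) + geom_histogram(binwidth = 0.5)

bins <- get_hist(p)
head(bins)
```

With the default settings of geom_histogram, y is the count in each bin, so the resulting data frame can be saved or reused like any other table.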