Why is "x" now a mixture of these five gamma distributions in the following R code?

Posted on 2025-02-10 19:21:37

I am trying to sample from the following mixture of gamma distributions:

[Image: definition of the mixture model]

The R code is as follows:

The algorithm can be translated into a vectorized approach.

Step 1: Generate a random sample k_1,...,k_n of integers in a vector k, where P(k)=θ_k, k=1,...,5.

Step 2: Set rate=1/k.

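n <- 5000
# Step 1: draw the component labels k with P(k) = (1:5)/15
k <- sample(1:5, size=n, replace=TRUE, prob=(1:5)/15)
# Step 2: each draw uses the rate of its own component
rate <- 1/k
x <- rgamma(n, shape=3, rate=rate)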

My question is: why is x now a mixture of these five gamma distributions? The expression for the mixture model seems to require the coefficients theta_k as well.

Comments (1)

南城追梦 2025-02-17 19:21:37

Here are two ways to compare samples from a Gamma mixture distribution with the expected mixture density. This should help understand how F_X is the cumulative distribution function of a mixture of Gamma distributions.
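
The reason the two-stage sampler in the question yields this mixture can be sketched with the law of total probability. This assumes the model in the question's image has the form F_X(x) = Σ θ_k F_{X_k}(x) with X_k ~ Gamma(shape 3, rate 1/k) and θ_k = k/15, which is inferred from the code rather than read off the image. K denotes the label drawn by sample():

```latex
\begin{aligned}
P(X \le x) &= \sum_{k=1}^{5} P(K = k)\, P(X \le x \mid K = k) \\
           &= \sum_{k=1}^{5} \theta_k\, F_{X_k}(x),
\qquad \theta_k = \frac{k}{15}, \quad X_k \sim \mathrm{Gamma}(\text{shape} = 3,\ \text{rate} = 1/k).
\end{aligned}
```

So the coefficients θ_k are present after all: they enter through the prob argument of sample(), which decides how often each component is used.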

# Fix random seed for reproducibility
set.seed(2022)

# Sample data
n <- 100000
X <- unlist(lapply(1:5, function(j) rgamma(n * j, shape = 3, rate = 1 / j)))

# Weighted Gamma mixture density
dmix <- function(x)  sapply(
    x,
    function(val) sum((1:5) / 15 * dgamma(val, shape = 3, rate = 1 / (1:5))))
library(tibble)
data = tibble(x = seq(0, ceiling(max(X)), length.out = 100), y = dmix(x))

# Plot histogram of samples and compare with density
library(ggplot2)
ggplot(data.frame(x = X), aes(x)) +
    geom_histogram(aes(y = after_stat(density)), bins = 200) +
    geom_line(data = data, aes(x, y))

[Image: histogram of the samples with the mixture density overlaid]

Comments:

  1. When sampling, we adjust the number of samples drawn from each Gamma distribution according to the weights; here the sample sizes are proportional to 1:5, which corresponds to the weights (1:5)/15.
  2. We use aes(y = after_stat(density)) (the current replacement for the deprecated ..density.. notation) to express the histogram as a properly normalised density, so that its values can be compared with the mixture density dmix.
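
A similar check can be applied to the vectorized sampler from the question itself. The sketch below is an illustration under stated assumptions, not part of the original code: the names theta, emp_weights and mix_mean are introduced here. It compares the empirical frequencies of the labels k with θ_k = k/15, and the sample mean of x with the mixture mean Σ θ_k · 3k = 11 (a Gamma distribution with shape 3 and rate 1/k has mean 3k).

```r
# Assumed illustration: numerically check the question's two-stage sampler
set.seed(2022)

n <- 100000
theta <- (1:5) / 15                      # mixing weights theta_k = k/15

# Step 1: component labels, drawn with probabilities theta
k <- sample(1:5, size = n, replace = TRUE, prob = theta)
# Step 2: one gamma draw per label, using that label's rate 1/k
x <- rgamma(n, shape = 3, rate = 1 / k)

# Empirical share of each component; should be close to theta
emp_weights <- as.numeric(table(factor(k, levels = 1:5))) / n
print(round(rbind(theta = theta, empirical = emp_weights), 3))

# Mixture mean: sum_k theta_k * (shape / rate_k) = sum_k (k/15) * 3k = 11
mix_mean <- sum(theta * 3 * (1:5))
print(c(theoretical = mix_mean, sample = mean(x)))
```

If the empirical shares match theta and the sample mean is close to 11, the draws behave like the weighted mixture; overlaying dmix on a histogram of x, as in the plot above, gives the same comparison visually.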