Implementing a function to compute the variance of the period-to-period change in a Markov chain

Posted 2025-01-16 07:31:16

I am working on a research project, and a while back I asked a question on Mathematics Stack Exchange about how to calculate the variance of the period-to-period change in income given a transition matrix, where each state corresponds to a log level of income stored in a given vector. I want to calculate the variance of an individual's change in income over n periods, conditional on the state they start in. My state space consists of 11 states, so I expect to end up with a vector of 11 variances. I received a satisfactory answer there, but I am running into issues when coding it in R and was hoping to get some help.

I have created this piece of code to calculate the variances:

install.packages("expm")
library(expm)

# creating standard basis vectors
e <- function(i) {
  e_i = rep(0, length(alpha))
  e_i[i] = 1
  return(e_i)
}

# compute variances
p2p_variance = function(n, alpha, P) {
  variance = list()
  pi_n = list()
  for (i in 1:length(alpha)) {
    pi_n[[i]] = e(i) %*% (P %^% n)
    beta = (t(alpha) - t(alpha)[i])^2
    variance[[i]] = (pi_n[[i]] %*% t(beta)) - (((pi_n[[i]] %*% alpha) - alpha[i]) %^% 2)
  }
  return(t(variance))
}
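
Written out, the loop above appears to compute the following for each starting state i. The symbols \pi^{(i)} and V_i are notation introduced here for illustration and do not appear in the original post:

```latex
\pi^{(i)} = e_i^{\top} P^{n}, \qquad
V_i = \sum_{j=1}^{11} \pi^{(i)}_j \,(\alpha_j - \alpha_i)^2
      \;-\; \Bigl(\sum_{j=1}^{11} \pi^{(i)}_j \,\alpha_j \;-\; \alpha_i\Bigr)^{2}
```

This has the form E[D^2] - (E[D])^2 with D = \alpha_{X_n} - \alpha_i, which is a genuine variance (and hence non-negative) when the entries of \pi^{(i)} are non-negative and sum to 1.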

For alpha (the vector of log income levels) and P (the transition matrix) I use:

alpha = c(3.4965, 3.5835, 3.6636, 3.7377, 3.8067, 3.8712, 3.9318,  3.9890, 4.0431, 4.0943, 4.1431)
P = rbind(c(0.9004, 0.0734, 0.0203, 0.0043, 0.0010, 0.0003, 0.0001, 0.0001, 0.0000, 0.0000, 0.0000),
          c(0.3359, 0.3498, 0.2401, 0.0589, 0.0115, 0.0026, 0.0007, 0.0003, 0.0001, 0.0001, 0.0000),
          c(0.1583, 0.1538, 0.3931, 0.2346, 0.0481, 0.0090, 0.0021, 0.0007, 0.0003, 0.0001, 0.0001),
          c(0.0746, 0.0609, 0.1600, 0.4368, 0.2178, 0.0397, 0.0073, 0.0019, 0.0006, 0.0002, 0.0001),
          c(0.0349, 0.0271, 0.0559, 0.1724, 0.4628, 0.2031, 0.0344, 0.0067, 0.0018, 0.0006, 0.0003),
          c(0.0155, 0.0122, 0.0230, 0.0537, 0.1817, 0.4870, 0.1860, 0.0316, 0.0066, 0.0018, 0.0009),
          c(0.0066, 0.0054, 0.0100, 0.0204, 0.0529, 0.1956, 0.4925, 0.1772, 0.0307, 0.0064, 0.0023),
          c(0.0025, 0.0023, 0.0043, 0.0084, 0.0186, 0.0530, 0.2025, 0.4980, 0.1760, 0.0275, 0.0067),
          c(0.0009, 0.0009, 0.0017, 0.0035, 0.0072, 0.0168, 0.0490, 0.2025, 0.5194, 0.1721, 0.0260),
          c(0.0003, 0.0003, 0.0007, 0.0013, 0.0029, 0.0061, 0.0142, 0.0430, 0.2023, 0.5485, 0.1804),
          c(0.0001, 0.0001, 0.0002, 0.0003, 0.0008, 0.0017, 0.0032, 0.0068, 0.0212, 0.1079, 0.8578))

For instance, a call of p2p_variance(100, alpha, P) (calculating the variance over 100 periods) results in the following vector of variances:

0.04393012 0.04091066 0.03856503 0.03636202 0.03472286 0.03331921 0.03213084 0.03068901 0.03143765 0.03255994 0.03522346

This seems plausible. However, if I run p2p_variance(1000, alpha, P), I get:

0.06126449 0.03445073 0.009621497 -0.01447615 -0.03652425 -0.05752316 -0.07753646 -0.09726683 -0.1134972 -0.1287498 -0.141676

This is obviously incorrect, since a variance cannot be negative. I cannot figure out why simply increasing n to 1000 results in negative values. I have most likely coded p2p_variance incorrectly, but I cannot for the life of me find the issue. Or is the procedure I am using to compute these variances flawed somehow? I would really appreciate it if anyone could look over this code and help me diagnose the issue.

英雄似剑 2025-01-23 07:31:16

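Your function returns a signed difference, not an absolute value. If you want non-negative values, you can wrap the expression in abs() as shown below, though this only hides the sign and does not explain why the difference turns negative for large n: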

p2p_variance = function(n, alpha, P) {
  variance = list()
  pi_n = list()
  for (i in 1:length(alpha)) {
    pi_n[[i]] = e(i) %*% (P %^% n)
    beta = (t(alpha) - t(alpha)[i])^2
    variance[[i]] = abs((pi_n[[i]] %*% t(beta)) - (((pi_n[[i]] %*% alpha) - alpha[i]) %^% 2))
  }
  return(t(variance))
}

p2p_variance(1000, alpha, P)

Output:

     [,1]       [,2]       [,3]        [,4]       [,5]       [,6]       [,7]       [,8]       [,9]      [,10]     [,11]   
[1,] 0.06126449 0.03445073 0.009621497 0.01447615 0.03652425 0.05752316 0.07753646 0.09726683 0.1134972 0.1287498 0.141676
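
Separately from the abs() approach, one thing worth checking is whether P is exactly row-stochastic. As typed in the question, the entries are rounded to four decimals and some rows do not sum to exactly 1 (the first row sums to 0.9999, for example). In that case the rows of P^n need not sum to 1 for large n, and the expression being computed is no longer a true variance. The sketch below, in base R, inspects the row sums and rescales each row before computing the variances. The helper names mat_pow and p2p_variance_checked are hypothetical, and rescaling the rows is an assumption about how the rounding should be repaired, not something stated in the thread.

```r
# Values copied from the question.
alpha <- c(3.4965, 3.5835, 3.6636, 3.7377, 3.8067, 3.8712, 3.9318,
           3.9890, 4.0431, 4.0943, 4.1431)
P <- rbind(
  c(0.9004, 0.0734, 0.0203, 0.0043, 0.0010, 0.0003, 0.0001, 0.0001, 0.0000, 0.0000, 0.0000),
  c(0.3359, 0.3498, 0.2401, 0.0589, 0.0115, 0.0026, 0.0007, 0.0003, 0.0001, 0.0001, 0.0000),
  c(0.1583, 0.1538, 0.3931, 0.2346, 0.0481, 0.0090, 0.0021, 0.0007, 0.0003, 0.0001, 0.0001),
  c(0.0746, 0.0609, 0.1600, 0.4368, 0.2178, 0.0397, 0.0073, 0.0019, 0.0006, 0.0002, 0.0001),
  c(0.0349, 0.0271, 0.0559, 0.1724, 0.4628, 0.2031, 0.0344, 0.0067, 0.0018, 0.0006, 0.0003),
  c(0.0155, 0.0122, 0.0230, 0.0537, 0.1817, 0.4870, 0.1860, 0.0316, 0.0066, 0.0018, 0.0009),
  c(0.0066, 0.0054, 0.0100, 0.0204, 0.0529, 0.1956, 0.4925, 0.1772, 0.0307, 0.0064, 0.0023),
  c(0.0025, 0.0023, 0.0043, 0.0084, 0.0186, 0.0530, 0.2025, 0.4980, 0.1760, 0.0275, 0.0067),
  c(0.0009, 0.0009, 0.0017, 0.0035, 0.0072, 0.0168, 0.0490, 0.2025, 0.5194, 0.1721, 0.0260),
  c(0.0003, 0.0003, 0.0007, 0.0013, 0.0029, 0.0061, 0.0142, 0.0430, 0.2023, 0.5485, 0.1804),
  c(0.0001, 0.0001, 0.0002, 0.0003, 0.0008, 0.0017, 0.0032, 0.0068, 0.0212, 0.1079, 0.8578)
)

# Each row of a transition matrix should sum to 1; rounding can break this.
print(rowSums(P))

# Assumption: rescaling each row to sum to 1 is an acceptable repair.
# Dividing a matrix by a vector of length nrow(P) recycles down the columns,
# so every entry in row i is divided by the sum of row i.
P_norm <- P / rowSums(P)

# Hypothetical helper: matrix power by repeated multiplication (base R only).
mat_pow <- function(M, n) {
  result <- diag(nrow(M))
  for (k in seq_len(n)) result <- result %*% M
  result
}

# Same formula as the question's p2p_variance, returned as a numeric vector.
p2p_variance_checked <- function(n, alpha, P) {
  Pn <- mat_pow(P, n)
  sapply(seq_along(alpha), function(i) {
    pi_i <- Pn[i, ]                          # distribution after n steps from state i
    second_moment <- sum(pi_i * (alpha - alpha[i])^2)
    mean_change <- sum(pi_i * alpha) - alpha[i]
    second_moment - mean_change^2
  })
}

# How far the total probability mass drifts when P is used as typed.
print(rowSums(mat_pow(P, 1000)))

v <- p2p_variance_checked(1000, alpha, P_norm)
print(v)
```

If the drift printed for the unnormalized matrix is noticeably different from 1, that would be consistent with the negative values in the question, and the renormalized call is the one to trust. With the expm package loaded, P_norm %^% 1000 could be used in place of mat_pow.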
