实现计算马尔可夫链周期变化方差的函数
我正在研究一个研究项目,不久前我问Mathematics Stack Exchange 上的这个问题,我在其中寻找一种计算方法给定转换矩阵的时期间收入变化的方差,其中每个状态对应于给定向量中收入的对数水平。我想计算一个人在每个州开始的 n 个时期内收入变化的方差是多少。我的状态空间由 11 个状态组成,因此我希望最终得到一个由 11 个不同方差组成的向量。当我提出这个问题时,我得到了满意的答案,但是当我尝试在 RI 中进行编码时遇到了一些问题,希望得到帮助。
我创建了这段代码来计算方差:
install.packages("expm")
library(expm)
# creating standard basis vectors
e <- function(i) {
e_i = rep(0, length(alpha))
e_i[i] = 1
return(e_i)
}
# compute variances
p2p_variance = function(n, alpha, P) {
variance = list()
pi_n = list()
for (i in 1:length(alpha)) {
pi_n[[i]] = e(i) %*% (P %^% n)
beta = (t(alpha) - t(alpha)[i])^2
variance[[i]] = (pi_n[[i]] %*% t(beta)) - (((pi_n[[i]] %*% alpha) - alpha[i]) %^% 2)
}
return(t(variance))
}
对于我的 alpha(收入对数水平向量)和 P(转换矩阵)值,我使用:
alpha = c(3.4965, 3.5835, 3.6636, 3.7377, 3.8067, 3.8712, 3.9318, 3.9890, 4.0431, 4.0943, 4.1431)
P = rbind(c(0.9004, 0.0734, 0.0203, 0.0043, 0.0010, 0.0003, 0.0001, 0.0001, 0.0000, 0.0000, 0.0000),
c(0.3359, 0.3498, 0.2401, 0.0589, 0.0115, 0.0026, 0.0007, 0.0003, 0.0001, 0.0001, 0.0000),
c(0.1583, 0.1538, 0.3931, 0.2346, 0.0481, 0.0090, 0.0021, 0.0007, 0.0003, 0.0001, 0.0001),
c(0.0746, 0.0609, 0.1600, 0.4368, 0.2178, 0.0397, 0.0073, 0.0019, 0.0006, 0.0002, 0.0001),
c(0.0349, 0.0271, 0.0559, 0.1724, 0.4628, 0.2031, 0.0344, 0.0067, 0.0018, 0.0006, 0.0003),
c(0.0155, 0.0122, 0.0230, 0.0537, 0.1817, 0.4870, 0.1860, 0.0316, 0.0066, 0.0018, 0.0009),
c(0.0066, 0.0054, 0.0100, 0.0204, 0.0529, 0.1956, 0.4925, 0.1772, 0.0307, 0.0064, 0.0023),
c(0.0025, 0.0023, 0.0043, 0.0084, 0.0186, 0.0530, 0.2025, 0.4980, 0.1760, 0.0275, 0.0067),
c(0.0009, 0.0009, 0.0017, 0.0035, 0.0072, 0.0168, 0.0490, 0.2025, 0.5194, 0.1721, 0.0260),
c(0.0003, 0.0003, 0.0007, 0.0013, 0.0029, 0.0061, 0.0142, 0.0430, 0.2023, 0.5485, 0.1804),
c(0.0001, 0.0001, 0.0002, 0.0003, 0.0008, 0.0017, 0.0032, 0.0068, 0.0212, 0.1079, 0.8578))
例如,调用 p2p_variance(100, alpha, P)(计算 100 个周期内的方差)会产生以下方差向量:
0.04393012 0.04091066 0.03856503 0.03636202 0.03472286 0.03331921 0.03213084 0.03068901 0.03143765 0.03255994 0.03522346
这似乎是合理的。但是,如果我运行 p2p_variance(1000, alpha, P),结果是:
0.06126449 0.03445073 0.009621497 -0.01447615 -0.03652425 -0.05752316 -0.07753646 -0.09726683 -0.1134972 -0.1287498 -0.141676
这显然是不正确的,因为我们不能有负方差。我不明白为什么简单地将 n 增加到 1000 会导致负方差。我很可能错误地编码了 p2p_variance 函数,但我一生都找不到问题。或者也许我用来发现这些差异的过程存在某种缺陷?如果有人可以查看这段代码并帮助我诊断问题,我将非常感激
I am working on a research project and a while back I asked this question on Mathematics Stack Exchange, where I was looking for a way to calculate the variance of the period-to-period change in income given a transition matrix, where each state corresponds to a log level of income in a vector, which is given. I want to calculate what the variance of an individual's change in income is over some n number of periods given that they began in each state. My state space consists of 11 states, so I hope to end up with a vector consisting of 11 different variances. When I asked the question, I received a satisfactory answer, but I am running into some issues when trying to code it in R I was hoping to receive help with.
I have created this piece of code to calculate the variances:
install.packages("expm")
library(expm)
# creating standard basis vectors
e <- function(i) {
e_i = rep(0, length(alpha))
e_i[i] = 1
return(e_i)
}
# compute variances
p2p_variance = function(n, alpha, P) {
variance = list()
pi_n = list()
for (i in 1:length(alpha)) {
pi_n[[i]] = e(i) %*% (P %^% n)
beta = (t(alpha) - t(alpha)[i])^2
variance[[i]] = (pi_n[[i]] %*% t(beta)) - (((pi_n[[i]] %*% alpha) - alpha[i]) %^% 2)
}
return(t(variance))
}
And for my values of alpha (vector of log levels of income) and P (transition matrix) I use:
alpha = c(3.4965, 3.5835, 3.6636, 3.7377, 3.8067, 3.8712, 3.9318, 3.9890, 4.0431, 4.0943, 4.1431)
P = rbind(c(0.9004, 0.0734, 0.0203, 0.0043, 0.0010, 0.0003, 0.0001, 0.0001, 0.0000, 0.0000, 0.0000),
c(0.3359, 0.3498, 0.2401, 0.0589, 0.0115, 0.0026, 0.0007, 0.0003, 0.0001, 0.0001, 0.0000),
c(0.1583, 0.1538, 0.3931, 0.2346, 0.0481, 0.0090, 0.0021, 0.0007, 0.0003, 0.0001, 0.0001),
c(0.0746, 0.0609, 0.1600, 0.4368, 0.2178, 0.0397, 0.0073, 0.0019, 0.0006, 0.0002, 0.0001),
c(0.0349, 0.0271, 0.0559, 0.1724, 0.4628, 0.2031, 0.0344, 0.0067, 0.0018, 0.0006, 0.0003),
c(0.0155, 0.0122, 0.0230, 0.0537, 0.1817, 0.4870, 0.1860, 0.0316, 0.0066, 0.0018, 0.0009),
c(0.0066, 0.0054, 0.0100, 0.0204, 0.0529, 0.1956, 0.4925, 0.1772, 0.0307, 0.0064, 0.0023),
c(0.0025, 0.0023, 0.0043, 0.0084, 0.0186, 0.0530, 0.2025, 0.4980, 0.1760, 0.0275, 0.0067),
c(0.0009, 0.0009, 0.0017, 0.0035, 0.0072, 0.0168, 0.0490, 0.2025, 0.5194, 0.1721, 0.0260),
c(0.0003, 0.0003, 0.0007, 0.0013, 0.0029, 0.0061, 0.0142, 0.0430, 0.2023, 0.5485, 0.1804),
c(0.0001, 0.0001, 0.0002, 0.0003, 0.0008, 0.0017, 0.0032, 0.0068, 0.0212, 0.1079, 0.8578))
For instance, a call of p2p_variance(100, alpha, P)
(calculating the variance over 100 periods) results in the following vector of variances:
0.04393012 0.04091066 0.03856503 0.03636202 0.03472286 0.03331921 0.03213084 0.03068901 0.03143765 0.03255994 0.03522346
Which seem plausible. However, If I run p2p_variance(1000, alpha, P)
, it results in:
0.06126449 0.03445073 0.009621497 -0.01447615 -0.03652425 -0.05752316 -0.07753646 -0.09726683 -0.1134972 -0.1287498 -0.141676
This is obviously not correct, since we cannot have negative variance. I cannot figure out why simply increasing n to 1000 is resulting in negative variance here. I have most likely coded my p2p_variance function incorrectly, but I cannot for the life of me find the issue. Or perhaps is the process I am using to find these variances flawed somehow? I would really appreciate if anyone could look over this code and help me diagnose the issue
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您的方差函数返回差异,如果您想要绝对值(方差),只需将其包装在
abs()
中,如下所示:输出:
Your variance function is returning the difference, and if you want the absolute value (variance) just wrap it inside
abs()
like this:Output: