Why doesn't the optim() function return the correct estimates?

Posted on 2025-02-07 03:50:58

I am trying to simulate a process from a model with parameters alpha0, alpha1 and beta1.

The model uses a Poisson distribution, conditional on its own past, to generate the next value. I simulate 100 replications, each with a sample size of 200.

a0<-5
a1<-0.9
b1<-0.2

l<-rep(1,200)
xs<-rep(0,200)
y<-rep(0,200)
s<-matrix(nrow=100, ncol=200)


xs[1]<-0
l[1]<-1

for (j in 1: 100){
  for (i in 2: 50)
  {
    l[i]<-a0+a1*xs[i-1]+b1*l[i-1]
    xs[i]<-rpois(1,lambda = l[i])
  }
  s[j,1:200]<-xs
}

I then minimise the negative log-likelihood with optim() to get 100 estimates of alpha0, alpha1 and beta1, specifically the mean of the estimates alongside their standard deviation. However, the optim() function isn't working:

loglik<-function(theta,x)
{  
  alpha0<-theta[1];
  alpha1<-theta[2];
  beta1<-theta[3]
  
  #lambda
  T<-length(x);
  
  lambda<-rep(1,T);
  likeli<-rep(1,T);
  
  for(t in (2:T))
  {
    
    lambda[t]<-alpha0+alpha1*x[t-1]+beta1*lambda[t-1]; 
    likeli[t]<-((lambda[t]^x[t])*exp(-lambda[t]))/factorial(x[t])
  }
  
  return(-log(prod(likeli)))
}

estimates<-matrix(nrow=100, ncol=3)

for(i in 1:100){
  initial<-c(6,0.8,1)
  res<-optim(initial,loglik,x=s[i,],control=list(maxit=10000),hessian=T)
  estimates[i,] <- res$par
  
}

mean(estimates[,1])
sqrt(var(estimates[,1]))
mean(estimates[,2])
sqrt(var(estimates[,2]))
mean(estimates[,3])
sqrt(var(estimates[,3]))

It gives me an error saying the function cannot be evaluated at the initial parameters:

Error in optim(initial, loglik, x = s[i, ], control = list(maxit = 10000),  : 
  function cannot be evaluated at initial parameters

Where is the problem and how do I get rid of this?


Comments (1)

夜光 2025-02-14 03:50:58

This doesn't completely solve your problem, but it gets you partway there:

  • compute the log-likelihood directly using dpois() (this avoids under/overflow and other possible errors)
  • use tmax and tt instead of T and t (it's best practice to avoid names of built-in variables)
  • get rid of semicolons (cosmetic)

The revised function:
loglik <- function(theta,x) {  
  alpha0 <- theta[1]
  alpha1 <- theta[2]
  beta1 <- theta[3]

  
  #lambda
  tmax <- length(x)  
  lambda <- rep(1,tmax)
  llik <- rep(0,tmax)
  
  for(tt in 2:tmax) { 
    lambda[tt] <- alpha0+alpha1*x[tt-1]+beta1*lambda[tt-1]
    llik[tt] <- dpois(x[tt], lambda[tt], log = TRUE)
  }
  
  cat(alpha0, alpha1, beta1, -sum(llik), "\n")
  return(-sum(llik))
}

Now when I run loglik(initial, s[1,]) I at least get a finite value ...
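
To illustrate why the original likelihood could not be evaluated, here is a small sketch comparing the hand-written Poisson density with dpois(..., log = TRUE). The count and mean used (x_big, lambda_big) and the probability vector p are hypothetical values picked only to trigger the overflow and underflow; they are not taken from the question's data.

```r
# Hypothetical large count and mean, for illustration only
x_big <- 200
lambda_big <- 180

# Hand-written density: lambda^x and factorial(x) both overflow to Inf,
# so the ratio is NaN (factorial() may also warn about the range)
naive <- suppressWarnings((lambda_big^x_big * exp(-lambda_big)) / factorial(x_big))

# The same quantity computed on the log scale stays finite
stable <- dpois(x_big, lambda_big, log = TRUE)

# A product of many small probabilities underflows to 0, so -log(prod()) is Inf,
# whereas summing the logs does not underflow
p <- rep(1e-3, 200)                 # hypothetical per-observation likelihoods
nll_prod <- -log(prod(p))
nll_sum  <- -sum(log(p))

print(c(naive = naive, stable = stable, nll_prod = nll_prod, nll_sum = nll_sum))
```

A single NaN or Inf of this kind at the starting values is enough for optim() to report that the function cannot be evaluated at the initial parameters.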

When I run

optim(initial,loglik,x=s[1,],control=list(maxit=10000))

(without hessian=TRUE) I do get an answer; trying to get the Hessian gives an error about non-finite finite differences. Some possible ways forward:

  • find some way to avoid predicting negative values for the Poisson mean, e.g. bound your parameter search (using lower and method = "L-BFGS-B")
  • Your simulations appear only to be going to time 50, leaving a string of 150 zeros at the end - that could definitely be making things wonky
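
A minimal sketch of how those two suggestions might be combined, under stated assumptions: the simulation loop runs over the full series length, and optim() is called with method = "L-BFGS-B" and lower bounds so the Poisson mean stays positive. The seed, the parameter values (a0, a1, b1), the starting values, the bounds and the name negloglik are all my own choices for illustration. In particular, I use a1 + b1 < 1 here because the question's values (0.9 + 0.2 = 1.1) make the expected mean grow from step to step; whether that is what you intend is for you to decide.

```r
set.seed(1)  # assumption: arbitrary seed, only for reproducibility

# Hypothetical parameter values with a1 + b1 < 1 (not the question's values)
a0 <- 5; a1 <- 0.3; b1 <- 0.2
n <- 200

# Simulate one series over the full length n, not just the first 50 steps
lam <- numeric(n)
x <- numeric(n)
lam[1] <- 1
x[1] <- 0
for (i in 2:n) {
  lam[i] <- a0 + a1 * x[i - 1] + b1 * lam[i - 1]
  x[i] <- rpois(1, lambda = lam[i])
}

# Negative log-likelihood computed on the log scale via dpois()
negloglik <- function(theta, x) {
  tmax <- length(x)
  lambda <- rep(1, tmax)
  llik <- rep(0, tmax)
  for (tt in 2:tmax) {
    lambda[tt] <- theta[1] + theta[2] * x[tt - 1] + theta[3] * lambda[tt - 1]
    llik[tt] <- dpois(x[tt], lambda[tt], log = TRUE)
  }
  -sum(llik)
}

# Box constraints keep every lambda[tt] strictly positive during the search
fit <- optim(c(4, 0.5, 0.1), negloglik, x = x,
             method = "L-BFGS-B",
             lower = c(1e-6, 0, 0), upper = c(Inf, 0.99, 0.99))
print(fit$par)
```

To repeat this over 100 replications, the simulation and the optim() call would go inside the outer loop, storing fit$par in a row of the estimates matrix as in the question.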