Avoiding duplication in R
I am trying to fit a variety of (truncated) probability distributions to the same very thin set of quantiles. I can do it but it seems to require lots of duplication of the same code. Is there a neater way?
I am using this code by Nadarajah and Kotz to compute the quantile function of the truncated distributions:
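qtrunc <- function(p, spec, a = -Inf, b = Inf, ...)
{
    tt <- p
    # Look up the CDF (p<spec>) and quantile function (q<spec>) of the untruncated distribution
    G <- get(paste("p", spec, sep = ""), mode = "function")
    Gin <- get(paste("q", spec, sep = ""), mode = "function")
    # Map p into the probability range [G(a), G(b)] and invert
    tt <- Gin(G(a, ...) + p*(G(b, ...) - G(a, ...)), ...)
    return(tt)
}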
where spec can be the name of any untruncated distribution for which the corresponding p and q functions exist in R, and the ... argument is used to supply the named parameters of that untruncated distribution.
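To make the roles of spec and ... concrete, here is a minimal sketch of two calls. The truncation limits and parameter values are illustrative assumptions chosen for this example and are not part of the original problem; qtrunc is repeated (in condensed form) only so the snippet runs on its own.

```r
# qtrunc as given above (Nadarajah and Kotz), condensed so this snippet is self-contained
qtrunc <- function(p, spec, a = -Inf, b = Inf, ...) {
  G   <- get(paste0("p", spec), mode = "function")  # CDF of the untruncated distribution
  Gin <- get(paste0("q", spec), mode = "function")  # its quantile function
  Gin(G(a, ...) + p * (G(b, ...) - G(a, ...)), ...)
}

# Median of a standard normal truncated to [-1, 1]; by symmetry this should be 0.
# "norm" selects pnorm/qnorm, and mean/sd are forwarded through "..."
med_norm <- qtrunc(p = 0.5, spec = "norm", a = -1, b = 1, mean = 0, sd = 1)

# Both tertiles of a gamma truncated to [0, 20] in one call (p is vectorised).
# shape = 2 and rate = 0.25 are illustrative values only
tert_gamma <- qtrunc(p = c(1/3, 2/3), spec = "gamma", a = 0, b = 20,
                     shape = 2, rate = 0.25)
```

The only thing that changes between the two calls is the value of spec and the names of the arguments passed through ..., which is exactly the part that ends up hard-coded in a distribution-specific fitting function.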
To achieve the best fit I need to measure the distance between the given quantiles and those calculated using arbitrary values of the parameters of the distribution. In the case of the gamma distribution, for example, the code is as follows:
spec <- "gamma"
fit_gamma <- function(x, l = 0, h = 20, t1 = 5, t2 = 13){
    ct1 <- qtrunc(p = 1/3, spec, a = l, b = h, shape = x[1], rate = x[2])
    ct2 <- qtrunc(p = 2/3, spec, a = l, b = h, shape = x[1], rate = x[2])
    dist <- vector(mode = "numeric", length = 2)
    dist[1] <- (t1 - ct1)^2
    dist[2] <- (t2 - ct2)^2
    return(sqrt(sum(dist)))
}
where l is the lower truncation point, h is the upper one, and I am given the two tertiles t1 and t2.
Finally, I seek the best fit using optim, thus:
gamma_fit <- optim(par = c(2, 4),
                   fn = fit_gamma,
                   l = l,
                   h = h,
                   t1 = t1,
                   t2 = t2,
                   method = "L-BFGS-B",
                   lower = c(1.01, 1.4))
Now suppose I want to do the same thing but fitting a normal distribution instead. The names of the parameters of the normal distribution that I am using in R are mean and sd.
I can achieve what I want, but only by writing a whole new function fit_normal that is extremely similar to my fit_gamma function, with the new parameter names used in the definitions of ct1 and ct2.
The problem of duplication of code becomes very severe because I wish to try fitting a large number of different distributions to my data.
What I want to know is whether there is a way of writing a generic fit_spec, as it were, so that I do not have to write out the parameter names myself.
Comments (1)
Use x as a named list to build the list of arguments that is passed into qtrunc() using do.call(). Called with a named list of the gamma parameters (shape and rate), this gives the same result as your original function. This will work with other distributions, however many parameters they have.