当前位置：文江博客话题详情

估计幂律分布中的指数截止

发布于 2024-12-29 22:18:15 字数 519 浏览 2 评论 0原文

当我进行一些社交网络分析时，我偶然发现了在网络度上拟合概率分布的问题。

因此，我有一个概率分布 P(X >= x)，从目视检查来看，它遵循具有指数截止的幂律，而不是纯幂律（直线）。

因此，假设具有指数截止的幂律分布方程为：

f(x) = x**alpha * exp(beta*x)

如何使用 Python 估计参数 alpha 和 beta？

我知道 scipy.stats.powerlaw 包存在，并且它们有一个 .fit() 函数，但这似乎不起作用，因为它只返回绘图的位置和比例，这似乎仅对正态分布有用吗？这个包上也没有足够的教程。

PS 我很清楚 CLauset 等人的实现，但他们不知道似乎提供了估计替代分布参数的方法。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

烟凡古楼 2025-01-05 22:18:15

Powerlaw库可以直接用于估计参数，如下：

安装所有python依赖项：
```
pip install powerlaw mpmath scipy
```

在 python 环境中运行 powerlaw 包：

导入幂律
数据 = [5, 4, ... ]
结果 = powerlaw.Fit(数据)

从结果中获取参数

results.truncated_power_law.parameter1 # 幂律参数（alpha）
results.truncated_power_law.parameter2 # 指数截止参数（beta）

Powerlaw library can directly be used to estimate the parameters as follows:

Install all the pythons dependencies:
```
pip install powerlaw mpmath scipy
```

Run the powerlaw package fit in a python environment:

import powerlaw
data = [5, 4, ... ]
results = powerlaw.Fit(data)

get the parameters from the results

results.truncated_power_law.parameter1 # power law  parameter (alpha)
results.truncated_power_law.parameter2 # exponential cut-off parameter (beta)

回复收藏 0 原文

放飞的风筝 2025-01-05 22:18:15

函数 scipy.stats.powerlaw.fit 可能仍然适用于您的目的。 scipy.stats 中的分布如何工作有点令人困惑（每个分布的文档都引用了可选参数 loc 和 scale，尽管并非所有分布都使用这些参数，并且每个分布的使用方式都不同）。如果您查看文档：

http://docs。 scipy.org/doc/scipy/reference/ generated/scipy.stats.powerlaw.html

还有第二个非可选参数“a”，即“形状参数”。在幂律的情况下，它包含单个参数。不用担心“loc”和“scale”。

编辑：抱歉，忘记了您也想要 beta 参数。最好的方法可能是自己定义所需的幂律函数，然后使用 scipy 的通用拟合算法来学习参数。例如：
http://www.scipy.org/Cookbook/FittingData#head-5eba0779a34c07f5a596bbcf99dbc7886eac18e5

回复收藏 0 原文

|煩躁 2025-01-05 22:18:15

以下是通过最大化 R 中的可能性来估计具有指数截止的幂律的标度指数和指数率的方法：

# Input: Data vector, lower threshold
# Output: List, giving type ("powerexp"), scaling exponent, exponential rate, lower threshold, log-likelihood


powerexp.fit <- function(data,threshold=1,method="constrOptim",initial_rate=-1) {
  x <- data[data>=threshold]
  negloglike <- function(theta) {
    -powerexp.loglike(x,threshold,exponent=theta[1],rate=theta[2])
  }
  # Fit a pure power-law distribution
  pure_powerlaw <- pareto.fit(data,threshold)
  # Use this as a first guess at the exponent
  initial_exponent <- pure_powerlaw$exponent
  if (initial_rate < 0) { initial_rate <- exp.fit(data,threshold)$rate }
  minute_rate <- 1e-6
  theta_0 <- as.vector(c(initial_exponent,initial_rate))
  theta_1 <- as.vector(c(initial_exponent,minute_rate))
  switch(method,
    constrOptim = {
      # Impose the constraint that rate >= 0
      # and that exponent >= -1
      ui <- rbind(c(1,0),c(0,1))
      ci <- c(-1,0)
      # Can't start with values on the boundary of the feasible set so add
      # tiny amounts just in case
      if (theta_0[1] == -1) {theta_0[1] <- theta_0[1] + minute_rate}
      if (theta_0[2] == 0) {theta_0[2] <- theta_0[2] + minute_rate}
      est <- constrOptim(theta=theta_0,f=negloglike,grad=NULL,ui=ui,ci=ci)
      alpha <- est$par[1]
      lambda <- est$par[2]
      loglike <- -est$value},
    optim = {
      est <- optim(par=theta_0,fn=negloglike)
      alpha <- est$par[1]
      lambda <- est$par[2]
      loglike <- -est$value},
    nlm = {
      est.0 <- nlm(f=negloglike,p=theta_0)
      est.1 <- nlm(f=negloglike,p=theta_1)
      est <- est.0
      if (-est.1$minimum > -est.0$minimum) { est <- est.1;cat("NLM had to switch\n") }
      alpha <- est$estimate[1]
      lambda <- est$estimate[2]
      loglike <- -est$minimum},
    {cat("Unknown method",method,"\n"); alpha<-NA; lambda<-NA; loglike<-NA}
  )
  fit <- list(type="powerexp", exponent=alpha, rate=lambda, xmin=threshold,
              loglike=loglike, samples.over.threshold=length(x))
  return(fit)
}

查看 https://github.com/jeffalstott/powerlaw/ 了解更多信息

Here is a means of estimating the scaling exponent and exponential rate of power law with exponential cut-off by maximizing likelihood in R:

# Input: Data vector, lower threshold
# Output: List, giving type ("powerexp"), scaling exponent, exponential rate, lower threshold, log-likelihood


powerexp.fit <- function(data,threshold=1,method="constrOptim",initial_rate=-1) {
  x <- data[data>=threshold]
  negloglike <- function(theta) {
    -powerexp.loglike(x,threshold,exponent=theta[1],rate=theta[2])
  }
  # Fit a pure power-law distribution
  pure_powerlaw <- pareto.fit(data,threshold)
  # Use this as a first guess at the exponent
  initial_exponent <- pure_powerlaw$exponent
  if (initial_rate < 0) { initial_rate <- exp.fit(data,threshold)$rate }
  minute_rate <- 1e-6
  theta_0 <- as.vector(c(initial_exponent,initial_rate))
  theta_1 <- as.vector(c(initial_exponent,minute_rate))
  switch(method,
    constrOptim = {
      # Impose the constraint that rate >= 0
      # and that exponent >= -1
      ui <- rbind(c(1,0),c(0,1))
      ci <- c(-1,0)
      # Can't start with values on the boundary of the feasible set so add
      # tiny amounts just in case
      if (theta_0[1] == -1) {theta_0[1] <- theta_0[1] + minute_rate}
      if (theta_0[2] == 0) {theta_0[2] <- theta_0[2] + minute_rate}
      est <- constrOptim(theta=theta_0,f=negloglike,grad=NULL,ui=ui,ci=ci)
      alpha <- est$par[1]
      lambda <- est$par[2]
      loglike <- -est$value},
    optim = {
      est <- optim(par=theta_0,fn=negloglike)
      alpha <- est$par[1]
      lambda <- est$par[2]
      loglike <- -est$value},
    nlm = {
      est.0 <- nlm(f=negloglike,p=theta_0)
      est.1 <- nlm(f=negloglike,p=theta_1)
      est <- est.0
      if (-est.1$minimum > -est.0$minimum) { est <- est.1;cat("NLM had to switch\n") }
      alpha <- est$estimate[1]
      lambda <- est$estimate[2]
      loglike <- -est$minimum},
    {cat("Unknown method",method,"\n"); alpha<-NA; lambda<-NA; loglike<-NA}
  )
  fit <- list(type="powerexp", exponent=alpha, rate=lambda, xmin=threshold,
              loglike=loglike, samples.over.threshold=length(x))
  return(fit)
}

Check out https://github.com/jeffalstott/powerlaw/ for more info

回复收藏 0 原文

~没有更多了~