Using FFT to approximate the CDF of an aggregate loss random variable

Posted on 2025-01-17 22:17:51

Below you will find my Python code for a class assignment I was given a couple of weeks ago, which I have been unable to debug. The problem is to find the value at risk (i.e., the p% quantile) of an aggregate loss random variable using the FFT. We were given a clear mathematical procedure for estimating the discretized CDF of the aggregate loss random variable. My results are, however, seriously off, and I am making some kind of mistake that I have been unable to find even after hours of debugging.

The aggregate loss random variable S is given such that S=sum(X_i for i in range(N)), where N is negative binomially distributed with r=5, beta=.2, and X_i is exponentially distributed with theta=1. The probability generating function for this parametrization is P(z)=[1-\beta(z-1)]^{-r}.

We were asked to approximate the distribution of S by

  1. choosing a grid width h and an integer n such that r=2^n is the number of elements to discretize X on,
  2. discretizing X and calculating the probabilities of being in equally spaced intervals of width h,
  3. applying the FFT to the discretized X,
  4. applying the PGF of N to the elements of the Fourier-transformed X,
  5. applying the inverse FFT to this vector.
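
In symbols, steps 3 to 5 can be written as follows. This is only a restatement of the procedure above under one notational assumption: the grid size 2^n is written as m here (not r, as in the list above) so that it stays distinct from the negative binomial parameter r. f_j denotes the discretized probability of X in cell j, for j, k = 0, ..., m-1.

```latex
\hat f_k = \sum_{j=0}^{m-1} f_j \, e^{-2\pi i jk/m}, \qquad
\hat g_k = P\bigl(\hat f_k\bigr) = \bigl[1 - \beta(\hat f_k - 1)\bigr]^{-r}, \qquad
g_j = \frac{1}{m}\sum_{k=0}^{m-1} \hat g_k \, e^{2\pi i jk/m} \approx \Pr(S \in \text{cell } j)
```

Because the discrete transform is periodic, the g_j are the result of a circular convolution: any probability mass of S beyond the last grid point (m-1)h wraps around onto the low indices, so m*h should lie well beyond the quantiles of interest.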

The resulting vector should be an approximation for the probability masses of each such interval for S. I know from previous methods that the 95% VaR ought to be ~4 and the 99.9% VaR ought to be ~10. But my code returns nonsensical results. Generally speaking, my index where the ECDF reaches levels >0.95 is way too late, and even after hours of debugging I have not managed to find where I am going wrong.
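
One independent way to sanity-check those target values (not necessarily the "previous methods" mentioned above) is a closed-form CDF. The sketch below assumes the standard identity that, for integer r, a compound negative binomial with exponential severities has the same distribution as a compound binomial with r trials, success probability beta/(1+beta), and exponential severities of mean theta*(1+beta). The names `r_nb`, `cdf_S` and `var_S` are hypothetical and introduced only for this illustration.

```python
from scipy.optimize import brentq
from scipy.stats import binom, gamma

# r_nb is a hypothetical name for the negative binomial parameter r
r_nb, beta, theta = 5, 0.2, 1.0


def cdf_S(x: float) -> float:
    """P(S <= x) via the compound-binomial representation (assumed identity)."""
    q = beta / (1 + beta)        # binomial success probability
    scale = theta * (1 + beta)   # mean of the rescaled exponential severity
    total = binom.pmf(0, r_nb, q)  # atom at zero: no claims at all
    for m in range(1, r_nb + 1):
        # a sum of m exponentials with mean `scale` is Gamma(shape=m, scale=scale)
        total += binom.pmf(m, r_nb, q) * gamma.cdf(x, a=m, scale=scale)
    return float(total)


def var_S(p: float) -> float:
    """Smallest x with P(S <= x) >= p, found by root bracketing on [0, 100]."""
    if p <= cdf_S(0.0):
        return 0.0
    return float(brentq(lambda x: cdf_S(x) - p, 0.0, 100.0))


if __name__ == "__main__":
    for p in (0.95, 0.999):
        print(p, var_S(p))
```

If the identity holds as assumed, the printed quantiles give a reference against which the FFT output can be compared.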

I have also asked this question on the math stackexchange, since this question is very much on the intersection of programming and math and I have no idea at this moment whether the issue is on the implementation side of things or whether I am applying the mathematical ideas wrong.

import numpy as np
from scipy.stats import expon
from scipy.fft import fft, ifft

r, beta, theta = 5, .2, 1
var_levels = [.95, .999]


def discretize_X(h: float, m: int):
    X = expon(scale=theta)
    f_X = [X.cdf(h / 2),
           *[X.cdf(j * h + h / 2) - X.cdf(j * h - h / 2) for j in range(1, m - 1)],
           X.sf((m - 1) * h - h / 2)]
    return f_X


# Probability generating function of N ~ NB(r, beta)
def PGF(z: [float, complex]):
    return (1 - beta * (z - 1)) ** (-r)


h = 1e-2
n = 10
r = 2 ** n

VaRs, TVaRs = [], []

# discretize X with (r-1) cells of width h and one final cell with the survival function at h*(r-1)
f_X = discretize_X(h, r)
phi_vec = fft(f_X)
f_tilde_vec_fft = np.array([PGF(phi) for phi in phi_vec])
f_S = np.real(ifft(f_tilde_vec_fft))
ecdf_S = np.cumsum(f_S)  # calc cumsum to get ECDF

for p in var_levels:
    var_idx = np.where(ecdf_S >= p)[0][0]  # get lowest index where ecdf_S >= p
    print("p =", p, "\nVaR idx:", var_idx)
    var = h * var_idx  # VaR should be this index times the cell width
    print("VaR:", var)
    tvar = 1 / (1 - p) * np.sum(f_S[var_idx:] * np.array([i * h for i in range(var_idx, r)]))  # TVaR should be each cell's probability times the value inside that cell

    VaRs.append(var)
    TVaRs.append(tvar)

print(VaRs, TVaRs)


Comments (1)

输什么也不输骨气 2025-01-24 22:17:51

I'm not sure about the math, but in the snippet the variable r gets overwritten (by r = 2 ** n), so when f_tilde_vec_fft is computed, the function PGF uses 1024 for r rather than the intended 5. The fix is to rename r to r_nb in the parameter definitions:

r_nb, beta, theta = 5, .2, 1

and also in the function PGF:

return (1 - beta * (z - 1)) ** (-r_nb)

Running the code with the other parameters unchanged (h, n, etc.), I get [4.05, 9.06] for the VaRs.
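
For reference, the rename can be combined with the question's procedure into one compact script. This is a minimal sketch rather than the original poster's final code: the function names `pgf` and `fft_vars` are made up for illustration, the PGF is applied to the whole array at once instead of in a list comprehension, and TVaR is left out for brevity.

```python
import numpy as np
from scipy.fft import fft, ifft
from scipy.stats import expon

# NB parameter renamed to r_nb so the grid size can no longer overwrite it
r_nb, beta, theta = 5, 0.2, 1.0


def pgf(z):
    """PGF of N ~ NB(r_nb, beta); works elementwise on complex arrays."""
    return (1 - beta * (z - 1)) ** (-r_nb)


def fft_vars(h, n, levels=(0.95, 0.999)):
    """Approximate VaRs of S on a grid of 2**n cells of width h."""
    m = 2 ** n
    X = expon(scale=theta)
    # Cell j is centred at j*h; `upper` holds the upper edges j*h + h/2.
    upper = (np.arange(m) + 0.5) * h
    f_X = np.diff(X.cdf(upper), prepend=0.0)
    # As in the question, the last cell collects the remaining tail of X.
    f_X[-1] = X.sf(upper[-2])
    f_S = np.real(ifft(pgf(fft(f_X))))
    ecdf_S = np.cumsum(f_S)
    # First index where the ECDF reaches p, times the cell width
    return [float(h * np.argmax(ecdf_S >= p)) for p in levels]


if __name__ == "__main__":
    print(fft_vars(1e-2, 10))  # same grid as in the question
    print(fft_vars(1e-2, 13))  # longer grid (an assumption of this sketch)
```

The second call is there because the inverse FFT yields a circular convolution: with h = 0.01 and n = 10 the grid ends at about 10.2, so tail mass of S beyond that point wraps around to the low indices and pulls the high quantiles down. A longer grid should reduce that effect and move the 99.9% figure closer to the value of about 10 expected in the question.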
