如何编译计算 Hessian 矩阵的函数？

发布于 2024-12-17 00:48:27 字数 1622 浏览 1 评论 0原文

我希望了解如何编译计算对数似然的 Hessian 矩阵的函数，以便它可以有效地与不同的参数集一起使用。

这是一个例子。

假设我们有一个计算 Logit 模型的对数似然的函数，其中 y 是向量，x 是矩阵。 beta 是参数向量。

 pLike[y_, x_, beta_] :=
 Module[
  {xbeta, logDen},
  xbeta = x.beta;
  logDen = Log[1.0 + Exp[xbeta]];
  Total[y*xbeta - logDen]
  ]

给定以下数据，我们可以如下使用它

In[1]:= beta = {0.5, -1.0, 1.0};

In[2]:= xmat = 
  Table[Flatten[{1, 
     RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];

In[3]:= xbeta = xmat.beta;

In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);

In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;

In[6]:= Tally[y]

Out[6]= {{1, 313}, {0, 187}}

In[9]:= pLike[y, xmat, beta]

Out[9]= -272.721

我们可以将其 hessian 写成如下

 hessian[y_, x_, z_] :=
  Module[{},
   D[pLike[y, x, z], {z, 2}]
  ]


In[10]:= z = {z1, z2, z3}

Out[10]= {z1, z2, z3}

In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]

Out[11]= {0.1248040, Null}

In[12]:= AbsoluteTiming[
 Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]

Out[12]= {14.3524600, Null}

出于效率原因，我可以按如下方式编译原始似然函数，

pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
   Module[
    {xbeta, logDen},
    xbeta = x.beta;
    logDen = Log[1.0 + Exp[xbeta]];
    Total[y*xbeta - logDen]
    ],
   CompilationTarget -> "C", Parallelization -> True,  
   RuntimeAttributes -> {Listable}
   ];

它产生与 pLike 相同的答案

In[10]:= pLikeC[y, xmat, beta]

Out[10]= -272.721

我正在寻找一种简单的方法来获得类似的，a鉴于我多次评估它的兴趣，hessian 函数的编译版本。

原文

I am looking to see how a function that computes the Hessian of a log-likelihood can be compiled, so that it can be efficiently used with different sets of parameters.

Here is an example.

Suppose we have a function that computes the log-likelihood of a logit model, where y is vector and x is a matrix. beta is a vector of parameters.

 pLike[y_, x_, beta_] :=
 Module[
  {xbeta, logDen},
  xbeta = x.beta;
  logDen = Log[1.0 + Exp[xbeta]];
  Total[y*xbeta - logDen]
  ]

Given the following data, we can use it as follows

In[1]:= beta = {0.5, -1.0, 1.0};

In[2]:= xmat = 
  Table[Flatten[{1, 
     RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];

In[3]:= xbeta = xmat.beta;

In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);

In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;

In[6]:= Tally[y]

Out[6]= {{1, 313}, {0, 187}}

In[9]:= pLike[y, xmat, beta]

Out[9]= -272.721

We can write its hessian as follows

 hessian[y_, x_, z_] :=
  Module[{},
   D[pLike[y, x, z], {z, 2}]
  ]


In[10]:= z = {z1, z2, z3}

Out[10]= {z1, z2, z3}

In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]

Out[11]= {0.1248040, Null}

In[12]:= AbsoluteTiming[
 Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]

Out[12]= {14.3524600, Null}

For efficiency reasons, I can compile the original likelihood function as follows

pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
   Module[
    {xbeta, logDen},
    xbeta = x.beta;
    logDen = Log[1.0 + Exp[xbeta]];
    Total[y*xbeta - logDen]
    ],
   CompilationTarget -> "C", Parallelization -> True,  
   RuntimeAttributes -> {Listable}
   ];

which yields the same answer as pLike

In[10]:= pLikeC[y, xmat, beta]

Out[10]= -272.721

I am looking for an easy way to obtain similarly, a compiled version of the hessian function, given my interest in evaluating it many times.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

好倦 2024-12-24 00:48:27

列昂尼德已经打败了我，但无论如何我都会发布我的想法只是为了搞笑。

这里的主要问题是编译适用于数值函数，而 D 需要符号。因此，诀窍是首先使用与您打算使用的特定大小的矩阵所需的相同数量的变量来定义 pLike 函数，例如

pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}]

在此处输入图像描述

Hessian:

D[pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}], {{z1, z2, z3}, 2}]

在此处输入图像描述

该函数应该是可编译的，因为它依赖于仅数字数量。

要概括各种向量，可以构建如下所示的内容：

Block[{ny = 2, nx = 3, z1, z2, z3},
   hessian[
      Table[ToExpression["y" <> ToString[i] <> "_"], {i, ny}], 
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j] <> "_"], 
           {i, ny}, {j, nx}], {z1_, z2_, z3_}
   ] =
   D[
     pLike[
        Table[ToExpression["y" <> ToString[i]], {i, ny}], 
        Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], 
             {i, ny}, {j, nx}], {z1, z2, z3}
        ], 
     {{z1, z2, z3}, 2}
   ]
 ]

在此处输入图像描述

这当然可以轻松概括为变量恩克斯和纽约。

现在是编译部分。这是一段丑陋的代码，由上面的代码和编译组成，并适合可变的 y 大小。比起我的代码，我更喜欢 ruebenko 的代码。

ClearAll[hessianCompiled];
Block[{z1, z2, z3},
 hessianCompiled[yd_] :=
  (hessian[
     Table[ToExpression["y" <> ToString[i] <> "_"], {i, yd}], 
     Table[ToExpression["xr" <> ToString[i]<>"c"<>ToString[j] <>"_"],{i,yd},{j,3}],
     {z1_, z2_, z3_}
     ] =
    D[
     pLike[
      Table[ToExpression["y" <> ToString[i]], {i, yd}],
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], {i,yd},{j,3}],
      {z1, z2, z3}
     ], {{z1, z2, z3}, 2}
    ];
   Compile[{{y, _Real, 1}, {x, _Real, 2}, {z, _Real, 1}}, 
    hessian[Table[y[[i]], {i, yd}], Table[x[[i, j]], {i, yd}, {j, 3}],
      Table[z[[i]], {i, 3}]]]// Evaluate] // Quiet
   )
 ]

hessianCompiled[500][y, xmat, beta] // Timing 

{1.497, {{-90.19295669, -15.80180276, 6.448357845}, 
        {-15.80180276, -80.41058154, -26.33982586},
        {6.448357845, -26.33982586, -72.92978931}}}

ruebenko's version (including my edits):

(cf = mkCHessian[500, 3]; cf[y, xmat, beta]) // Timing

{1.029, {{-90.19295669, -15.80180276, 6.448357845}, 
         {-15.80180276, -80.41058154, -26.33982586}, 
         {6.448357845, -26.33982586, -72.92978931}}}

请注意，这两个测试都包括编译时间。单独运行计算：

h = hessianCompiled[500];
Do[h[y, xmat, beta], {100}]; // Timing
Do[cf[y, xmat, beta], {100}]; // Timing

(* timing for 100 hessians: 

   ==> {0.063, Null}

   ==> {0.062, Null}
*)

Leonid already beat me, but I'll post my line of thought anyway just for laughs.

The main problem here is that compilation works for numerical functions whereas D needs symbolics. So the trick would be to first define the pLike function with the same amount of variables as needed for the particular size of matrices you intend to use, e.g,

pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}]

enter image description here

The Hessian:

D[pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}], {{z1, z2, z3}, 2}]

enter image description here

This function should be compilable as it depends on numerical quantities only.

To generalize for various vectors one could build something like this:

Block[{ny = 2, nx = 3, z1, z2, z3},
   hessian[
      Table[ToExpression["y" <> ToString[i] <> "_"], {i, ny}], 
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j] <> "_"], 
           {i, ny}, {j, nx}], {z1_, z2_, z3_}
   ] =
   D[
     pLike[
        Table[ToExpression["y" <> ToString[i]], {i, ny}], 
        Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], 
             {i, ny}, {j, nx}], {z1, z2, z3}
        ], 
     {{z1, z2, z3}, 2}
   ]
 ]

enter image description here

And this can of course be easily generalized for variable nx and ny.

And now for the Compile part. It's an ugly piece of code, consisting of the above and a compile and made suitable for variable y size. I like ruebenko's code more than mine.

ClearAll[hessianCompiled];
Block[{z1, z2, z3},
 hessianCompiled[yd_] :=
  (hessian[
     Table[ToExpression["y" <> ToString[i] <> "_"], {i, yd}], 
     Table[ToExpression["xr" <> ToString[i]<>"c"<>ToString[j] <>"_"],{i,yd},{j,3}],
     {z1_, z2_, z3_}
     ] =
    D[
     pLike[
      Table[ToExpression["y" <> ToString[i]], {i, yd}],
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], {i,yd},{j,3}],
      {z1, z2, z3}
     ], {{z1, z2, z3}, 2}
    ];
   Compile[{{y, _Real, 1}, {x, _Real, 2}, {z, _Real, 1}}, 
    hessian[Table[y[[i]], {i, yd}], Table[x[[i, j]], {i, yd}, {j, 3}],
      Table[z[[i]], {i, 3}]]]// Evaluate] // Quiet
   )
 ]

hessianCompiled[500][y, xmat, beta] // Timing 

{1.497, {{-90.19295669, -15.80180276, 6.448357845}, 
        {-15.80180276, -80.41058154, -26.33982586},
        {6.448357845, -26.33982586, -72.92978931}}}

ruebenko's version (including my edits):

(cf = mkCHessian[500, 3]; cf[y, xmat, beta]) // Timing

{1.029, {{-90.19295669, -15.80180276, 6.448357845}, 
         {-15.80180276, -80.41058154, -26.33982586}, 
         {6.448357845, -26.33982586, -72.92978931}}}

Note that both tests include compilation time. Running the calculation on its own:

h = hessianCompiled[500];
Do[h[y, xmat, beta], {100}]; // Timing
Do[cf[y, xmat, beta], {100}]; // Timing

(* timing for 100 hessians: 

   ==> {0.063, Null}

   ==> {0.062, Null}
*)

回复收藏 0 原文

南笙 2024-12-24 00:48:27

这是基于之前帖子的一个想法：我们以符号方式构造编译的输入。

mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
  With[{
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
   yi = Quiet[Part[y, #] & /@ Range[ys]],
   xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
   betai = Quiet[Part[beta, #] & /@ Range[bs]]
   },
  Print[args];
  Print[yi];
  Print[xi];
  Print[betai];
  Compile[Evaluate[args], 
   Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
  ]

然后生成编译后的函数。

cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];

然后你调用那个编译函数

cf[y, xmat, beta]

请验证我没有犯错误；在de Vries的帖子中，y的长度是2。我的长度是3。我确定什么是正确的。当然，印刷品仅供说明之用...

更新
尺寸处理略有改进且变量已本地化的版本：

ClearAll[mkCHessian];
mkCHessian[ys_Integer, bs_Integer] :=
 Module[
   {beta, x, y, args, xi, yi, betai},
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
   yi = Quiet[Part[y, #] & /@ Range[ys]];
   xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
   betai = Quiet[Part[beta, #] & /@ Range[bs]];
   Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
 ]

现在，使用 In[1] 到 In[5] 中的 asim 定义：

cf = mkCHessian[500, 3];
cf[y, xmat, beta]

(* ==> {{-8.852446923, -1.003365612, 1.66653381}, 
       {-1.003365612, -5.799363241, -1.277665283},
       {1.66653381, -1.277665283, -7.676551252}}  *)

由于 y 是随机向量，因此结果会有所不同。

Here is an idea based on the previous post(s): We construct the input to Compile symbolically.

mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
  With[{
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
   yi = Quiet[Part[y, #] & /@ Range[ys]],
   xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
   betai = Quiet[Part[beta, #] & /@ Range[bs]]
   },
  Print[args];
  Print[yi];
  Print[xi];
  Print[betai];
  Compile[Evaluate[args], 
   Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
  ]

And then generate the compiled function.

cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];

You then call that compiled function

cf[y, xmat, beta]

Please verify that I did not make a mistake; in de Vries's post y is of length 2. Mine is length 3. I am sure what is correct. And of course, the Print are for illustration...

Update
A version with slightly improved dimension handling and with variables localized:

ClearAll[mkCHessian];
mkCHessian[ys_Integer, bs_Integer] :=
 Module[
   {beta, x, y, args, xi, yi, betai},
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
   yi = Quiet[Part[y, #] & /@ Range[ys]];
   xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
   betai = Quiet[Part[beta, #] & /@ Range[bs]];
   Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
 ]

Now, with asim's definitions in In[1] to In[5]:

cf = mkCHessian[500, 3];
cf[y, xmat, beta]

(* ==> {{-8.852446923, -1.003365612, 1.66653381}, 
       {-1.003365612, -5.799363241, -1.277665283},
       {1.66653381, -1.277665283, -7.676551252}}  *)

Since y is a random vector results will vary.

回复收藏 0 原文

~没有更多了~