numpy: code for updating a least-squares fit with more observations

Posted 2024-11-06 09:17:19

I am looking for a numpy-based implementation of ordinary least squares that would allow the fit to be updated with more observations. Something along the lines of Applied Statistics algorithm AS 274 or R's biglm.

Failing that, a routine for updating a QR decomposition with new rows would also be of interest.

Any pointers?

Comments (4)

梦幻的味道 2024-11-13 09:17:19

scikits.statsmodels has a recursive OLS in the sandbox that updates the inverse of X'X and could be used for this. (It is currently used only to calculate recursive OLS residuals.)
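
To illustrate the idea of updating the inverse of X'X one observation at a time, here is a minimal numpy sketch based on the Sherman-Morrison formula. It is not the statsmodels sandbox code; the class name `RecursiveOLS` and its methods are made up for this example, and it assumes the initial block of rows has full column rank.

```python
import numpy as np

class RecursiveOLS:
    """Hypothetical helper: keeps P = inv(X'X) and beta, updated row by row."""

    def __init__(self, X0, y0):
        # Assumption: X0 has at least as many rows as columns and full rank.
        X0 = np.asarray(X0, dtype=float)
        y0 = np.asarray(y0, dtype=float)
        self.P = np.linalg.inv(X0.T @ X0)
        self.beta = self.P @ (X0.T @ y0)

    def update(self, x, y):
        """Add one observation (row x, response y); return the recursive residual."""
        x = np.asarray(x, dtype=float)
        Px = self.P @ x
        denom = 1.0 + x @ Px
        resid = y - x @ self.beta            # one-step-ahead prediction error
        self.beta = self.beta + Px * (resid / denom)
        # Sherman-Morrison: inv(A + x x') = P - (P x)(P x)' / (1 + x' P x)
        self.P = self.P - np.outer(Px, Px) / denom
        return resid / np.sqrt(denom)        # recursive residual, not scaled by sigma
```

Updating the inverse directly is cheap, but it inherits the conditioning problems of the normal equations, so it is probably best suited to well-conditioned designs.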

Nathaniel Smith posted his OLS code for data that is too large to fit in memory to the scipy-user mailing list. The main code updates X'X.
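
The "update X'X" approach can be sketched in a few lines of numpy. This is not Nathaniel Smith's code, only an illustration of the technique; the function name `fit_in_chunks` and its arguments are assumptions made for this example.

```python
import numpy as np

def fit_in_chunks(chunks, n_features):
    """Accumulate X'X and X'y over (X, y) chunks, then solve the normal equations.

    `chunks` is any iterable of (X, y) pairs, so the full data set never
    has to be in memory at once.
    """
    XtX = np.zeros((n_features, n_features))
    Xty = np.zeros(n_features)
    for X, y in chunks:
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        XtX += X.T @ X
        Xty += X.T @ y
    beta = np.linalg.solve(XtX, Xty)
    # Returning the accumulators lets the caller keep adding chunks later.
    return beta, XtX, Xty
```

Only the p-by-p matrix X'X and the length-p vector X'y are stored, so memory use does not grow with the number of observations. The price is that forming X'X squares the condition number of the problem.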

I think econpy also has a function for this.

Pandas has an expanding OLS, but it may not be easy to use in an online fashion.

Nathaniel's code might be the closest to biglm. I don't think there is anything for the general linear model (error covariance different from the identity).

All of these need some work before they can be used for this. I don't know of any Python(-wrapped) code that would update a QR decomposition.
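
For reference, a QR-based row update can be approximated with plain numpy by re-triangularizing the old triangular factor stacked on top of the new rows. This is a sketch of that general technique, not an implementation of AS 274 or biglm, and the function names `qr_update` and `coef_and_rss` are hypothetical.

```python
import numpy as np

def qr_update(R_aug, X_new, y_new):
    """Fold new rows into the triangular factor of the augmented matrix [X | y].

    R_aug is the factor from previous calls, or None on the first call.
    """
    block = np.column_stack([np.asarray(X_new, dtype=float),
                             np.asarray(y_new, dtype=float)])
    stacked = block if R_aug is None else np.vstack([R_aug, block])
    # mode="r" returns only the triangular factor; Q is never formed.
    return np.linalg.qr(stacked, mode="r")

def coef_and_rss(R_aug):
    """Recover coefficients and residual sum of squares from R_aug.

    Assumes at least p + 1 observations have been folded in, so that
    R_aug has shape (p + 1, p + 1), and that X has full column rank.
    """
    p = R_aug.shape[1] - 1
    beta = np.linalg.solve(R_aug[:p, :p], R_aug[:p, p])
    rss = R_aug[p, p] ** 2
    return beta, rss
```

The state kept between updates is a single (p + 1)-by-(p + 1) triangle, and working with a factor of X rather than with X'X avoids squaring the condition number. Each update costs a QR of a (p + 1 + m)-by-(p + 1) matrix, where m is the number of new rows.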

Update:
see http://mail.scipy.org/pipermail/scipy-dev/2010-February/013853.html

Incremental QR and Cholesky are available in CHOLMOD, but I didn't try them, because of either license or Windows compilation problems, and I don't think I tried to get incremental_qr to work.
See the attachments:

http://mail.scipy.org/pipermail/scipy-dev/2010-February/013844.html

玩物 2024-11-13 09:17:19

You might try the pythonequations project at http://code.google.com/p/pythonequations/downloads/list. It may be more than you need, but it does use scipy and numpy. That code is the middleware for the http://zunzun.com online curve and surface fitting web site (I'm the author). The source code comes with many examples. Alternatively, the web site alone may be sufficient; please give it a try.

 James Phillips
 2548 Vera Cruz Drive
 Birmingham, AL  35235  USA

许你一世情深 2024-11-13 09:17:19

This is not a detailed answer yet, but:

AFAIK, a QR update like this is not implemented in numpy, but in any case I'd like to ask you to specify in more detail what you are actually aiming for.

In particular, why would it not be acceptable to just calculate a new estimate for x (of Ax = b) from the k latest observations whenever a (bunch of) new observations arrive? With modern hardware, k can indeed be quite large.
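
To make the suggestion concrete, here is a minimal sketch that keeps only the k most recent observations and refits from scratch with `np.linalg.lstsq`. The class name `WindowedOLS` and its methods are invented for this example. Note that this gives a fit on the window, not on all data seen so far.

```python
from collections import deque

import numpy as np

class WindowedOLS:
    """Hypothetical helper: refit Ax = b on the k most recent observations."""

    def __init__(self, k):
        # deque(maxlen=k) silently drops the oldest entry once k is exceeded.
        self.rows = deque(maxlen=k)
        self.targets = deque(maxlen=k)

    def add(self, A_new, b_new):
        """Append one or more observations (rows of A_new, entries of b_new)."""
        for a, b in zip(np.atleast_2d(A_new), np.atleast_1d(b_new)):
            self.rows.append(np.asarray(a, dtype=float))
            self.targets.append(float(b))

    def fit(self):
        """Solve the least-squares problem on the current window."""
        A = np.vstack(list(self.rows))
        b = np.array(self.targets)
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        return x
```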

浅笑轻吟梦一曲 2024-11-13 09:17:19

The LSQ.F90 part of the file compiles easily enough with:

gfortran-4.4 -shared -fPIC -g -o lsq.so LSQ.F90

and this works in Python:

from ctypes import cdll

lsq = cdll.LoadLibrary('./lsq.so')

As soon as I figure out the function call I'll include it in this answer.
