Segmented linear regression in Python
Is there a library in Python to do segmented linear regression?
I'd like to fit multiple lines to my data automatically to get something like this:
By the way, I do know the number of segments.
Comments (4)
NumPy's numpy.piecewise() tool can probably be used. A more detailed description is given here:
How to apply piecewise linear fit in Python?
If this is not what is needed, then you may probably find some helpful information in these questions:
https://datascience.stackexchange.com/questions/8266/is-there-a-library-that-would-perform-segmented-linear-regression-in-python
and here:
https://datascience.stackexchange.com/questions/8457/python-library-for-segmented-regression-a-k-a-piecewise-regression
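As a rough illustration of the numpy.piecewise() suggestion, here is a minimal sketch for the two-segment case. The function names two_segment_model and fit_two_segments are made up for this example, and the breakpoint is found by a simple grid search with np.linalg.lstsq rather than with a nonlinear optimizer, so treat it as one possible approach under those assumptions.

```python
import numpy as np


def two_segment_model(x, x0, y0, k1, k2):
    """Continuous two-segment line: slope k1 left of x0, slope k2 right of it,
    passing through (x0, y0). x is assumed to be a float array."""
    return np.piecewise(
        x,
        [x < x0, x >= x0],
        [lambda t: y0 + k1 * (t - x0), lambda t: y0 + k2 * (t - x0)],
    )


def fit_two_segments(x, y, candidates):
    """Try every candidate breakpoint and keep the one with the smallest
    sum of squared errors. Hypothetical helper, not part of NumPy."""
    best = None
    for x0 in candidates:
        # For a fixed breakpoint the model is linear in (y0, k1, k2).
        design = np.column_stack(
            [np.ones_like(x), np.minimum(x - x0, 0.0), np.maximum(x - x0, 0.0)]
        )
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(x0), *(float(c) for c in coef))
    _, x0, y0, k1, k2 = best
    return x0, y0, k1, k2
```

Calling fit_two_segments(x, y, candidates=x[5:-5]) restricts the search to interior data points; with more than two segments the grid search grows quickly, which is where a dedicated library becomes more practical.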
As mentioned in a comment above, segmented linear regression brings the problem of many free parameters. I therefore decided to move away from an approach that uses n_segments * 3 - 1 parameters (i.e. n_segments - 1 segment positions, n_segments y-offsets, n_segments slopes) and performs numerical optimization. Instead, I look for regions that already have a roughly constant slope.
Algorithm
A decision tree is used instead of a clustering algorithm to get connected segments rather than a set of (non-neighboring) points. The details of the segmentation can be adjusted via the decision tree's parameters (currently max_leaf_nodes).
Code
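The original code is not included here. What follows is only a minimal sketch of the idea described above, assuming scikit-learn's DecisionTreeRegressor and sorted x values: estimate the local slope with np.gradient, let the tree split x into regions of similar slope, then fit a line per region. The function name segment_by_slope and the use of np.polyfit for the per-segment fit are assumptions made for this illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def segment_by_slope(xs, ys, n_segments):
    """Split sorted 1-D data into regions of roughly constant slope and
    fit one straight line per region. Hypothetical helper for illustration."""
    # Numerical derivative dy/dx at each sample (xs must be sorted).
    dys = np.gradient(ys, xs)

    # The tree partitions the x axis into at most n_segments contiguous intervals.
    tree = DecisionTreeRegressor(max_leaf_nodes=n_segments)
    tree.fit(xs.reshape(-1, 1), dys)

    # apply() returns the leaf index of each sample, i.e. its segment label.
    leaf_ids = tree.apply(xs.reshape(-1, 1))

    segments = []
    for leaf in np.unique(leaf_ids):
        mask = leaf_ids == leaf
        slope, intercept = np.polyfit(xs[mask], ys[mask], 1)
        segments.append(
            (float(xs[mask].min()), float(xs[mask].max()), float(slope), float(intercept))
        )
    # Each entry is (x_start, x_end, slope, intercept), ordered along x.
    return sorted(segments)
```

Because the split is driven by a numerical derivative, noisy data will likely need smoothing before this gives sensible segments.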
You just need to sort X in ascending order and fit several linear regressions, one per segment. You can use LinearRegression from sklearn.
For example, dividing the curve in two would look something like this:
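The snippet from the original post is missing, so here is a minimal sketch of the idea under stated assumptions: the data is split at the middle index after sorting unless a split_index is given, and fit_two_halves is a hypothetical name chosen for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_two_halves(x, y, split_index=None):
    """Sort by x, cut the data in two, and fit one LinearRegression per part."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]

    # Assumption: split in the middle unless the caller picks another index.
    if split_index is None:
        split_index = len(x_sorted) // 2

    models = []
    for part in (slice(None, split_index), slice(split_index, None)):
        model = LinearRegression()
        # sklearn expects a 2-D feature array, hence the reshape.
        model.fit(x_sorted[part].reshape(-1, 1), y_sorted[part])
        models.append(model)
    return models
```

Each returned model exposes coef_ and intercept_, so the two fitted lines can be compared or plotted directly. Note that the two lines are fitted independently and are not forced to meet at the split.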
I did a similar implementation, here is the code:
https://github.com/mavaladezt/Segmented-Algorithm
There is the piecewise-regression Python library for doing exactly this. GitHub link.
Simple example with 1 breakpoint.
For a demonstration, first generate some example data:
Then fit a piecewise model:
And plot it:
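The code for these three steps did not survive on this page. The sketch below shows roughly what they might look like with the piecewise-regression package (installed via pip install piecewise-regression). The slopes, intercept, breakpoint position, and the helper names make_example_data and fit_and_plot are assumptions chosen for illustration, not the values from the original post.

```python
import numpy as np


def make_example_data(n_points=200, breakpoint_x=7.0, noise_scale=1.0, seed=0):
    """Noisy data with one slope change at breakpoint_x (illustrative values)."""
    rng = np.random.default_rng(seed)
    xx = np.linspace(0, 20, n_points)
    # Slope -4 before the breakpoint, -2 after it, intercept 100.
    yy = 100 - 4 * xx + 2 * np.maximum(xx - breakpoint_x, 0)
    yy = yy + rng.normal(scale=noise_scale, size=n_points)
    return xx, yy


def fit_and_plot(xx, yy, n_breakpoints=1):
    """Fit a piecewise model and plot it. The imports are local so that the
    data generation above runs even without the optional packages."""
    import matplotlib.pyplot as plt
    import piecewise_regression

    pw_fit = piecewise_regression.Fit(xx, yy, n_breakpoints=n_breakpoints)
    pw_fit.summary()  # prints estimates for breakpoints, slopes and intercept

    pw_fit.plot_data(color="grey", s=20)
    pw_fit.plot_fit(color="red", linewidth=4)
    pw_fit.plot_breakpoints()
    pw_fit.plot_breakpoint_confidence_intervals()
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()
    return pw_fit
```

A typical call would be xx, yy = make_example_data() followed by fit_and_plot(xx, yy, n_breakpoints=1).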
Example 2: 4 breakpoints.
Now let's look at some data that is similar to the original question, with 4 breakpoints.
Fit the model and plot it:
Code examples in this Google Colab notebook