Rolling median in Python
I have some stock data based on daily close values. I need to be able to insert these values into a Python list and get the median of the last 30 closes. Is there a Python library that does this?
Comments (5)
In pure Python, with your data in a Python list `a`, you could sort the last 30 items and take the middle of the sorted slice. (This assumes `a` has at least 30 items.) Using the NumPy package, you could use its median function on the same slice.
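As a minimal sketch of both variants: the sample values below are made up, and the variable names `median_pure` and `median_np` are only illustrative. With an even window of 30, the two middle elements of the sorted slice are averaged.

```python
import numpy as np

# Hypothetical sample data: 40 made-up closing prices, not real quotes.
a = [100.0 + 0.5 * i for i in range(40)]

# Pure Python: sort the last 30 values and average the two middle ones.
# With an even window of 30, the middle elements sit at indices 14 and 15.
window = sorted(a[-30:])
median_pure = (window[14] + window[15]) / 2.0

# NumPy: np.median does the sorting and the even-length averaging itself.
median_np = float(np.median(a[-30:]))

print(median_pure, median_np)
```

Both values should agree; the NumPy call is shorter and does not depend on hard-coded indices.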
Have you considered pandas? It is based on `numpy` and can automatically associate timestamps with your data, and it discards any unknown dates as long as you fill them with `numpy.nan`. It also offers some rather powerful graphing via matplotlib. Basically, it was designed for financial analysis in Python.
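For the question at hand, a rolling median in pandas could look roughly like this. It is a sketch under assumptions: the dates and prices are invented, and `closes` and `rolling_median` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 60 business days of made-up closing prices.
dates = pd.date_range("2024-01-01", periods=60, freq="B")
closes = pd.Series(np.linspace(100.0, 159.0, 60), index=dates)

# Median over a trailing 30-row window. The first 29 entries are NaN
# because a full window is not yet available at those positions.
rolling_median = closes.rolling(window=30).median()

# Median of the most recent 30 closes.
latest = rolling_median.iloc[-1]
print(latest)
```

If partial windows at the start are acceptable, `rolling` also takes a `min_periods` argument.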
Isn't the median just the middle value in a sorted range?
So, assuming your list is `stock_data`, you would sort its last 30 entries and take the middle one. Now you just need to find and fix the off-by-one errors, and also handle the case of `stock_data` having fewer than 30 elements... Let us try that here a bit:
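One way those two details might be handled is sketched below. The helper name `last_n_median` is hypothetical, and falling back to a shorter window when fewer than `n` values exist is an assumption, not something the answer specifies.

```python
def last_n_median(stock_data, n=30):
    """Median of the last n values; uses fewer if the list is shorter."""
    # Slicing with [-n:] returns the whole list when it has fewer than n items.
    window = sorted(stock_data[-n:])
    if not window:
        raise ValueError("stock_data is empty")
    mid = len(window) // 2
    if len(window) % 2 == 1:
        # Odd count: a single middle element exists.
        return window[mid]
    # Even count: average the two elements around the midpoint.
    return (window[mid - 1] + window[mid]) / 2.0


print(last_n_median([4, 1, 3, 2]))
print(last_n_median(list(range(100))))
```

The standard library's `statistics.median` applies the same odd/even rule and could replace the manual indexing.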
# Found this helpful:
While the other answers are correct, a rolling median that calls `np.median` inside a loop carries a large overhead. Here is a much faster method with `w*|x|` space complexity.
The output has the same length as the input vector.
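The shifted-matrix idea can be sketched roughly as follows. This is an illustration, not the answer's own code: the name `moving_median_sketch` is hypothetical, and it assumes `np.nanmedian` for the edge rows, so values near the edges may differ from the behaviour the answer describes.

```python
import numpy as np


def moving_median_sketch(x, w):
    """Centered moving median via a NaN-padded matrix of shifted copies."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Row i holds x[i - w + 1 .. i]; cells that fall outside the input
    # stay NaN. The matrix needs roughly w * len(x) floats.
    shifted = np.full((n + w - 1, w), np.nan)
    for k in range(w):
        shifted[k:k + n, k] = x
    # nanmedian ignores the NaN padding, so edge rows use a shorter window.
    medians = np.nanmedian(shifted, axis=1)
    # Drop the w - 1 extra rows so the output matches the input length.
    start = (w - 1) // 2
    return medians[start:start + n]


print(moving_median_sketch([1, 5, 2, 8, 3], 3))
```

The only Python-level loop runs `w` times to fill the columns; the median itself is computed in a single vectorized call.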
Rows with fewer than one entry are ignored. For rows where half of the entries are NaN (which happens only with an even window width), only the first option is returned. Here is the `shifted_matrix` from above with the respective median values:
The behaviour can be changed by adapting the final slice `medians[(w-1)//2:-(w-1)//2]`.
Benchmark:
Alternative approach (the results will be shifted):
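One possible way to build such an alternative, not necessarily the one used in this answer, is NumPy's `sliding_window_view` (available from NumPy 1.20). The function name `windowed_median_sketch` is hypothetical. Only full windows are evaluated, so the output is shorter than the input and its entries are shifted relative to a centered median.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windowed_median_sketch(x, w):
    """Median of every full window of width w (no edge handling)."""
    x = np.asarray(x, dtype=float)
    # Each row is a read-only view of w consecutive values of x.
    windows = sliding_window_view(x, w)
    # The output has len(x) - w + 1 entries; entry i covers x[i : i + w].
    return np.median(windows, axis=1)


print(windowed_median_sketch([1, 5, 2, 8, 3], 3))
```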
Both algorithms have linear time complexity.
Therefore, the function `moving_median` will be the faster option.