Monte Carlo continuation of a multi-column pandas time series
I have a bunch of data points in a time series in a pandas DataFrame. The columns are supposedly independent of one another. I want to create a Monte Carlo process to calculate expected values for each of the columns. For that, my assumption is that the underlying data follows a Brownian motion pattern, so I'd need to generate normally distributed values for the differences between consecutive points in time.
I transform my data like this:
diffs = (data.diff() / data.shift(1))
This is what I have at the moment:
data = diffs.describe()
This gives the following output:
                 A            B            C
count  4986.000000  4963.000000  1861.000000
mean      0.000285     0.000109     0.000421
std       0.015759     0.015426     0.014676
...
I process it like this to generate more samples:
import numpy as np
desired_samples = 1000
random = np.random.default_rng().normal(loc=[data.loc[["mean"]].to_numpy()], scale=[data.loc[["std"]].to_numpy()], size=[len(data.columns), desired_samples])
However this gives me an error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (441, 1000) and arg 1 with shape (1, 1, 441).
What I'd want is just a matrix of random values whose columns have the same std and mean as the sample's columns, i.e. such that when I do random.describe(), I'd get something like:
            A         B         C
count  1000.0    1000.0    1000.0
mean   0.000285  0.000109  0.000421
std    0.015759  0.015426  0.014676
...
What'd be the correct way to generate those samples?
Comments (1)
You could use apply() to create a data frame of random normal values using the attributes of the associated columns.
Generate Test Data
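The answer's original test-data snippet did not survive on this page, so the following is only a stand-in sketch. It assumes three columns named A, B and C whose relative differences are normally distributed with parameters borrowed from the describe() output in the question; the names true_mean, true_std, n_rows and stats are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical per-column parameters, taken from the question's describe() output
true_mean = {"A": 0.000285, "B": 0.000109, "C": 0.000421}
true_std = {"A": 0.015759, "B": 0.015426, "C": 0.014676}

n_rows = 5000  # arbitrary length for the synthetic series of differences

# One normally distributed column per name, standing in for the question's diffs
diffs = pd.DataFrame(
    {col: rng.normal(true_mean[col], true_std[col], n_rows) for col in true_mean}
)

# Same summary table the question works from
stats = diffs.describe()
print(stats.loc[["count", "mean", "std"]])
```

The printed mean and std should land close to the chosen parameters, though not exactly on them, because they are estimated from a finite sample.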
Generate Random Values with approximately the same (calculated) Mean and STD
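The code for this step is also missing from the page. One way the apply() idea might look is sketched below; it is an assumption about the approach, not the answerer's exact code. It rebuilds a small synthetic diffs frame so that it runs on its own, and the names stats and desired_samples are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic stand-in for the question's diffs (assumed normal, parameters from the question)
diffs = pd.DataFrame({
    "A": rng.normal(0.000285, 0.015759, 5000),
    "B": rng.normal(0.000109, 0.015426, 5000),
    "C": rng.normal(0.000421, 0.014676, 5000),
})
stats = diffs.describe()
desired_samples = 1000

# apply() visits stats one column at a time; each column is a Series indexed by
# "count", "mean", "std", ..., so the column's own mean and std can be read off it.
random = stats.apply(
    lambda col: pd.Series(
        rng.normal(loc=col["mean"], scale=col["std"], size=desired_samples)
    )
)

print(random.describe().loc[["count", "mean", "std"]])
```

Because every column is sampled separately, there is no broadcasting between the mean and std arrays to get wrong. The sampled mean and std only approximate the source values; they will not match to the last digit.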
If you want the matrix to be numpy
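The snippet that followed here is missing as well. As a sketch under the same synthetic-data assumption, the matrix can be drawn directly with NumPy by passing 1-D loc and scale arrays and putting the column count on the last axis of size. That ordering is what lets broadcasting succeed where the call in the question failed: there, loc and scale had shape (1, 1, 441) while size was (441, 1000).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Synthetic stand-in for the question's diffs (assumed normal, parameters from the question)
diffs = pd.DataFrame({
    "A": rng.normal(0.000285, 0.015759, 5000),
    "B": rng.normal(0.000109, 0.015426, 5000),
    "C": rng.normal(0.000421, 0.014676, 5000),
})
stats = diffs.describe()
desired_samples = 1000

# Single brackets give a 1-D Series, so these arrays have shape (n_columns,)
mean = stats.loc["mean"].to_numpy()
std = stats.loc["std"].to_numpy()

# size is (rows, columns): the trailing axis matches the 1-D loc/scale arrays,
# so each column is drawn with its own mean and std.
matrix = rng.normal(loc=mean, scale=std, size=(desired_samples, len(stats.columns)))

print(matrix.shape)
```

If the samples already live in a DataFrame, calling to_numpy() on it gives the same kind of array; wrapping matrix in pd.DataFrame(matrix, columns=stats.columns) goes the other way.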