Random values consistent with a curve of best fit
I'm contemplating the generation of test data with interesting distributions.
I understand methods for the generation of uniform distribution and normal distribution, but how can I transform an arbitrary function into a weighted distribution function? My terminology may be off here - I won't mind corrections.
For example, let's say that I have a function over time which generally increases, but cycles periodically: "Activity" that generally increases over a year, but cycles weekly, with a sharp falloff on the weekends.
The function could be algebraic, but it would be valuable if it could be any function (imperative(?) with discrete/discontinuous ranges(?)).
If the Activity curve from the example is f(t), I could just make f(t) the mean and provide a fixed standard deviation, but how do I choose t if it too needs a distribution? I don't want to have to iterate through T, I just want to select among T randomly with the appropriate distribution.
So the TestActivityGenerator() function takes parameters for, say, a curve over an absolute date range, another curve over the days of the week, and another curve over the hours of the day, and spits out DateTimes in the proper distributions. Results are not generated in any specific order.
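
To make the goal concrete, here is a minimal sketch of one general technique that seems to fit, rejection sampling: pick t uniformly, then keep it with probability proportional to f(t). Everything named here is an assumption for illustration only: the `activity` curve, its weekend factor of 0.2, the bound `f_max`, and the `sample_datetimes` helper are hypothetical, not an existing API.

```python
import random
from datetime import datetime, timedelta

def activity(t: datetime, start: datetime, end: datetime) -> float:
    """Hypothetical unnormalised weight f(t): a rising trend times a weekly cycle."""
    progress = (t - start) / (end - start)      # 0.0 at start, 1.0 at end
    trend = 1.0 + progress                      # grows from 1 to 2 over the range
    weekly = 0.2 if t.weekday() >= 5 else 1.0   # assumed sharp falloff on Sat/Sun
    return trend * weekly

def sample_datetimes(n, start, end, f=activity, f_max=2.0, rng=random):
    """Draw n datetimes with density proportional to f, by rejection sampling.

    f_max must be an upper bound on f over [start, end]; a loose bound is
    still correct, it only causes more candidates to be rejected.
    """
    span = (end - start).total_seconds()
    out = []
    while len(out) < n:
        # Propose a candidate uniformly over the whole range.
        t = start + timedelta(seconds=rng.uniform(0.0, span))
        # Accept it with probability f(t) / f_max.
        if rng.uniform(0.0, f_max) < f(t, start, end):
            out.append(t)
    return out
```

The function f can be any callable, including one with discontinuities, as long as an upper bound is known. Results come out in no particular order, and nothing iterates over T.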
Another scenario might be: a generator of reals which is, say, 1.652 times more likely to spit out a prime number than a composite. No tricks on this one - there are trivial ways to do this, but I'm looking for a general solution.
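
For reference, the kind of trivial approach meant here might look like the sketch below. It assumes integers in a small range rather than reals, and it reads "1.652 times more likely" as a per-value weight (each prime gets weight 1.652, each composite weight 1.0). The names `is_prime` and `weighted_integers` are hypothetical.

```python
import random

def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def weighted_integers(k, lo=2, hi=100, prime_weight=1.652, rng=random):
    """Draw k integers from [lo, hi], each prime weighted prime_weight : 1.0."""
    values = list(range(lo, hi + 1))
    weights = [prime_weight if is_prime(v) else 1.0 for v in values]
    return rng.choices(values, weights=weights, k=k)
```

This works only because the candidate values can be enumerated and weighted one by one, which is exactly what a general solution should not require.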
Thanks!
Edit: I've changed the wording of the title to look at the problem from a different angle: how can we backtrack from a curve of best fit to random samples that are consistent with that curve? If I have a histogram of stock market data, how can I generate data that is distributed similarly to the real data? Not just pairwise values that average to the same value for each t, because they would fail other randomness tests.
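
As a starting point for the histogram case, a minimal sketch: choose a bin with probability proportional to its count, then draw uniformly inside that bin. The `sample_from_histogram` helper and its `edges`/`counts` arguments are hypothetical names, and the uniform-within-bin step is an assumption about the shape of the data inside each bin.

```python
import random

def sample_from_histogram(edges, counts, n, rng=random):
    """Draw n values whose distribution follows a histogram.

    edges  -- ascending bin boundaries, len(counts) + 1 of them
    counts -- non-negative bin counts (at least one must be positive)
    """
    # Choose a bin index for each sample, weighted by the bin counts.
    bins = rng.choices(range(len(counts)), weights=counts, k=n)
    # Assumption: values are spread uniformly within each bin.
    return [rng.uniform(edges[i], edges[i + 1]) for i in bins]
```

This reproduces only the overall shape of the histogram. Each draw is independent, so any dependence between successive values in the real data is not preserved.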