Probability time series: probability of observed data (déjà vu)

Posted on 2024-07-25 05:58:35

Okay folks... thanks for looking at this question. I remember doing the following in college, but I have forgotten the exact solution. Any takers to steer me in the right direction?

I have N time series of data (we'll use three). The series are aligned in time (e.g. obsOne[1] occurred along with obsTwo[1] and obsThree[1]).

obsOne[47, 136, -108, -15, 22, ...], obsTwo[448, 321, 122, -207, 269, ...], obsThree[381, 283, 429, -393, 242, ...]

Step 2. From the data series I create X range bins of width Z for each data series (e.g. for obsOne: bin1 = [< -108, -108], bin2 = [-108, -26], bin3 = [-26, 55], ... binX = [136, > 136]).
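The binning in Step 2 could be sketched roughly as follows. This is only an illustration: `make_edges` and `bin_index` are hypothetical helper names, and it assumes equal-width inner bins between the series minimum and maximum plus one open-ended bin on each side, as in the obsOne example. Bin numbers here are zero-based (0 is the "below minimum" bin).

```python
from bisect import bisect_right

def make_edges(series, n_inner):
    """Return n_inner + 1 equally spaced edges from min(series) to max(series)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_inner
    # Use hi itself as the last edge to avoid floating-point drift.
    return [lo + i * width for i in range(n_inner)] + [hi]

def bin_index(edges, value):
    """0 = below the first edge, len(edges) = at or above the last edge."""
    return bisect_right(edges, value)

obs_one = [47, 136, -108, -15, 22]
edges_one = make_edges(obs_one, 3)   # edges at -108, about -26.7, about 54.7, 136
indices_one = [bin_index(edges_one, v) for v in obs_one]
```

With this convention the minimum lands in the first inner bin and the maximum lands in the open-ended top bin, so the "below minimum" bin stays empty for the data used to build the edges.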

Step 3. Now create a table with all possible combinations of bins across the data series. Thus if I had 4 bins and 3 data series, all combinations would total 4x4x4 = 64 possible outcomes (e.g. row1 = obsOne bin1 + obsTwo bin1 + obsThree bin1, row2 = obsOne bin1 + obsTwo bin1 + obsThree bin2, ... row5 = obsOne bin1 + obsTwo bin1 + obsThree binX, row6 = obsOne bin1 + obsTwo bin2 + obsThree bin1, row7 = obsOne bin1 + obsTwo bin2 + obsThree bin2, ... row10 = obsOne bin1 + obsTwo bin2 + obsThree binX, ...).
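The table in Step 3 is the Cartesian product of the bin numbers, which `itertools.product` can enumerate. The sketch below assumes 4 bins per series and 3 series; `row_number` is a hypothetical helper that computes a row's position directly from its bin numbers, so the full table never has to be searched (or even built).

```python
from itertools import product

n_bins, n_series = 4, 3

# Every combination of one bin (numbered 1..n_bins) per series,
# with the last series varying fastest.
table = list(product(range(1, n_bins + 1), repeat=n_series))

def row_number(bins, n_bins):
    """1-based row of a bin combination, treating it as a base-n_bins number."""
    row = 0
    for b in bins:
        row = row * n_bins + (b - 1)
    return row + 1
```

For example, `row_number((1, 2, 1), 4)` gives 5, which matches the position of `(1, 2, 1)` in `table`.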

Step 4. I now go back to the data series, find where each row of the data series falls on the table, and count how many times an observation does so (e.g. obsOne[2] obsTwo[2] obsThree[2] = row 30 on table, obsOne[X] obsTwo[X] obsThree[X] = row 52 on table).

Step 5. I then take only the rows on the table with positive matches, count how many observations fell on each row, and divide by the total number of observations in the data series. That gives me the probability for that range on the observed data.
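Steps 4 and 5 together amount to an empirical joint probability: count the observations per occupied bin combination and divide by the number of observations. A minimal sketch for any number of series, assuming equal-width bins between each series' minimum and maximum (zero-based, no open-ended outer bins) and that no series is constant; `joint_bin_probabilities` is a hypothetical name:

```python
from collections import Counter

def joint_bin_probabilities(series_list, n_bins):
    """Empirical probability of each occupied bin combination."""
    def indexer(series):
        lo, hi = min(series), max(series)
        width = (hi - lo) / n_bins  # assumes hi > lo
        # Clamp so the maximum value falls in the last bin.
        return lambda v: min(int((v - lo) // width), n_bins - 1)

    indexers = [indexer(s) for s in series_list]
    total = len(series_list[0])
    # zip(*series_list) yields one tuple of simultaneous observations per time step.
    counts = Counter(
        tuple(ix(v) for ix, v in zip(indexers, row))
        for row in zip(*series_list)
    )
    return {cell: n / total for cell, n in counts.items()}

obs_one = [47, 136, -108, -15, 22]
obs_two = [448, 321, 122, -207, 269]
obs_three = [381, 283, 429, -393, 242]
probs = joint_bin_probabilities([obs_one, obs_two, obs_three], 4)
```

Only combinations that actually occur appear in the result, so the 64-row table is never materialized.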

I apologize for this basic question; I am not a math expert. I did this many years ago but forgot which method I used. It was much faster than this long (ancient "by hand") method. I wasn't using Python at the time; it was some proprietary package in C++. I'd like to see if there is something that can solve this problem in Python (we are now a Python shop). We could always extend it, so Python is a soft constraint.

ぽ尐不点ル 2024-08-01 05:58:35

Are you talking about something like this?

from collections import defaultdict

obsOne = [47, 136, -108, -15, 22]
obsTwo = [448, 321, 122, -207, 269]
obsThree = [381, 283, 429, -393, 242]

class BinParams(object):
    """Equal-width binning of one series into X bins between its min and max."""
    def __init__(self, timeSeries, X):
        self.X = X
        self.mx = max(timeSeries)
        self.mn = min(timeSeries)
        self.Z = (self.mx - self.mn) / X  # bin width
    def index(self, sample):
        # Clamp so the maximum sample lands in the last bin (X - 1), not bin X.
        return min(int((sample - self.mn) // self.Z), self.X - 1)

binsOne = BinParams(obsOne, 4)
binsTwo = BinParams(obsTwo, 4)
binsThree = BinParams(obsThree, 4)

# Count how many time steps fall into each (bin, bin, bin) combination.
counts = defaultdict(int)
for s1, s2, s3 in zip(obsOne, obsTwo, obsThree):
    posn = binsOne.index(s1), binsTwo.index(s2), binsThree.index(s3)
    counts[posn] += 1

# Probability of a combination = its count / total number of observations.
total = len(obsOne)
for k in counts:
    print(k, counts[k], counts[k] / total)
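For larger data sets, NumPy's `numpy.histogramdd` can do the binning and counting in one call. The sketch below assumes NumPy is installed and that equal-width bins between each series' minimum and maximum are acceptable; it is one possible route, not necessarily the method the question remembers.

```python
import numpy as np

obs_one = [47, 136, -108, -15, 22]
obs_two = [448, 321, 122, -207, 269]
obs_three = [381, 283, 429, -393, 242]

# One row per time step, one column per series: shape (5, 3).
sample = np.column_stack([obs_one, obs_two, obs_three])

# H[i, j, k] is the number of observations in bin i of series one,
# bin j of series two and bin k of series three.
H, edges = np.histogramdd(sample, bins=4)

probs = H / len(sample)

# Report only the occupied cells.
for cell in zip(*np.nonzero(H)):
    print(tuple(int(c) for c in cell), probs[cell])
```

`edges` holds the bin boundaries for each series, so a cell index can be mapped back to the value range it covers.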
