Gesture recognition with Kinect and Python: HMM learning

Posted on 2024-12-21 08:29:11


I want to do gesture recognition in Python with a Kinect.

After reading up on some theory, I think one of the best methods is unsupervised learning with Hidden Markov Models (HMMs) (Baum-Welch or some other EM method) on some known gesture data, to obtain a set of trained HMMs (one for each gesture that I want to recognize).

I would then do the recognition by matching the maximum log-likelihood (with Viterbi?) of the observed data against the HMMs in the trained set.

For example, I have data (x, y, z coordinates of the right hand) recorded with the Kinect device for some gestures (saying hello, throwing a punch, drawing a circle with the hand), and I do some training:

# training
known_datas = [
    (load_data('punch.mat'),                'PUNCH'),
    (load_data('say_hello.mat'),            'HELLO'),
    (load_data('do_circle_with_hands.mat'), 'CIRCLE'),
]

gestures = []  # ordered (name, model) pairs, so we can index into it later
for x, name in known_datas:
    m = HMM()
    m.baumWelch(x)  # fit this gesture's HMM with Baum-Welch (EM)
    gestures.append((name, m))

Then I perform recognition on newly observed data: I compute the log-likelihood under each trained HMM and choose the previously saved gesture whose model has the maximum log-likelihood:

# recognition
observed = load_data('new_data.mat')
logliks = [m.viterbi(observed) for name, m in gestures]

# pick the gesture whose model gave the highest score
print('observed data is', gestures[logliks.index(max(logliks))][0])
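For reference, the same pipeline could be written against a real HMM library. Below is a minimal sketch assuming hmmlearn's GaussianHMM; the library choice, the number of states, and the helper names are illustrative assumptions, not part of the original post. Note that score() returns the forward-algorithm log-likelihood rather than a Viterbi score:

# minimal sketch assuming hmmlearn (pip install hmmlearn); n_states and the
# helper names are illustrative assumptions, not from the original post
import numpy as np
from hmmlearn import hmm

def train_models(files_and_names, n_states=5):
    models = []
    for path, name in files_and_names:
        # load_data is the poster's own .mat-loading helper;
        # expected shape: (n_frames, 3) x/y/z samples
        X = load_data(path)
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type='diag', n_iter=100)
        m.fit(X)  # Baum-Welch (EM) training
        models.append((name, m))
    return models

def recognize(models, X):
    # score() is the forward-algorithm log-likelihood log P(X | model)
    logliks = [m.score(X) for _, m in models]
    return models[int(np.argmax(logliks))][0]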

My questions are:

  • Is this something totally stupid?
  • How many training samples are needed for a real case?
  • How many states for each HMM?
  • Is it possible to do it in real time?


Comments (2)

零度° 2024-12-28 08:29:11


First of all: this is a very specialized question; you'll need a machine learning expert here. Unfortunately there's no ML equivalent among the Stack Exchange sites yet ... maybe there'll be one some day. :)

I guess your approach is valid, just some remarks:

  • The HMM class which you just instantiate with HMM() needs to be crafted so that the HMM's structure can represent something like a gesture. HMMs have states and transitions between them, so how would you define an HMM for a gesture? I'm positive that this is possible (and even think it's a good approach), but it requires some thinking. Maybe the states are just the corners of a 3D cube, and for each observed point of your gesture you pick the closest corner of this cube; the Baum-Welch algorithm can then approximate the transition likelihoods from your training data. You may need a more fine-grained state model, though, perhaps an n * n * n voxel grid (see the quantization sketch after this list).

  • The Viterbi algorithm gives you not the likelihood of a model but the most likely sequence of states for a given observation sequence. IIRC you would pick the forward algorithm to get the probability of a given observation sequence under a given model (a sketch follows at the end of this answer).
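To make the first remark concrete, here is one way the quantization could look — a sketch in which the grid size and workspace bounds are assumed values, not recommendations:

# sketch: map (x, y, z) hand positions onto an n*n*n voxel grid so that a
# continuous Kinect trajectory becomes a discrete observation sequence;
# n, lo and hi are assumed values, not recommendations
import numpy as np

def quantize(points, n=4, lo=-1.0, hi=1.0):
    points = np.asarray(points, dtype=float)
    # bin each coordinate into one of n cells inside [lo, hi)
    idx = np.clip(((points - lo) / (hi - lo) * n).astype(int), 0, n - 1)
    # flatten the 3D cell index into a single symbol in [0, n**3)
    return idx[:, 0] * n * n + idx[:, 1] * n + idx[:, 2]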

I assume that, given a well-trained and not too complex HMM, you should be able to recognize gestures in real-time, but that's just an educated guess. :)
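And to make the second remark concrete, here is the textbook scaled forward recursion for a discrete HMM — a generic sketch, not any particular library's API:

# sketch: scaled forward algorithm for a discrete HMM
# pi: (N,) initial distribution, A: (N, N) transition matrix,
# B: (N, M) emission matrix, obs: integer symbol sequence
import numpy as np

def forward_loglik(pi, A, B, obs):
    alpha = pi * B[:, obs[0]]           # alpha_1(i) = pi_i * b_i(o_1)
    c = alpha.sum()                     # scale factor for numeric stability
    alpha /= c
    loglik = np.log(c)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # recursion: transition, then emit
        c = alpha.sum()
        alpha /= c
        loglik += np.log(c)             # log P(obs) is the sum of log scales
    return loglik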

梦萦几度 2024-12-28 08:29:11


HMM-based gesture recognition has already been applied successfully in many variations: http://scholar.google.co.il/scholar?hl=en&q=HMM+Gesture+Recognition.

Remarks:
