What methods/algorithms are used for gesture recognition in a multi-touch environment?

Posted on 2024-07-16 07:33:30

In a multi-touch environment, how does gesture recognition work? What mathematical methods or algorithms are utilized to recognize or reject data for possible gestures?

I've created some retro-reflective gloves and an IR LED array, coupled with a Wii remote. The Wii remote does internal blob detection, tracks 4 points of IR light, and transmits this information to my computer via a Bluetooth dongle.

This is based on Johnny Chung Lee's Wii research. My setup is exactly like that of the graduate students from the Netherlands shown here. I can easily track the positions of 4 points in 2D space, and I've written some basic software to receive and visualize these points.

[Three images of the setup were shown here.]

The Netherlands students have gotten a lot of functionality out of their basic pinch-click recognition. I'd like to take it a step further if I could, and implement some other gestures.

How is gesture recognition usually implemented? Beyond the trivial cases, how could I write software to recognize and identify a variety of gestures: various swipes, circular movements, letter tracing, etc.?

Comments (6)

泛泛之交 2024-07-23 07:33:30

Gesture recognition, as I've seen it anyway, is usually implemented using machine learning techniques similar to those in image recognition software. Here's a cool project on CodeProject about doing mouse gesture recognition in C#. I'm sure the concepts are quite similar, since you can likely reduce the problem to 2D space. If you get something working with this, I'd love to see it. Great project idea!

樱花坊 2024-07-23 07:33:30

One way to look at it is as a compression / recognition problem. Basically, you want to take a whole bunch of data, throw out most of it, and categorize the remainder. If I were doing this (from scratch) I'd probably proceed as follows:

  • work with a rolling history window
  • take the center of gravity of the four points in the start frame, save it, and subtract it out of all the positions in all frames.
  • factor each frame into two components: the shape of the constellation and the movement of its CofG relative to the last frame's.
  • save the absolute CofG for the last frame too
  • the series of CofG changes gives you swipes, waves, etc.
  • the series of constellation morphing gives you pinches, etc.
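The steps above can be sketched roughly as follows. This is only my reading of the idea, not a tested recognizer: the function names (`centroid`, `decompose`, `spread`, `net_motion`), the window length of 30 frames, and the sample data are all made up for illustration. Each frame is assumed to be a list of `(x, y)` tuples.

```python
import math
from collections import deque

def centroid(points):
    """Center of gravity (CofG) of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def decompose(frames):
    """Split each frame into a shape (points relative to that frame's CofG)
    and a move (change of the CofG since the previous frame)."""
    shapes, moves = [], []
    prev = None
    for frame in frames:
        cx, cy = centroid(frame)
        shapes.append([(x - cx, y - cy) for x, y in frame])
        if prev is not None:
            moves.append((cx - prev[0], cy - prev[1]))
        prev = (cx, cy)
    return shapes, moves

def spread(shape):
    """Mean distance of the points from their CofG; it shrinks during a pinch."""
    return sum(math.hypot(x, y) for x, y in shape) / len(shape)

def net_motion(moves):
    """Total CofG displacement over the window; large values suggest a swipe."""
    return (sum(dx for dx, _ in moves), sum(dy for _, dy in moves))

# Rolling history window: old frames fall off the front automatically.
# The length (30 frames) is an arbitrary assumption.
history = deque(maxlen=30)

# Hypothetical data: a square of four points sliding right by 8 units per frame.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
for t in range(5):
    history.append([(x + 8 * t, y) for x, y in square])

shapes, moves = decompose(history)
print("net motion:", net_motion(moves))
print("spread first/last:", spread(shapes[0]), spread(shapes[-1]))
```

A swipe shows up as a large `net_motion` with a roughly constant `spread`; a pinch is the opposite, with little net motion and a shrinking `spread`.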

After seeing your photo (two points on each hand, not four points on one, doh!) I'd modify the above as follows:

  • Do the CofG calculation on pairs, with the caveats that:
    • If there are four points visible, pairs are chosen to minimize the product of the intrapair distances
    • If there are three points visible, the closest two are one pair, the other one is the other
    • Use prior / following frames to override when needed
  • Instead of a constellation, you've got a nested structure of distance / orientation pairs (i.e., one D/O between the hands, and one more for each hand).
  • Pass the full reduced data to recognizers for each gesture, and let them sort out what they care about.
  • If you want to get cute, do a little DSL to recognize the patterns, and write things like:

    fire when
        in frame.final: rectangle(points) 
      and
        over frames.final(5): points.all (p => p.jerk)
    

    or

    fire when
        over frames.final(3): hands.all (h => h.click)
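The pairing rule described above (minimize the product of the intra-pair distances when four points are visible, closest two when three are visible) could look something like this. It is a sketch under my own assumptions: `pair_points` and `distance_orientation` are hypothetical names, points are `(x, y)` tuples, and the override from prior or following frames is left out.

```python
import math
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def pair_points(points):
    """Group 3 or 4 visible points into two 'hands'.

    4 points: pick the pairing with the smallest product of intra-pair distances.
    3 points: the closest two form one hand, the remaining point is the other.
    """
    if len(points) == 4:
        a, b, c, d = points
        candidates = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
        return min(candidates, key=lambda pr: dist(*pr[0]) * dist(*pr[1]))
    if len(points) == 3:
        i, j = min(combinations(range(3), 2),
                   key=lambda ij: dist(points[ij[0]], points[ij[1]]))
        k = 3 - i - j  # index of the leftover point
        return ((points[i], points[j]), (points[k],))
    raise ValueError("expected 3 or 4 visible points")

def distance_orientation(p, q):
    """Distance and orientation (degrees) of the segment from p to q."""
    return dist(p, q), math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

# Hypothetical frame: two fingers near x=0 and two near x=100.
hands = pair_points([(0, 0), (100, 0), (5, 0), (105, 0)])
print(hands)
print(distance_orientation(*hands[0]))
```

Each gesture recognizer would then receive the distance/orientation values per hand (and between the hands) rather than the raw points.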
    
疧_╮線 2024-07-23 07:33:30

Here is a video of what has been done with this sort of technology, if anyone is interested:

Pattie Maes demos the Sixth Sense - TED 2009

清晰传感 2024-07-23 07:33:30

Most simple gesture-recognition tools I've looked at use a vector-based template to recognize gestures, i.e. a sequence of stroke directions in degrees. For example, you can define a right-swipe as "0", a checkmark as "-45, 45, 45", a clockwise circle as "0, -45, -90, -135, 180, 135, 90, 45, 0", and so on.
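A minimal sketch of that kind of template matching, using the three templates above. Several things here are my assumptions rather than how any particular tool works: angles are measured with y pointing up, each movement is snapped to the nearest 45 degrees, consecutive repeats are collapsed before comparing (so "-45, 45, 45" is treated as "-45, 45"), and only exact matches count. `min_move` is a made-up jitter threshold.

```python
import math

TEMPLATES = {
    "right-swipe": [0],
    "checkmark": [-45, 45, 45],
    "clockwise-circle": [0, -45, -90, -135, 180, 135, 90, 45, 0],
}

def quantize(dx, dy, step=45):
    """Snap a movement vector to the nearest multiple of `step` degrees,
    in the range (-180, 180]."""
    angle = math.degrees(math.atan2(dy, dx))
    q = int(round(angle / step)) * step
    return 180 if q == -180 else q

def collapse(seq):
    """Drop consecutive duplicates: [0, 0, -45, -45] -> [0, -45]."""
    out = []
    for a in seq:
        if not out or out[-1] != a:
            out.append(a)
    return out

def directions(points, min_move=5.0):
    """Turn a tracked path into a collapsed list of quantized directions.
    Movements shorter than min_move are ignored as jitter."""
    dirs = []
    last = points[0]
    for p in points[1:]:
        dx, dy = p[0] - last[0], p[1] - last[1]
        if math.hypot(dx, dy) >= min_move:
            dirs.append(quantize(dx, dy))
            last = p
    return collapse(dirs)

def classify(points):
    """Return the name of the first template matching the path, or None."""
    seen = directions(points)
    for name, template in TEMPLATES.items():
        if collapse(template) == seen:
            return name
    return None

# Hypothetical paths (y pointing up).
print(classify([(0, 0), (10, 0), (20, 1), (30, 0)]))
print(classify([(0, 0), (10, -10), (20, -20), (30, -10), (40, 0), (50, 10)]))
```

Exact matching is brittle with real input; comparing the sequences with an edit distance and accepting the closest template under some threshold would be a natural next step.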

九歌凝 2024-07-23 07:33:30

Err.. I've been working on gesture recognition for the past year or so now, but I don't want to say too much because I'm trying to patent my technology :) But... we've had some luck with adaptive boosting, although what you're doing looks fundamentally different. You only have 4 points of data to process, so I don't think you really need to "reduce" anything.

What I would investigate is how programs like Flash turn a freehand-drawn circle into an actual circle. It seems like you could track the points for a duration of about a second, "smooth" the path in some fashion, and then probably get away with hardcoding your gestures (if you make them simple enough). Otherwise, yes, you're going to want to use a learning algorithm. Neural nets might work... I don't know. Just tossing out ideas :) Maybe look at how OCR is done too... or even Hough transforms. It looks to me like this is a problem of recognizing shapes more than it is of recognizing gestures.
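To make the "smooth the path, then hardcode a shape test" idea concrete, here is a rough sketch. I don't know how Flash actually does it; this just uses a moving average for smoothing and a crude circle check (how much the distance to the centroid varies). Taking the centroid as the circle's center only works if the path covers the whole circle fairly evenly. The names and the `tolerance` value are arbitrary assumptions.

```python
import math

def smooth(points, window=5):
    """Moving-average smoothing of a path; the window is truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(points)):
        chunk = points[max(0, i - half):min(len(points), i + half + 1)]
        out.append((sum(p[0] for p in chunk) / len(chunk),
                    sum(p[1] for p in chunk) / len(chunk)))
    return out

def fit_circle(points):
    """Crude circle estimate: center = centroid, radius = mean distance to it.
    Also returns the relative spread of the radii (0 for a perfect circle)."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    radii = [math.hypot(p[0] - cx, p[1] - cy) for p in points]
    r = sum(radii) / len(radii)
    spread = math.sqrt(sum((ri - r) ** 2 for ri in radii) / len(radii)) / r
    return cx, cy, r, spread

def looks_like_circle(points, tolerance=0.15):
    """True if the smoothed path stays close to a single radius."""
    *_, spread = fit_circle(smooth(points))
    return spread < tolerance

# Hypothetical wobbly circle of radius ~20 around (50, 50), and a straight line.
wobbly = [(50 + (20 + 1.5 * math.sin(7 * t)) * math.cos(t),
           50 + (20 + 1.5 * math.sin(7 * t)) * math.sin(t))
          for t in (2 * math.pi * k / 60 for k in range(60))]
line = [(i, 2 * i) for i in range(60)]

print(looks_like_circle(wobbly), looks_like_circle(line))
```

If the test passes, the fitted center and radius can replace the raw path, which is roughly the "freehand circle becomes an actual circle" effect.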

岁月流歌 2024-07-23 07:33:30

I'm not very well versed in this type of mathematics, but I have read somewhere that people sometimes use Markov Chains or Hidden Markov Models to do Gesture Recognition.

Perhaps someone with a little more background in this side of Computer Science can illuminate it further and provide some more details.
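For a rough idea of how a Hidden Markov Model could be used here: give each gesture its own small HMM, turn the tracked path into a sequence of symbols (for example quantized directions), and pick the gesture whose model assigns the sequence the highest likelihood. The sketch below computes that likelihood with the forward algorithm. Everything specific in it is assumed for illustration: the four symbols, the two gesture models, and all the probabilities are hand-set, whereas a real system would normally learn them from recorded examples.

```python
# Observation symbols: quantized movement directions.
R, U, L, D = 0, 1, 2, 3

def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) computed with the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of observing symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

# Hand-set toy models (illustrative numbers only).
MODELS = {
    # One state that mostly emits "right".
    "swipe_right": dict(
        start=[1.0],
        trans=[[1.0]],
        emit=[[0.85, 0.05, 0.05, 0.05]],
    ),
    # Two left-to-right states: mostly "down", then mostly "right".
    "down_then_right": dict(
        start=[1.0, 0.0],
        trans=[[0.7, 0.3], [0.0, 1.0]],
        emit=[[0.05, 0.05, 0.05, 0.85], [0.85, 0.05, 0.05, 0.05]],
    ),
}

def classify(obs):
    """Name of the model under which the observed sequence is most likely."""
    return max(MODELS, key=lambda name: forward_likelihood(obs, **MODELS[name]))

print(classify([R, R, R, R]))
print(classify([D, D, D, R, R, R]))
```

For long sequences the raw probabilities underflow, so real implementations work with log-probabilities or rescale `alpha` at each step. A "reject" option can be added by requiring the best likelihood to exceed a threshold.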
