What methods/algorithms are used for gesture recognition in a multi-touch environment?

Posted 2024-07-16 07:33:30

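In a multi-touch environment, how does gesture recognition work? What mathematical methods or algorithms are used to recognize or reject data for possible gestures?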

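I've made some retro-reflective gloves and an IR LED array, coupled with a Wii remote. The Wii remote does internal blob detection, tracks 4 points of IR light, and transmits this information to my computer via a Bluetooth dongle.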

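This is based on Johnny Chung Lee's Wii research. My setup is exactly the same as the one shown here by graduate students from the Netherlands. I can easily track the positions of 4 points in 2D space, and I've written basic software to receive and visualize these points.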

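The Netherlands students have gotten a lot of functionality out of their basic pinch-click recognition. I'd like to take it a step further if I can, and implement some other gestures.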

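How is gesture recognition usually implemented? Beyond the trivial cases, how could I write software to recognize and identify a variety of gestures: various swipes, circular movements, letter tracing, etc.?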

泛泛之交 2024-07-23 07:33:30

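For what it's worth, as I see it, gesture recognition is usually implemented using machine learning techniques similar to those in image recognition software. Here is a cool project about doing mouse gesture recognition in C#. I'm sure the concepts are quite similar, since you can likely reduce the problem to 2D space. If you get something working with this, I'd love to see it. Great project idea!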

樱花坊 2024-07-23 07:33:30

One way to look at it is as a compression / recognition problem. Basically, you want to take a whole bunch of data, throw out most of it, and categorize the remainder. If I were doing this (from scratch) I'd probably proceed as follows:

work with a rolling history window
take the center of gravity of the four points in the start frame, save it, and subtract it out of all the positions in all frames.
factor each frame into two components: the shape of the constellation and the movement of its CofG relative to the last frame's.
save the absolute CofG for the last frame too
the series of CofG changes gives you swipes, waves, etc.
the series of constellation morphing gives you pinches, etc.
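
A rough Python sketch of the steps above, not a tested implementation. The names (`decompose`, `classify`), the thresholds, the normalized coordinate units, and the y-up axis are all assumptions for illustration; "shape" is reduced here to a single number, the mean distance of the points from their CofG.

```python
import math

def centroid(points):
    """Center of gravity (CofG) of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def decompose(frames):
    """Return per-frame CofG movements and constellation spreads."""
    origin = centroid(frames[0])      # CofG of the start frame, subtracted everywhere
    prev = (0.0, 0.0)                 # start-frame CofG after subtracting origin
    motions, spreads = [], []
    for frame in frames:
        shifted = [(x - origin[0], y - origin[1]) for x, y in frame]
        c = centroid(shifted)
        motions.append((c[0] - prev[0], c[1] - prev[1]))
        # Mean distance from the CofG: a crude stand-in for "shape".
        spreads.append(sum(math.hypot(x - c[0], y - c[1]) for x, y in shifted) / len(shifted))
        prev = c
    return motions, spreads

def classify(frames, move_threshold=0.2, pinch_ratio=0.6):
    """Hypothetical thresholds: net CofG travel for swipes, spread shrink for a pinch."""
    motions, spreads = decompose(frames)
    dx = sum(m[0] for m in motions)
    dy = sum(m[1] for m in motions)
    if math.hypot(dx, dy) >= move_threshold:
        if abs(dx) >= abs(dy):
            return "swipe_right" if dx > 0 else "swipe_left"
        return "swipe_up" if dy > 0 else "swipe_down"   # assumes y grows upward
    if spreads[0] > 0 and spreads[-1] / spreads[0] <= pinch_ratio:
        return "pinch"
    return None
```

Calling `classify` on the frames currently in the rolling window would return a gesture name or `None`; a real recognizer would likely look at the whole series of movements rather than just their sum.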

After seeing your photo (two points on each hand, not four points on one, doh!) I'd modify the above as follows:

Do the CofG calculation on pairs, with the caveats that:
- If there are four points visible, pairs are chosen to minimize the product of the intrapair distances
- If there are three points visible, the closest two are one pair, the other one is the other
- Use prior / following frames to override when needed
Instead of a constellation, you've got a nested structure of distance / orientation pairs (i.e., one D/O between the hands, and one more for each hand).
Pass the full reduced data to recognizers for each gesture, and let them sort out what they care about.
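
The pairing rules and the nested distance / orientation structure could look roughly like this in Python. This is only a sketch: the function names and the dictionary layout are made up for illustration, and the "use prior / following frames to override" rule is not implemented.

```python
import math
from itertools import combinations

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def pair_points(points):
    """Group 3 or 4 visible points into two hands."""
    if len(points) == 4:
        first = points[0]
        best = None
        # Try each possible partner for the first point; the other two form the second pair.
        for i in range(1, 4):
            rest = [points[j] for j in range(1, 4) if j != i]
            cost = dist(first, points[i]) * dist(rest[0], rest[1])
            if best is None or cost < best[0]:
                best = (cost, [(first, points[i]), (rest[0], rest[1])])
        return best[1]
    if len(points) == 3:
        # The closest two are one hand, the remaining point is the other.
        i, j = min(combinations(range(3), 2),
                   key=lambda ij: dist(points[ij[0]], points[ij[1]]))
        k = 3 - i - j
        return [(points[i], points[j]), (points[k],)]
    raise ValueError("expected 3 or 4 visible points")

def distance_orientation(a, b):
    """Distance and orientation (degrees) of the vector from a to b."""
    return dist(a, b), math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))

def center(group):
    return (sum(p[0] for p in group) / len(group), sum(p[1] for p in group) / len(group))

def describe(points):
    """Nested D/O structure: one between the hands, one per hand (None if a point is hidden)."""
    hands = pair_points(points)
    return {
        "between_hands": distance_orientation(center(hands[0]), center(hands[1])),
        "hands": [distance_orientation(*h) if len(h) == 2 else None for h in hands],
    }
```

The result of `describe` for each frame is the kind of reduced data that could then be handed to the individual gesture recognizers.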

If you want to get cute, do a little DSL to recognize the patterns, and write things like:

fire when
    in frame.final: rectangle(points) 
  and
    over frames.final(5): points.all (p => p.jerk)

fire when
    over frames.final(3): hands.all (h => h.click)
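
Short of a real DSL, the same idea can be approximated with plain predicates over a rolling window. The sketch below is in Python and everything in it is hypothetical: the `GestureEngine` class, the `over_last` and `both` helpers, and the frame layout (a dict with a `"gaps"` list holding the finger gap of each hand), with a "click" assumed to mean a gap below some threshold.

```python
from collections import deque

class GestureEngine:
    """Keeps a rolling history of frames and fires the rules that match it."""

    def __init__(self, history=30):
        self.frames = deque(maxlen=history)   # old frames drop off automatically
        self.rules = []                       # (name, predicate over the history list)

    def rule(self, name, predicate):
        self.rules.append((name, predicate))

    def push(self, frame):
        self.frames.append(frame)
        history = list(self.frames)
        return [name for name, predicate in self.rules if predicate(history)]

def over_last(n, frame_predicate):
    """True when at least n frames exist and the last n all satisfy frame_predicate."""
    return lambda history: len(history) >= n and all(frame_predicate(f) for f in history[-n:])

def both(a, b):
    """Logical 'and' of two history predicates."""
    return lambda history: a(history) and b(history)

# Assumed meaning of "click": the two points of a hand are closer than 0.05 units.
def all_hands_click(frame):
    return all(gap < 0.05 for gap in frame["gaps"])

engine = GestureEngine()
engine.rule("two_hand_click", over_last(3, all_hands_click))
```

Each call to `engine.push(frame)` returns the names of the rules that fire on the current window, which roughly mirrors the second "fire when" rule above.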

疧_╮線 2024-07-23 07:33:30

A video of what has been done with this sort of technology, if anyone is interested:

Pattie Maes demos the Sixth Sense - TED 2009

清晰传感 2024-07-23 07:33:30

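Most simple gesture-recognition tools I've looked at use a vector-based template to recognize gestures. For example, you can define a right swipe as "0", a checkmark as "-45, 45, 45", a clockwise circle as "0, -45, -90, -135, 180, 135, 90, 45, 0", and so on.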
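
As a rough illustration of that idea in Python: cut the stroke into as many equal-length segments as the template has entries, take the heading of each segment, and compare headings. This is just one possible reading of the encoding, not how any particular tool does it. The function names, the 30-degree tolerance, and the axis convention (x grows right, y grows up) are assumptions.

```python
import math

# One heading in degrees per equal-length segment of the stroke.
TEMPLATES = {
    "swipe_right": [0],
    "checkmark": [-45, 45, 45],
    "circle_cw": [0, -45, -90, -135, 180, 135, 90, 45, 0],
}

def seg_lengths(points):
    return [math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(points, points[1:])]

def resample(points, n):
    """Return n + 1 points evenly spaced by arc length along the stroke."""
    seg = seg_lengths(points)
    total = sum(seg)
    if total == 0:
        raise ValueError("stroke has no length")
    out, walked, i = [points[0]], 0.0, 0
    for k in range(1, n):
        target = total * k / n
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        f = (target - walked) / seg[i] if seg[i] else 0.0
        a, b = points[i], points[i + 1]
        out.append((a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1])))
    out.append(points[-1])
    return out

def angle_error(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def score(points, template):
    """Mean heading error against a template; lower is better."""
    n = len(template)
    pts = resample(points, n)
    step = sum(seg_lengths(points)) / n
    total = 0.0
    for (a, b), want in zip(zip(pts, pts[1:]), template):
        dx, dy = b[0] - a[0], b[1] - a[1]
        if math.hypot(dx, dy) < 0.5 * step:
            total += 180.0   # segment curls back on itself, so it has no clear heading
        else:
            total += angle_error(math.degrees(math.atan2(dy, dx)), want)
    return total / n

def recognize(points, tolerance=30.0):
    best = min(TEMPLATES, key=lambda name: score(points, TEMPLATES[name]))
    return best if score(points, TEMPLATES[best]) <= tolerance else None
```

Here `recognize` takes the list of (x, y) positions of one tracked point and returns the best-matching template name, or `None` if nothing is within the tolerance.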

九歌凝 2024-07-23 07:33:30

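Err... I've been working on gesture recognition for the past year or so, but I don't want to say too much because I'm trying to patent my technology :) But... we've had some luck with adaptive boosting, although what you're doing looks fundamentally different. You only have 4 points of data to process, so I don't think you really need to "reduce" anything.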

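What I would investigate is how programs like Flash turn a freehand-drawn circle into an actual circle. It seems like you could track the points for about a second, then "smooth" the path in some way, and then you could probably get away with hardcoding the gestures (if you make them simple enough). Otherwise, yes, you're going to want to use a learning algorithm. Neural nets might do the trick... I don't know. Just tossing out ideas :) Maybe take a look at how OCR is done too... or even Hough transforms (http://en.wikipedia.org/wiki/Hough_transform). It looks to me like this is more a problem of recognizing shapes than of recognizing gestures.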
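
To make the "smooth, then hardcode" idea a bit more concrete, here is a small Python sketch. It is not a Hough transform and not how Flash does it; it swaps in a much simpler check: smooth the path with a moving average, then call it a circle if the points stay at a roughly constant distance from their centroid and sweep most of a full turn around it. The function names and both thresholds are assumptions.

```python
import math

def smooth(path, window=5):
    """Centered moving average over (x, y) points; the window shrinks near the ends."""
    half = window // 2
    out = []
    for i in range(len(path)):
        chunk = path[max(0, i - half): i + half + 1]
        out.append((sum(p[0] for p in chunk) / len(chunk),
                    sum(p[1] for p in chunk) / len(chunk)))
    return out

def looks_like_circle(path, max_radius_spread=0.2, min_turn_deg=300.0):
    """Hardcoded circle test on an already smoothed path."""
    cx = sum(p[0] for p in path) / len(path)
    cy = sum(p[1] for p in path) / len(path)
    radii = [math.hypot(x - cx, y - cy) for x, y in path]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0:
        return False
    # Relative standard deviation of the radius: small for a round path.
    spread = math.sqrt(sum((r - mean_r) ** 2 for r in radii) / len(radii)) / mean_r
    # Total angle swept around the centroid, unwrapping jumps across +/-180 degrees.
    angles = [math.atan2(y - cy, x - cx) for x, y in path]
    turned = 0.0
    for a, b in zip(angles, angles[1:]):
        turned += (b - a + math.pi) % (2 * math.pi) - math.pi
    return spread <= max_radius_spread and abs(math.degrees(turned)) >= min_turn_deg
```

In use, you would collect about a second of positions for one point and call `looks_like_circle(smooth(path))`; the sign of the swept angle could also be used to tell clockwise from counterclockwise.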
