Matching Kinect audio with video

Posted on 2024-11-18 07:51:10

I have a project dealing with video conferencing using the Kinect (or, more likely, four of them). Right now, my company uses these stupidly expensive cameras for our VTC rooms. The hope is that, by linking a few Kinects together, we can reduce the costs. The plan is to have four or five of them covering a 180 degree arc so the Kinects can see the entire room/table (still a lot cheaper than our current cameras!). The application would choose a video stream coming from a Kinect based on who at the table is talking. The plan is fine in theory, but I've run into a snag.

As far as I can tell, there is no way to tell which microphone array corresponds to which Kinect Runtime object. I can get an object representing each Kinect using:

Device device = new Device();
Runtime[] kinects = new Runtime[device.Count];
for( int i = 0; i < kinects.Length; i ++ )
    kinects[i] = new Runtime(i);

And every microphone array using:

var source = new KinectAudioSource();
// Enumerate every capture device the audio API reports.
IEnumerable<AudioDeviceInfo> devices = source.FindCaptureDevices();
foreach (AudioDeviceInfo device in devices)
{
    // One audio source per microphone array, selected by its device index.
    KinectAudioSource devSpecificSource = new KinectAudioSource();
    devSpecificSource.MicrophoneIndex = (short)device.DeviceIndex;
}

but I cannot find any way to know that Runtime A corresponds to KinectAudioSource B. This isn't a huge problem for the two Kinects I'm using (I'll just guess which is which, and switch them if they're wrong), but when we get up to four or five Kinects, I don't want to need to do any kind of calibration every time the application runs. I've considered assuming that the Runtime and KinectAudioSource objects will be in the same order (Runtime index 0 corresponds to the first AudioDeviceInfo in devices), but that seems risky.

So, the question: is there any way to match a Runtime object with its KinectAudioSource? If not, is it guaranteed that they will be in the correct order so I can match Runtime 0 with the first KinectAudioSource microphone index in devices?

UPDATE:
Finally slammed my face against WPF's single-threaded apartment requirement and the Kinect audio's multithreaded apartment requirement enough to get the two to behave together. The problem is that, as far as I can tell, the order of the Kinect Runtime objects and the order of the KinectAudioSources do not line up. I'm in a rather loud lab (I'm one of maybe 40 interns in the room), so it's hard to test, but I'm fairly certain that the order is switched for the two Kinects I have plugged in. I have two Runtime objects and two KinectAudioSource objects. When the first KinectAudioSource reports that a sound is coming from directly in front of it, I'm actually standing in front of the Kinect associated with the second Runtime object. So there's no guarantee that the orders of the two will line up. So now, to repeat the question: how do I match up a KinectAudioSource object with its Nui.Runtime object? Right now, I only have two Kinects hooked up, but since the goal is four or five, I need a concrete way to do this.

UPDATE 2:
Brought the two Kinects I have at work back home to play with. Three Kinects, one computer. Fun stuff (it was actually a pain to get them all installed at once, and one of the video feeds doesn't seem to be working, so I'm back to 2 for now). musefan's answer got me hoping that I had missed something in the AudioDeviceInfo objects that would shed some light on this problem, but no luck. I found an interesting-looking field in Runtime objects called NuiCamera.UniqueDeviceName, but I can't find any link between that and anything in AudioDeviceInfo.

Output from those fields, in the hopes Sherlock Holmes sees the thread and notices a connection:

Console.WriteLine("Nui{0}: {1}", i, nuis[i].NuiCamera.UniqueDeviceName);
//Nui0: USB\VID_0409&PID_005A\6&1F9D61BF&0&4
//Nui1: USB\VID_0409&PID_005A\6&356AC357&0&3

Console.WriteLine("AudioDeviceInfo{0}: {1}, {2}, {3}", audios.IndexOf(audio), audio.DeviceID, audio.DeviceIndex, audio.DeviceName);
//AudioDeviceInfo0: {0.0.1.00000000}.{1945437e-2d55-45e5-82ba-fc3021441b17}, 0, Microphone Array (Kinect USB Audio)
//AudioDeviceInfo1: {0.0.1.00000000}.{6002e98f-2429-459a-8e82-9810330a8e25}, 1, Microphone Array (2- Kinect USB Audio)

UPDATE 3:
I'm not looking for calibration techniques. I'm looking for a way to match the Kinect camera with its microphone array within the application at runtime, with no previous set up required. Please stop posting possible calibration techniques. The entire point of posting the question was to find a way to avoid needing the user to do set up.

UPDATE 4:
WMI definitely seems like the way to go. Unfortunately, I haven't had a lot of time to work on it, as I've been struggling just to get 3 Kinects to play nicely with each other. Something about USB hubs not being able to handle the bandwidth? I've informed my boss that there doesn't seem to be any easy way to connect 3+ Kinects to a regular computer without it blue-screening. I might still try to work on this in my free time, but as far as work goes, it's pretty much a dead end.

Thanks for the answers guys, sorry I couldn't post a working solution.

Comments (6)

少年亿悲伤 2024-11-25 07:51:10

The API provided by Microsoft Research doesn't actually provide this capability. A Kinect is essentially multiple cameras plus a microphone array, and each sensor has its own driver stack, so the API exposes no link back to the physical hardware device. The best way to achieve this would be to use the Windows API instead, by way of WMI: take the device IDs you get for the NUI camera and the microphones, and use WMI to find which USB bus they are attached to (each Kinect sensor has to be on its own bus). Then you'll know which device matches which. This will be an expensive operation, so I would recommend doing it at start-up, or when the devices are detected, and keeping the result persisted until you know the hardware configuration has changed or the application is reset. Using WMI through .NET is pretty well documented, but here is one article that specifically covers USB devices through WMI/.NET: http://www.developerfusion.com/article/84338/making-usb-c-friendly/
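The matching step this answer describes can be sketched independently of how the bus information is obtained. The snippet below is only an illustration: `DeviceNode`, `KinectPairing`, and the `ParentId` field are hypothetical names, not part of the Kinect SDK or of WMI, and it assumes you have already resolved each camera and microphone to the ID of the USB parent it hangs off (for example through WMI queries, which are not shown here).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record: one row per device node, however it was obtained.
// None of these names come from the Kinect SDK or from WMI.
public sealed class DeviceNode
{
    public string Kind;      // "camera" or "microphone" (labels chosen for this sketch)
    public string DeviceId;  // e.g. a device instance path or an audio endpoint ID
    public string ParentId;  // ID of the USB parent the node is attached to (assumed known)
}

public static class KinectPairing
{
    // Returns cameraId -> microphoneId for every parent that has exactly one of each.
    public static Dictionary<string, string> PairByParent(IEnumerable<DeviceNode> nodes)
    {
        var pairs = new Dictionary<string, string>();
        foreach (var group in nodes.GroupBy(n => n.ParentId))
        {
            var cameras = group.Where(n => n.Kind == "camera").ToList();
            var mics = group.Where(n => n.Kind == "microphone").ToList();

            // Ambiguous or incomplete groups are skipped rather than guessed.
            if (cameras.Count == 1 && mics.Count == 1)
                pairs[cameras[0].DeviceId] = mics[0].DeviceId;
        }
        return pairs;
    }
}
```

Since the lookup is described as expensive, the returned dictionary is the natural thing to cache until the hardware configuration changes.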

和我恋爱吧 2024-11-25 07:51:10

Mannimarco,

The only link I see is that a camera's UniqueDeviceName property equals its 'device instance path'.

After a little research in Device Manager on my computer, I can tell that the last two numbers at the end of the camera's UniqueDeviceName (0&3, 0&4) are incrementing values (based on controller + port?).

My suggestion is that you sort your list of cameras based on those last digits, and sort your audio devices on their DeviceID property. This way, I suppose, when you iterate over your camera list you can use the corresponding index in the audio device list to match the two together.
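A rough sketch of that sorting idea, purely for illustration: `SortHeuristic` and its methods are hypothetical names, the trailing segment is assumed to be a decimal number, and nothing here confirms that the two sort orders actually correspond on real hardware.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortHeuristic
{
    // Extracts the number after the last '&' in a device instance path,
    // e.g. @"USB\VID_0409&PID_005A\6&1F9D61BF&0&4" gives 4.
    // Assumes the trailing segment is decimal; int.Parse throws otherwise.
    public static int TrailingNumber(string instancePath)
    {
        int amp = instancePath.LastIndexOf('&');
        return int.Parse(instancePath.Substring(amp + 1));
    }

    // Sorts cameras by trailing number and audio devices by DeviceID (ordinal),
    // then pairs them position by position. Extra items on either side are dropped.
    public static List<KeyValuePair<string, string>> PairBySortedOrder(
        IEnumerable<string> cameraNames, IEnumerable<string> audioIds)
    {
        var cams = cameraNames.OrderBy(c => TrailingNumber(c)).ToList();
        var mics = audioIds.OrderBy(id => id, StringComparer.Ordinal).ToList();
        return cams.Zip(mics, (c, m) => new KeyValuePair<string, string>(c, m)).ToList();
    }
}
```

Fed the two UniqueDeviceName values and the two DeviceID values posted in the question, this pairs the camera ending in `&3` with the `{1945437e-...}` endpoint and the camera ending in `&4` with the `{6002e98f-...}` endpoint. Whether that pairing is physically right would still have to be checked by hand.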

Btw, this is my first post so please be gentle if I'm wrong...

酷遇一生 2024-11-25 07:51:10

I have had a look at the SDK documentation and, in all honesty, it is not great. Furthermore, I do not have any Kinect devices to test this on.

The first thing I would do, though, is create an output list of all useful property values for each device. Then I would look for values that match across the two lists and look like they could be used as links. For each one I find, I would test whether it does the job.

So I would have a simple console application to output the following property values:

For Each AudioDeviceInfo

  • DeviceID = X
  • DeviceIndex = X
  • DeviceName = X

For Each KinectAudioSource

  • MicrophoneIndex = X

For Each Runtime

  • InstanceIndex = X

Then look for any matches in the values. Nothing else in the SDK seems really useful. But there must be some internal logic to the SDK when it returns the arrays of AudioDeviceInfo and Runtime.
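One way to build such a listing without naming each property by hand is reflection. This is a generic sketch, not anything from the Kinect SDK: `PropertyDump.Describe` is a hypothetical helper that lists the public instance properties of whatever object it is given, so it could be pointed at an AudioDeviceInfo, a KinectAudioSource, or a Runtime alike. It only covers top-level properties; nested objects such as NuiCamera would need a second call.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class PropertyDump
{
    // Returns one "Name = value" line per readable public instance property.
    public static List<string> Describe(object target)
    {
        var lines = new List<string>();
        var props = target.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo p in props)
        {
            // Skip write-only properties and indexers, which cannot be read this way.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;

            object value;
            try { value = p.GetValue(target, null); }
            catch (Exception ex) { value = "<error: " + ex.GetType().Name + ">"; }

            lines.Add(p.Name + " = " + (value ?? "null"));
        }
        return lines;
    }
}
```

Printing `Describe(...)` for every device object side by side gives the raw material for spotting values that match.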

Anyway, I hope you get it right somehow

ヅ她的身影、若隐若现 2024-11-25 07:51:10

I would get the audio stream from all of them and then compare volume levels.
Once you have that, you can determine the "object" or person in the Kinect's 3D space that is actually speaking.
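Comparing volume levels could look roughly like the following. This is a sketch under stated assumptions: it treats each buffer as little-endian 16-bit mono PCM, which is an assumption about the audio format rather than something taken from this thread, and `Loudness` is a hypothetical helper, not an SDK class.

```csharp
using System;
using System.Collections.Generic;

public static class Loudness
{
    // Root-mean-square level of little-endian 16-bit PCM samples in buffer[0..count).
    public static double Rms(byte[] buffer, int count)
    {
        int samples = count / 2;
        if (samples == 0) return 0.0;

        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++)
        {
            // Assemble the sample explicitly so the result does not depend on CPU endianness.
            short s = (short)(buffer[2 * i] | (buffer[2 * i + 1] << 8));
            sumSquares += (double)s * s;
        }
        return Math.Sqrt(sumSquares / samples);
    }

    // Index of the buffer with the highest RMS level, or -1 if the list is empty.
    public static int LoudestIndex(IList<byte[]> buffers)
    {
        int best = -1;
        double bestLevel = -1.0;
        for (int i = 0; i < buffers.Count; i++)
        {
            double level = Rms(buffers[i], buffers[i].Length);
            if (level > bestLevel) { bestLevel = level; best = i; }
        }
        return best;
    }
}
```

In a noisy room a single buffer is unlikely to be reliable, so averaging the level over several consecutive buffers before comparing would probably be needed.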

From there you need to determine which cameras this object / person is visible in ...

Yeah, this is one complex project... Kinect is pretty awesome though. I don't know much about the API, but doesn't it give you the distances of people and such?

Good luck with it :)

末骤雨初歇 2024-11-25 07:51:10

I would just calibrate the Kinects one by one, writing the unique device identifier pairs (camera ID, microphone ID) to a file. Your application can then use that file at startup to synchronize microphone instances and camera instances (i.e. create a table that relates one camera instance to one microphone instance). As the camera and the microphone inside the Kinect probably each have their own USB interface IC (connected via an internal USB hub), there is technically no way to relate the two without prior calibration, because the two device identifiers are probably completely unrelated. You might also want to put labels on the Kinect units and reference those labels inside your initialization file.
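A minimal sketch of reading such a file back at startup. The file layout is invented for this example (one `cameraId<TAB>microphoneId` pair per line, `#` for comment lines), and `CalibrationTable` is a hypothetical name. A tab is used as the separator because the IDs shown in the question contain backslashes, ampersands, braces, and dots.

```csharp
using System;
using System.Collections.Generic;

public static class CalibrationTable
{
    // Parses lines of the form "cameraId<TAB>microphoneId" into a lookup table.
    // Blank lines and lines starting with '#' are ignored.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var table = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0 || line.StartsWith("#")) continue;

            var parts = line.Split('\t');
            if (parts.Length != 2)
                throw new FormatException("Expected 'cameraId<TAB>microphoneId': " + line);

            table[parts[0].Trim()] = parts[1].Trim();
        }
        return table;
    }
}
```

At startup the lines could come from `System.IO.File.ReadAllLines` on whatever path the calibration step wrote to, and each camera's identifier would then be looked up in the table to find its microphone.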

怀里藏娇 2024-11-25 07:51:10

Sounds interesting, maybe you need some "automatic calibration".

Maybe with some "remote power switches for each USB connection" (an I/O card connected to the USB power lines). Then you could power on one Kinect after the other automatically, and each time you would know which microphone belongs to which camera.
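The bookkeeping for that idea is a before/after comparison of the device lists. The sketch below covers only that comparison; switching the power and enumerating the devices are left out, and `PowerOnDiff` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PowerOnDiff
{
    // Compares the device lists taken before and after one unit is powered on.
    // Succeeds only if exactly one new camera and one new microphone appeared.
    public static bool TryPairNewDevices(
        IEnumerable<string> camerasBefore, IEnumerable<string> camerasAfter,
        IEnumerable<string> micsBefore, IEnumerable<string> micsAfter,
        out string cameraId, out string microphoneId)
    {
        var newCameras = camerasAfter.Except(camerasBefore).ToList();
        var newMics = micsAfter.Except(micsBefore).ToList();

        if (newCameras.Count == 1 && newMics.Count == 1)
        {
            cameraId = newCameras[0];
            microphoneId = newMics[0];
            return true;
        }

        // Anything else is ambiguous, so report failure instead of guessing.
        cameraId = null;
        microphoneId = null;
        return false;
    }
}
```

Repeating this once per unit, with a fresh enumeration after each power-on, would build up the full camera-to-microphone table.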

Or something like that...

Regards!
Stefan
