Is there any way to map/convert 468 landmarks to 68 landmarks with Mediapipe?
I am trying to compare ground-truth facial landmarks (68 landmarks) with the output of Mediapipe landmark detection (468 landmarks). To do so, I think I need to map the 468 landmarks to the 68 in some way. My tentative solution is to manually find, for each of the 68 landmarks, the index of the closest Mediapipe landmark and output those. But I am not sure how accurate that would be. Can someone help me with this?
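The manual search described above can be partly automated. The following is a minimal sketch, assuming both landmark sets are available as pixel coordinates on the same image; the function name `nearest_mesh_indices` and the synthetic grid data are made up for illustration and are not part of Mediapipe or dlib.

```python
import numpy as np

def nearest_mesh_indices(mesh_xy, target_xy):
    """For each target point, find the index of the closest mesh point.

    mesh_xy:   (N, 2) dense landmarks (e.g. N = 468), in pixels.
    target_xy: (K, 2) sparse landmarks (e.g. K = 68), in pixels.
    Both arrays must share one coordinate system (same image, same scale).
    Returns (indices, distances), each of length K.
    """
    mesh_xy = np.asarray(mesh_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Pairwise differences via broadcasting: shape (K, N, 2).
    diff = target_xy[:, None, :] - mesh_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=2)        # shape (K, N)
    idx = dist.argmin(axis=1)                  # closest mesh index per target
    return idx, dist[np.arange(len(idx)), idx]

# Synthetic stand-in data: a 26 x 18 grid gives exactly 468 "mesh" points.
gx, gy = np.meshgrid(np.arange(26), np.arange(18))
mesh = np.stack([gx, gy], axis=-1).reshape(-1, 2).astype(float)

# Pretend three ground-truth points sit slightly off mesh points 10, 200, 467.
target = mesh[[10, 200, 467]] + 0.1

idx, dist = nearest_mesh_indices(mesh, target)
print(idx.tolist())
```

The returned distances give a rough sense of how well each ground-truth point is covered by the mesh. Running this over several annotated images and checking that the chosen index stays the same for each of the 68 points is one way to judge whether a fixed index list is reasonable.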
Comments (2)
I'm not an expert on the subject, but I don't think there is a direct way to do the conversion, because the two landmark schemes are defined differently.
So I took the same approach you describe and extracted the closest points.
This is the code I added to the example code that Mediapipe provides, so that I could use the result with dlib. The final result is not far off, considering that many of the points coincide.
At the beginning of the MediaPipe Face Mesh code example, add a list containing the selection of points that match the dlib 68 landmarks:
In the MediaPipe Face Mesh code example, look for the line:
then add the following:
You can now use that list of extracted landmarks in your code.
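The snippets this answer refers to are not shown here, so the following is only a minimal sketch of the general idea and not the answerer's code. It assumes landmark objects with normalized `x` and `y` attributes, as returned by the Face Mesh solution in `results.multi_face_landmarks[0].landmark`. The helper `select_landmarks` and the index list `EXAMPLE_INDICES` are hypothetical placeholders; the real 68-entry list has to come from your own matching.

```python
from types import SimpleNamespace

# Placeholder selection only; NOT a real dlib correspondence.
# Replace with the 68 Mediapipe indices you matched to the dlib points.
EXAMPLE_INDICES = [0, 2, 3]

def select_landmarks(landmarks, indices, image_width, image_height):
    """Pick a subset of normalized landmarks and convert them to pixels.

    landmarks: sequence of objects with .x and .y in [0, 1]
               (e.g. results.multi_face_landmarks[0].landmark).
    indices:   Mediapipe indices to keep, in the desired output order.
    """
    points = []
    for i in indices:
        lm = landmarks[i]
        # Mediapipe coordinates are normalized by image width and height.
        points.append((round(lm.x * image_width), round(lm.y * image_height)))
    return points

# Stand-in landmarks so the sketch runs without Mediapipe installed.
fake_landmarks = [
    SimpleNamespace(x=0.0, y=0.0),
    SimpleNamespace(x=0.25, y=0.25),
    SimpleNamespace(x=0.5, y=0.25),
    SimpleNamespace(x=0.75, y=0.5),
]

extracted = select_landmarks(fake_landmarks, EXAMPLE_INDICES, 640, 480)
print(extracted)  # [(0, 0), (320, 120), (480, 240)]
```

Keeping the index list in dlib order means the output can be compared point by point against a 68-landmark annotation.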
This code can do it:
https://github.com/PeizhiYan/Mediapipe_2_Dlib_Landmarks/
This code maps Mediapipe's 478 dense facial landmarks (the 468 mesh points plus 10 iris points) to Dlib's 68 sparse facial landmarks. It defines a table of correspondences in which each Dlib landmark index corresponds to one or two Mediapipe indices; when there are two, their coordinates are averaged. The function convert_landmarks_mediapipe_to_dlib uses this predefined mapping to convert an array of Mediapipe landmarks into Dlib landmarks.
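The one-or-two-index averaging described above can be sketched as follows. This is an illustration of the technique under stated assumptions, not the repository's implementation: the function `convert_with_mapping` and the two-entry table `EXAMPLE_MAPPING` are hypothetical, and the actual correspondence table should be taken from the linked repository.

```python
import numpy as np

# Hypothetical, partial table for illustration only:
# dlib index -> one or two Mediapipe indices.
EXAMPLE_MAPPING = {
    0: [3],       # single correspondence: copy the point
    1: [7, 8],    # two correspondences: average the two points
}

def convert_with_mapping(mp_landmarks, mapping):
    """Convert dense landmarks to sparse ones by index lookup and averaging.

    mp_landmarks: (N, D) array of Mediapipe landmarks (D = 2 or 3).
    mapping:      dict from sparse index to a list of dense indices.
    Returns an array of shape (len(mapping), D), ordered by sparse index.
    """
    mp_landmarks = np.asarray(mp_landmarks, dtype=float)
    out = np.zeros((len(mapping), mp_landmarks.shape[1]))
    for row, sparse_idx in enumerate(sorted(mapping)):
        # Mean over one index is the point itself; over two, the midpoint.
        out[row] = mp_landmarks[mapping[sparse_idx]].mean(axis=0)
    return out

# Synthetic input: row i of the dense array is (2*i, 2*i + 1).
dense = np.arange(478 * 2).reshape(478, 2)
converted = convert_with_mapping(dense, EXAMPLE_MAPPING)
print(converted.tolist())  # [[6.0, 7.0], [15.0, 16.0]]
```

Averaging two neighbouring mesh points is useful where a Dlib landmark falls between two Mediapipe vertices rather than on one of them.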