YUV to RGBA on Apple A4: should I use shaders or NEON?

Posted 2024-12-20 17:22:21

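I am writing a media player framework for Apple TV, using OpenGL ES and ffmpeg. Rendering with OpenGL ES requires conversion to RGBA, and software conversion with swscale is unbearably slow. Based on information found on the internet, I came up with two ideas: using NEON (like here), or using fragment shaders with GL_LUMINANCE and GL_LUMINANCE_ALPHA textures.

As I know almost nothing about OpenGL, the second option still doesn't work :)

Can you give me any pointers on how to proceed? Thank you in advance.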


如此安好 2024-12-27 17:22:22

It is most definitely worthwhile to learn OpenGL ES 2.0 shaders:

- You can load-balance between the GPU and CPU (e.g. decode subsequent video frames on the CPU while the GPU renders the current frame).
- Video frames need to go to the GPU in any case: if your video has 4:2:0 sampled chrominance, uploading YCbCr (1.5 bytes per pixel) instead of RGBA (4 bytes per pixel) cuts the bus bandwidth by more than half.
- You get 4:2:0 chrominance up-sampling for free, with the GPU hardware interpolator. (Your shader should be configured to use the same texture coordinates for both the Y and C{b,r} textures, in effect stretching the chrominance texture out over the same area.)
- On iOS 5, pushing YCbCr textures to the GPU is fast (no data copy or swizzling) with the texture cache (see the CVOpenGLESTextureCache* API functions). You will save 1-2 data copies compared to NEON.
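As a rough check on the bandwidth point, here is a minimal sketch that counts the bytes uploaded per frame for RGBA versus 4:2:0 YCbCr. The function names are hypothetical and the count assumes 8 bits per sample with no row padding.

```c
#include <stddef.h>

/* Bytes uploaded for one RGBA8888 frame: 4 bytes per pixel. */
size_t rgba_frame_bytes(size_t width, size_t height) {
    return width * height * 4;
}

/*
 * Bytes uploaded for one 4:2:0 YCbCr frame: a full-resolution Y plane plus
 * Cb and Cr sampled at half resolution in each dimension (rounded up for
 * odd sizes). Assumes 8-bit samples and tightly packed rows.
 */
size_t yuv420_frame_bytes(size_t width, size_t height) {
    size_t chroma_w = (width + 1) / 2;
    size_t chroma_h = (height + 1) / 2;
    return width * height + 2 * chroma_w * chroma_h;
}
```

For a 1280x720 frame this gives 3,686,400 bytes for RGBA and 1,382,400 bytes for 4:2:0 YCbCr, i.e. 37.5% of the RGBA size under these assumptions.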

I am using these techniques to great effect in my super-fast iPhone camera app, SnappyCam.

You are on the right track for implementation: if your CbCr is interleaved, use a GL_LUMINANCE texture for Y and a GL_LUMINANCE_ALPHA texture for CbCr. If all of your YCbCr components are non-interleaved (planar), use three GL_LUMINANCE textures instead.

Creating two textures for 4:2:0 bi-planar YCbCr (where CbCr is interleaved) is straightforward:

    // Y plane: one 8-bit channel at full resolution.
    glBindTexture(GL_TEXTURE_2D, texture_y);
    glTexImage2D(
        GL_TEXTURE_2D,
        0,                   // Mipmap level
        GL_LUMINANCE,        // Internal format (8-bit)
        width,
        height,
        0,                   // No border
        GL_LUMINANCE,        // Source format (8-bit)
        GL_UNSIGNED_BYTE,    // Source data type
        NULL                 // Allocate storage only; no pixel data yet
    );
    // Interleaved CbCr plane: two 8-bit channels at half resolution.
    glBindTexture(GL_TEXTURE_2D, texture_cbcr);
    glTexImage2D(
        GL_TEXTURE_2D,
        0,                   // Mipmap level
        GL_LUMINANCE_ALPHA,  // Internal format (16-bit)
        width / 2,
        height / 2,
        0,                   // No border
        GL_LUMINANCE_ALPHA,  // Source format (16-bit)
        GL_UNSIGNED_BYTE,    // Source data type
        NULL                 // Allocate storage only; no pixel data yet
    );

where you would then use glTexSubImage2D() or the iOS5 texture cache to update these textures.
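One practical detail when going the glTexSubImage2D() route with decoder output: OpenGL ES 2.0 core has no GL_UNPACK_ROW_LENGTH, so a plane whose rows are padded (stride larger than the visible row) cannot be uploaded in a single call as-is. The sketch below repacks such a plane into a tight buffer first. `pack_plane` and its parameters are hypothetical names, and whether your frames are padded at all is an assumption to verify against your decoder.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a plane whose rows start `src_stride` bytes apart into a tightly
 * packed buffer holding `row_bytes` bytes per row. `dst` must have room for
 * row_bytes * rows bytes. Returns the number of bytes written.
 *
 * Assumed usage with the Y texture allocated earlier (not compiled here):
 *   glPixelStorei(GL_UNPACK_ALIGNMENT, 1);   // rows are not 4-byte aligned
 *   glBindTexture(GL_TEXTURE_2D, texture_y);
 *   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
 *                   GL_LUMINANCE, GL_UNSIGNED_BYTE, packed_y);
 */
size_t pack_plane(uint8_t *dst, const uint8_t *src,
                  size_t src_stride, size_t row_bytes, size_t rows) {
    for (size_t y = 0; y < rows; ++y) {
        memcpy(dst + y * row_bytes, src + y * src_stride, row_bytes);
    }
    return row_bytes * rows;
}
```

For the interleaved CbCr plane, `row_bytes` would be twice the chroma width, since each texel carries two bytes. If the stride already equals the row size, the copy can be skipped and the source pointer passed directly.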

I'd also recommend using a 2D varying that spans the texture coordinate space (x: [0,1], y: [0,1]) so that you avoid any dependent texture reads in your fragment shader. The end result is super-fast and doesn't load the GPU at all in my experience.
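To make the shader side concrete, here is a minimal sketch of a shader pair for the two-texture (Y plus interleaved CbCr) layout, held as C string constants, together with a CPU reference of the same arithmetic. Everything named here (`a_position`, `a_texcoord`, `v_texcoord`, `u_tex_y`, `u_tex_cbcr`, `ycbcr601_to_rgb`) is hypothetical, and the coefficients assume BT.601 video-range YCbCr with Cb stored before Cr; full-range or BT.709 content would need different constants.

```c
/* Vertex shader: forwards the texture coordinate unchanged as a varying,
 * so the fragment shader performs no dependent texture reads. */
const char *const kVertexShaderSrc =
    "attribute vec4 a_position;\n"
    "attribute vec2 a_texcoord;\n"
    "varying vec2 v_texcoord;\n"
    "void main() {\n"
    "    gl_Position = a_position;\n"
    "    v_texcoord = a_texcoord;\n"
    "}\n";

/* Fragment shader: GL_LUMINANCE puts Y in .r; GL_LUMINANCE_ALPHA puts the
 * first byte (assumed Cb) in .r and the second (assumed Cr) in .a. */
const char *const kFragmentShaderSrc =
    "precision mediump float;\n"
    "varying vec2 v_texcoord;\n"
    "uniform sampler2D u_tex_y;\n"
    "uniform sampler2D u_tex_cbcr;\n"
    "void main() {\n"
    "    float y = 1.164 * (texture2D(u_tex_y, v_texcoord).r - 16.0 / 255.0);\n"
    "    vec2 c = texture2D(u_tex_cbcr, v_texcoord).ra - vec2(128.0 / 255.0);\n"
    "    gl_FragColor = vec4(y + 1.596 * c.y,\n"
    "                        y - 0.392 * c.x - 0.813 * c.y,\n"
    "                        y + 2.017 * c.x,\n"
    "                        1.0);\n"
    "}\n";

typedef struct { float r, g, b; } rgb_t;

static float clamp01(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* CPU reference for the shader math. Inputs are normalized to [0, 1], as
 * texture2D() returns them; the output is clamped like a framebuffer write. */
rgb_t ycbcr601_to_rgb(float y, float cb, float cr) {
    float yy = 1.164f * (y - 16.0f / 255.0f);
    float u = cb - 128.0f / 255.0f;
    float v = cr - 128.0f / 255.0f;
    rgb_t out = {
        clamp01(yy + 1.596f * v),
        clamp01(yy - 0.392f * u - 0.813f * v),
        clamp01(yy + 2.017f * u)
    };
    return out;
}
```

The CPU function is only there to sanity-check the constants against known values (video black at Y=16 and video white at Y=235 with neutral chroma); the actual conversion happens per fragment on the GPU. With three planar GL_LUMINANCE textures, the same math applies with Cb and Cr read from `.r` of their own samplers.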
