GPU-accelerated FFmpeg pipeline

Posted on 2025-02-10 04:31:06

I have an Nvidia Tesla T4 on my machine and want to make sure I'm leveraging it to its fullest potential. After skimming https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/ I am not sure how to optimize or test this when the input and output are piped.

Inside a Node.js process I take JPEG buffers (one per frame) and run FFmpeg as a child process. The buffers are piped through FFmpeg and its output is streamed to S3.

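import { spawn } from "child_process";

// quality() is a helper defined elsewhere; it is assumed to map targetQuality to a CRF value.
export default ({ fps, resolution, targetQuality }: any) => {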
  try {
    const childProcess = spawn("ffmpeg", [
      "-y",
      "-hwaccel",
      "cuda",
      "-f",
      "image2pipe",
      "-framerate",
      `${fps ? `${fps}` : 60}`,
      "-i",
      "-",
      "-vf",
      `format=rgba,scale=${resolution.width}:${resolution.height}`,
      "-an",
      "-vcodec",
      "libx264",
      "-pix_fmt",
      "yuv420p",
      "-r",
      `${fps ? `${fps}` : 60}`,
      "-preset",
      "veryslow",
      "-profile:v",
      "high444",
      "-crf",
      `${quality(targetQuality)}`,
      "-x264opts",
      "fast_pskip=0:psy=0:deblock=-3,-3",
      "-g",
      "300",
      "-f",
      "mp4",
      "-movflags",
      "frag_keyframe+empty_moov",
      "pipe:1"
    ]);

    // LOGS
    // FFmpeg writes all of its log output to stderr, so a chunk here is not necessarily an error.
    childProcess.stderr.on("data", (chunk: any) => {
      console.log("FFMPEG Output", chunk.toString("utf8"));
    });

    childProcess.stderr.on("error", (err: any) => {
      console.log("ERROR ERROR FFMPEG", { err });
    });

    childProcess.on("error", (error: any) => {
      console.log("ERROR in childProcess", { error });
    });

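    // ChildProcess does not emit "beforeExit" (that event belongs to the global process object);
    // "close" fires once the child has ended and its stdio streams are closed.
    childProcess.on("close", (code: any) => {
      console.log("FFMPEG process close event with code: ", code);
    });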

    childProcess.on("exit", (code: any) => {
      if (code !== 0) {
        console.log("FFMPEG Stream Error Process exit event with code: ", code);
      }
    });

    return childProcess;
  } catch (e) {
    console.log("ffmpeg issue", {
      e
    });
  }
};
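
One thing that stands out in the argument list above: -hwaccel cuda is an input option that only concerns decoding, while -vcodec libx264 selects a CPU encoder, so the T4's NVENC unit is never asked to encode. Since the build is configured with --enable-nvenc, a variant that swaps in h264_nvenc might look roughly like the sketch below. The helper name buildNvencArgs, its parameters, and the preset and rate-control values are assumptions for illustration, not tuned settings. The libx264-specific options (-crf, -x264opts, the veryslow preset) are left out, and JPEG decoding and scaling stay on the CPU here.

```typescript
interface NvencOptions {
  fps?: number;
  width: number;
  height: number;
  cq: number; // NVENC constant-quality target; lower means higher quality
}

// Hypothetical helper: builds an FFmpeg argument list that encodes with
// h264_nvenc instead of libx264. Input and output stay on pipes.
export function buildNvencArgs({ fps = 60, width, height, cq }: NvencOptions): string[] {
  return [
    "-y",
    "-f", "image2pipe",
    "-framerate", String(fps),
    "-i", "-",
    // Same filter chain as the original command.
    "-vf", `format=rgba,scale=${width}:${height}`,
    "-an",
    "-c:v", "h264_nvenc",
    "-preset", "slow",   // assumed value; NVENC presets differ from x264 presets
    "-rc", "vbr",        // variable bitrate driven by the -cq target
    "-cq", String(cq),
    "-b:v", "0",
    "-pix_fmt", "yuv420p",
    "-profile:v", "high",
    "-r", String(fps),
    "-g", "300",
    "-f", "mp4",
    "-movflags", "frag_keyframe+empty_moov",
    "pipe:1",
  ];
}
```

The result would be passed to spawn("ffmpeg", buildNvencArgs({ ... })) in place of the literal array. The -cq scale is not the same as x264's CRF scale, so the output of quality(targetQuality) would probably need remapping.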

Each frame is written like this: await write(childProcess.stdin, frame); I'm asking whether there is a way to optimize this for the GPU I am on, and how and why that optimization works.
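
The write helper itself is not shown above. A minimal sketch of what such a helper could look like, assuming its only job is to respect backpressure on FFmpeg's stdin, is below. The name matches the call above, but the body is an assumption, not the code actually in use.

```typescript
import { once } from "events";
import { Writable } from "stream";

// Writes one chunk and, if the stream's internal buffer is full,
// waits for "drain" before resolving so frames do not pile up in memory.
export async function write(stream: Writable, chunk: Buffer): Promise<void> {
  if (!stream.write(chunk)) {
    // once() also rejects if the stream emits "error" while we wait.
    await once(stream, "drain");
  }
}
```

With a helper like this, the Node.js side never gets far ahead of the encoder, so a slow encode shows up as slow writes rather than as growing memory use.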

FFMPEG LOGS

FFMPEG Output ffmpeg version n4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
0|index  |   built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
0|index  |   configuration: --prefix= --prefix=/usr --disable-debug --disable-doc --disable-static --enable-cuda --enable-cuda-sdk --enable-cuvid --enable-libdrm --enable-ffplay --enable-gnutls --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfontconfig --enable-libfreetype --enable-libmp3lame --enable-libnpp --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopus --enable-libpulse --enable-sdl2 --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libv4l2 --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxvid --enable-nonfree --enable-nvenc --enable-omx --enable-openal --enable-opencl --enable-runtime-cpudetect --enable-shared --enable-vaapi --enable-vdpau --enable-version3 --enable-xlib 

MORE FFMPEG LOGS

0|index  | FFMPEG Output Input #0, image2pipe, from 'pipe:':
0|index  |   Duration: N/A, bitrate: N/A
0|index  |     Stream #0:0: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 1080x1080 [SAR 1:1 DAR 1:1], 60 fps, 60 tbr, 60 tbn, 60 tbc
0|index  | FFMPEG Output Stream mapping:
0|index  |   Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
0|index  | FFMPEG Output [libx264 @ 0x55ca4182db00] using SAR=1/1
0|index  | FFMPEG Output [libx264 @ 0x55ca4182db00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
0|index  | FFMPEG Output [libx264 @ 0x55ca4182db00] profile High, level 5.0
0|index  | FFMPEG Output [libx264 @ 0x55ca4182db00] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=16 deblock=1:-3:-3 analyse=0x3:0x133 me=umh subme=10 psy=0 mixed_ref=1 me_range=24 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=0 threads=24 lookahead_threads=4 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=8 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=300 keyint_min=30 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=17.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
0|index  | FFMPEG Output Output #0, mp4, to 'pipe:1':
0|index  |   Metadata:
0|index  |     encoder         : Lavf58.45.100
0|index  |     Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1080x1080 [SAR 1:1 DAR 1:1], q=-1--1, 60 fps, 15360 tbn, 60 tbc
0|index  |     Metadata:
0|index  |       encoder         : Lavc58.91.100 libx264
0|index  |     Side data:
0|index  |       
0|index  | FFMPEG Output cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
0|index  | FFMPEG Output frame=   37 fps=0.0 q=0.0 size=       1kB time=00:00:00.00 bitrate=N/A speed=   0x    
0|index  | FFMPEG Output frame=   60 fps= 13 q=-1.0 Lsize=     932kB time=00:00:00.95 bitrate=8039.9kbits/s speed=0.21x    
0|index  | video:931kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.152935%
0|index  | FFMPEG Output [libx264 @ 0x55ca4182db00] frame I:1     Avg QP:18.49  size: 28035
0|index  | [libx264 @ 0x55ca4182db00] frame P:16    Avg QP:20.57  size: 30170
0|index  | [libx264 @ 0x55ca4182db00] frame B:43    Avg QP:21.67  size: 10277
0|index  | [libx264 @ 0x55ca4182db00] consecutive B-frames:  1.7%  3.3% 25.0% 60.0%  0.0% 10.0%  0.0%  0.0%  0.0%
0|index  | [libx264 @ 0x55ca4182db00] mb I  I16..4: 33.2% 64.9%  1.9%
0|index  | [libx264 @ 0x55ca4182db00] mb P  I16..4: 21.2% 38.8%  0.8%  P16..4: 17.6%  6.8%  5.0%  0.1%  0.0%    skip: 9.7%
0|index  | [libx264 @ 0x55ca4182db00] mb B  I16..4:  7.2% 12.0%  0.2%  B16..8: 27.1%  5.0%  0.9%  direct: 2.9%  skip:44.7%  L0:47.8% L1:42.4% BI: 9.8%
0|index  | [libx264 @ 0x55ca4182db00] 
0|index  | FFMPEG Output 8x8 transform intra:63.0% inter:95.7%
0|index  | [libx264 @ 0x55ca4182db00] direct mvs  spatial:72.1% temporal:27.9%
0|index  | [libx264 @ 0x55ca4182db00] coded y,uvDC,uvAC intra: 39.3% 68.6% 7.3% inter: 13.0% 15.1% 0.8%
0|index  | [libx264 @ 0x55ca4182db00] i16 v,h,dc,p: 19% 26% 20% 36%
0|index  | [libx264 @ 0x55ca4182db00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 23% 23% 25%  5%  4%  5%  5%  5%  6%
0|index  | [libx264 @ 0x55ca4182db00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 20% 19%  4%  8%  6%  7%  5%  5%
0|index  | [libx264 @ 0x55ca4182db00] i8c dc,h,v,p: 26% 28% 22% 23%
0|index  | [libx264 @ 0x55ca4182db00] Weighted P-Frames: Y:0.0% UV:0.0%
0|index  | [libx264 @ 0x55ca4182db00] ref P L0: 59.7% 14.2% 13.8%  4.3%  2.7%  1.7%  1.3%  0.7%  0.5%  0.3%  0.3%  0.2%  0.1%  0.1%  0.0%  0.0%
0|index  | [libx264 @ 0x55ca4182db00] ref B L0: 90.6%  5.1%  2.1%  0.9%  0.5%  0.3%  0.2%  0.1%  0.1%  0.0%  0.0%  0.0%  0.0%  0.0%  0.0%
0|index  | [libx264 @ 0x55ca4182db00] ref B L1: 99.2%  0.8%
0|index  | [libx264 @ 0x55ca4182db00] kb/s:7621.38
0|index  | progress { loaded: 954806, total: 954806, part: 1, key: 'undefined.mp4' }
0|index  | Browser disconnected we need to report this and handle
0|index  | Done ✅ { perf: 'Execution time: 22681.152896999964 ms' }
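
To compare settings, the frame= / fps= / speed= progress lines that FFmpeg prints to stderr can be parsed instead of read by eye. A rough sketch, assuming the progress format seen in the log above; parseProgress and its return shape are hypothetical names.

```typescript
interface Progress {
  frame: number; // frames encoded so far
  fps: number;   // encoder throughput in frames per second
  speed: number; // multiple of real time (1 means real time)
}

// Extracts the frame, fps and speed fields from one FFmpeg progress line.
// Returns null for lines that are not progress lines.
export function parseProgress(line: string): Progress | null {
  const m = /frame=\s*(\d+)\s+fps=\s*([\d.]+).*?speed=\s*([\d.]+)x/.exec(line);
  if (!m) return null;
  return { frame: Number(m[1]), fps: Number(m[2]), speed: Number(m[3]) };
}

const sample =
  "frame=   60 fps= 13 q=-1.0 Lsize=     932kB time=00:00:00.95 bitrate=8039.9kbits/s speed=0.21x";
console.log(parseProgress(sample)); // → { frame: 60, fps: 13, speed: 0.21 }
```

For the run above this gives 13 fps against a 60 fps target, which is the speed=0.21x figure in the log. That is the number to watch when trying a different encoder or preset.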
