ffmpeg - Lower the background music volume in a video during dialogue
I have a video with some background music in it.
I wish to add a piece of spoken dialogue at a particular location in the video, such that the background music is lowered for the entire duration of the dialogue audio.
I found a similar solution using sidechaincompress, but it only works for mp3. I made some changes to it so that it includes the video too (-map 0:v). However, now the audio is cut short as soon as the dialogue ends.
ffmpeg -i video-with-bg-music.mp4 -i dialogue.mp3 -c:v libx264 -filter_complex "[1:a]asplit=2[sc][mix];[0:a][sc]sidechaincompress=threshold=0.003:ratio=20[bg];[bg][mix]amerge[final]" -map 0:v -map [final] final.mp4
I am not a pro at using ffmpeg and probably don't know what's going on with the filter_complex. Please help me out.
EDIT:
Output log for solution suggested by @kesh
ffmpeg -i video_bg.mp4 -i dialogues.mp3 -filter_complex "[1:a]adelay=0,apad,asplit=2[sc][mix];[0:a][sc]sidechaincompress=threshold=0.003:ratio=20[bg];[bg][mix]amix=duration=shortest[out]" -map 0:v -map [out] video_bg_speech.mp4
ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video_bg.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:01:50.12, start: 0.000000, bitrate: 157 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 146 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
[mp3 @ 0x5577fa3b6740] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'dialogue.mp3':
Duration: 00:00:17.33, start: 0.000000, bitrate: 32 kb/s
Stream #1:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
File 'video_bg_speech.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
Stream #0:1 (aac) -> sidechaincompress:main (graph 0)
Stream #1:0 (mp3float) -> adelay (graph 0)
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
amix (graph 0) -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[libx264 @ 0x5577fa446040] using SAR=1/1
[libx264 @ 0x5577fa446040] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x5577fa446040] profile High, level 3.1
[libx264 @ 0x5577fa446040] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
[Parsed_sidechaincompress_3 @ 0x5577faed3f00] No channel layout for input 1
Output #0, mp4, to 'video_bg_speech.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 720x1280 [SAR 1:1 DAR 9:16], q=-1--1, 24 fps, 12288 tbn, 24 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc58.54.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 24000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
encoder : Lavc58.54.100 aac
frame= 480 fps=325 q=-1.0 Lsize= 517kB time=00:00:19.87 bitrate= 213.0kbits/s speed=13.5x
video:353kB audio:152kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.388190%
[libx264 @ 0x5577fa446040] frame I:2 Avg QP:15.99 size:136135
[libx264 @ 0x5577fa446040] frame P:121 Avg QP:16.86 size: 603
[libx264 @ 0x5577fa446040] frame B:357 Avg QP:26.34 size: 43
[libx264 @ 0x5577fa446040] consecutive B-frames: 0.6% 0.4% 0.6% 98.3%
[libx264 @ 0x5577fa446040] mb I I16..4: 2.3% 65.7% 32.1%
[libx264 @ 0x5577fa446040] mb P I16..4: 0.0% 0.1% 0.1% P16..4: 1.1% 0.1% 0.1% 0.0% 0.0% skip:98.5%
[libx264 @ 0x5577fa446040] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 0.2% 0.0% 0.0% direct: 0.0% skip:99.8% L0:55.5% L1:44.5% BI: 0.0%
[libx264 @ 0x5577fa446040] 8x8 transform intra:62.5% inter:56.6%
[libx264 @ 0x5577fa446040] coded y,uvDC,uvAC intra: 89.6% 89.0% 78.7% inter: 0.0% 0.1% 0.0%
[libx264 @ 0x5577fa446040] i16 v,h,dc,p: 37% 5% 8% 50%
[libx264 @ 0x5577fa446040] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 16% 6% 6% 9% 9% 9% 10% 9%
[libx264 @ 0x5577fa446040] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 18% 7% 6% 11% 9% 9% 7% 7%
[libx264 @ 0x5577fa446040] i8c dc,h,v,p: 42% 22% 20% 15%
[libx264 @ 0x5577fa446040] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x5577fa446040] ref P L0: 46.3% 2.9% 19.3% 31.5%
[libx264 @ 0x5577fa446040] ref B L0: 45.6% 54.0% 0.4%
[libx264 @ 0x5577fa446040] ref B L1: 97.0% 3.0%
[libx264 @ 0x5577fa446040] kb/s:144.26
[aac @ 0x5577fa3b9940] Qavg: 307.060
Try this (the short clip inserted at the 3-second mark):
- aresample matches the dialogue's sampling rate to the video's.
- adelay places the clip at the right position (note the delay is in milliseconds).
- apad extends the short stream indefinitely with silence.
- amix combines the 2 streams and stops when the video audio ends.
Addendum:
If the mp4 has no audio, try:
- apad in the audio filtergraph extends the audio stream, and
- the -shortest output option cuts it off when the video stream ends.
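The commands this answer refers to did not survive in this copy. The script below is only a sketch of what the two described filter chains might look like, not the answerer's exact commands. The sidechaincompress and amix settings are copied from the command quoted in the question's edit, the 44100 Hz rate comes from the question's log, and the 3000 ms delay comes from the "3-second mark" above. The file name video_noaudio.mp4, the output name video_speech.mp4 and the run helper are hypothetical, and none of this has been tested against the asker's files.

```shell
#!/bin/sh
# Sketch only: values below are assumptions taken from the question, not a
# verified reproduction of the original answer.

DELAY_MS=3000   # where the dialogue should start, in milliseconds
RATE=44100      # sample rate of the video's audio track (from the log)

# Case 1: the video already has background music.
# Resample, delay and pad the dialogue, split it in two, use one copy as the
# sidechain that lowers the music, then mix the other copy back in.
# Because apad makes the dialogue endless, duration=shortest ends the mix
# when the video's own audio ends.
DUCK_GRAPH="[1:a]aresample=${RATE},adelay=${DELAY_MS}|${DELAY_MS},apad,asplit=2[sc][mix];\
[0:a][sc]sidechaincompress=threshold=0.003:ratio=20[bg];\
[bg][mix]amix=duration=shortest[out]"

# Case 2: the video has no audio track, so only the padded dialogue is used.
PAD_GRAPH="[1:a]adelay=${DELAY_MS}|${DELAY_MS},apad[out]"

# Hypothetical helper: prints the command by default, runs it when RUN=1.
run() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

run ffmpeg -i video_bg.mp4 -i dialogue.mp3 -filter_complex "$DUCK_GRAPH" \
  -map 0:v -map "[out]" video_bg_speech.mp4

# -shortest stops the padded audio when the video stream ends.
run ffmpeg -i video_noaudio.mp4 -i dialogue.mp3 -filter_complex "$PAD_GRAPH" \
  -map 0:v -map "[out]" -shortest video_speech.mp4
```

Run as-is, the script only prints the two ffmpeg command lines so they can be inspected first; setting RUN=1 in the environment executes them. The delay is written as 3000|3000 so that it also covers a stereo dialogue file, since adelay takes one value per channel.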