How do I create a waveform image of an MP3 in Linux?

Posted on 2024-10-07 22:14:21

Given an MP3, I would like to extract the waveform from the file into an image (.png).

Is there a package that can do what I need?

Comments (7)

春花秋月 2024-10-14 22:14:21

Using sox and gnuplot you can create basic waveform images:

sox audio.mp3 audio.dat #create plaintext file of amplitude values
tail -n+3 audio.dat > audio_only.dat #remove comments

# write script file for gnuplot
echo set term png size 320,180 > audio.gpi #set output format
echo set output \"audio.png\" >> audio.gpi #set output file
echo plot \"audio_only.dat\" with lines >> audio.gpi #plot data

gnuplot audio.gpi #run script

To create something simpler/prettier, use the following gnuplot file as a template (save it as audio.gpi):

#set output format and size
set term png size 320,180

#set output file
set output "audio.png"

# set y range
set yr [-1:1]

# we want just the data
unset key
unset tics
unset border
set lmargin 0             
set rmargin 0
set tmargin 0
set bmargin 0

# draw rectangle to change background color
set obj 1 rectangle behind from screen 0,0 to screen 1,1
set obj 1 fillstyle solid 1.0 fillcolor rgbcolor "#222222"

# draw data with foreground color
plot "audio_only.dat" with lines lt rgb 'white'

Then just run:

sox audio.mp3 audio.dat #create plaintext file of amplitude values
tail -n+3 audio.dat > audio_only.dat #remove comments

gnuplot audio.gpi #run script

Based on this answer to a similar question, which is more general with regard to file format but less general with regard to the software used.
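The echo lines above can also be collected into a small shell function that prints the gnuplot script. This is only a sketch: the name write_gpi and its arguments are made up for illustration, and it uses just the gnuplot commands already shown in this answer.

```shell
# Hypothetical helper (not part of sox or gnuplot): prints a gnuplot script.
# Usage: write_gpi DATA_FILE PNG_FILE [WIDTH,HEIGHT]
write_gpi() {
    gpi_data="$1"
    gpi_png="$2"
    gpi_size="${3:-320,180}"   # default size taken from the example above
    cat <<EOF
set term png size ${gpi_size}
set output "${gpi_png}"
set yr [-1:1]
unset key
plot "${gpi_data}" with lines
EOF
}

# Intended use, assuming sox and gnuplot are installed:
#   sox audio.mp3 audio.dat
#   tail -n+3 audio.dat > audio_only.dat
#   write_gpi audio_only.dat audio.png > audio.gpi
#   gnuplot audio.gpi
```

Keeping the script generation in one function makes it easy to change the size or file names per call without editing audio.gpi by hand.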

海未深 2024-10-14 22:14:21

FFmpeg showwavespic

FFmpeg can do it in a single command as usual:

Sample command:

sudo apt install ffmpeg
ffmpeg -i in.flac -filter_complex "showwavespic=s=640x320:colors=black" \
  -frames:v 1 out.png

You can also set colors in RGB, e.g. colors=0x0088FF. See: Using hex colors with ffmpeg's showwaves

Sample test data of me saying "Hello my name is Ciro Santilli" with two identical stereo channels:

wget -O in.flac https://raw.githubusercontent.com/cirosantilli/media/d6e9e8d0b01bccef4958eb8b976c3b0a34870cd3/Hello_my_name_is_Ciro_Santilli.flac

Background color

The background is transparent by default, but the waveform can be overlaid on a solid color source, and so we reach:

ffmpeg -i in.flac -f lavfi -i color=c=black:s=640x320 -filter_complex \
  "[0:a]showwavespic=s=640x320:colors=white[fg];[1:v][fg]overlay=format=auto" \
  -frames:v 1 out.png

Added to the Wiki now ;-)

For the uninitiated, that CLI creates a processing graph:

black background (1:v) ------------------------> overlay ----> out.png
                                                   ^
                                                   |
in.flac (0:a) ----> showwavespic ----> (fg) -------+

where e.g. the overlay filter takes two image inputs and produces the desired output, and fg is just a name assigned to an intermediate node.
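As a rough sketch, the same graph can be wrapped in shell functions so that size and colors become parameters. The function names waveform_graph and render_waveform are invented here for illustration; the ffmpeg options are only the ones used in the command above.

```shell
# Hypothetical helper: prints the filter graph string used above.
# Usage: waveform_graph SIZE FOREGROUND_COLOR
waveform_graph() {
    printf '[0:a]showwavespic=s=%s:colors=%s[fg];[1:v][fg]overlay=format=auto' "$1" "$2"
}

# Hypothetical wrapper around the ffmpeg command above.
# Usage: render_waveform INPUT OUTPUT.png [SIZE] [FOREGROUND] [BACKGROUND]
render_waveform() {
    wf_src="$1"
    wf_out="$2"
    wf_size="${3:-640x320}"
    wf_fg="${4:-white}"
    wf_bg="${5:-black}"
    ffmpeg -i "$wf_src" -f lavfi -i "color=c=${wf_bg}:s=${wf_size}" \
        -filter_complex "$(waveform_graph "$wf_size" "$wf_fg")" \
        -frames:v 1 "$wf_out"
}

# Intended use, assuming ffmpeg is installed:
#   render_waveform in.flac out.png 640x320 white black
```

The size is passed to both the color source and showwavespic so that the background and the waveform picture have the same dimensions.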

Split channels

The tutorial also covers other options such as split channels with -filter_complex "showwavespic=s=640x480:colors=black:split_channels=1":

gnuplot plot with axes

OK, I'll admit it, FFmpeg can't do this alone (yet!). But the Wiki already provides a data export method to gnuplot that works:

ffmpeg -i in.flac -ac 1 -filter:a aresample=8000 -map 0:a -c:a pcm_s16le -f data - | \
  gnuplot -p -e "set terminal png size 640,360; set output 'out.png'; plot '<cat' binary filetype=bin format='%int16' endian=little array=1:0 with lines;"

Video representations

See: https://superuser.com/questions/843774/create-a-video-file-from-an-audio-file-and-add-visualizations-from-audio

Tested on Ubuntu 20.04, FFmpeg 4.2.4.

海的爱人是光 2024-10-14 22:14:21

If you have a GUI environment, you can use the Audacity audio editor to load the MP3 and then use the print command to generate a PDF of the waveform. Then convert the PDF to PNG.

长不大的小祸害 2024-10-14 22:14:21

I would do something like this:

  • find a tool to convert MP3 to PCM, i.e. binary data with one 8- or 16-bit value
    per sample. I guess mplayer can do that

  • pipe the result to a utility converting binary data to an ASCII
    representation of the numbers in decimal format

  • use gnuplot to transform this list of values into a PNG graph.

And voilà, the power of piping between Unix tools. Step 2 in this list might be optional if gnuplot is able to read its data from a binary format.
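Step 2 could look something like the sketch below, which uses od to print raw 16-bit samples as decimal numbers, one per line. The function name pcm16_to_ascii is made up for this example, and od reads the samples in the host byte order, so it assumes little-endian PCM on a little-endian machine.

```shell
# Hypothetical helper: reads raw signed 16-bit PCM on stdin and prints
# one decimal sample value per line.
# Assumption: the PCM byte order matches the host (e.g. s16le on x86).
pcm16_to_ascii() {
    # -An: no offsets, -v: do not collapse repeated lines, -td2: signed 16-bit
    od -An -v -td2 | tr -s ' \t' '\n' | sed '/^$/d'
}

# Intended use, assuming ffmpeg is installed (options as in the FFmpeg answer):
#   ffmpeg -i audio.mp3 -ac 1 -map 0:a -c:a pcm_s16le -f data - | pcm16_to_ascii > audio_only.dat
```

The resulting file has one value per line, which gnuplot can plot with `plot "audio_only.dat" with lines`.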

裂开嘴轻声笑有多痛 2024-10-14 22:14:21

You might want to consider audiowaveform from the BBC.

audiowaveform is a C++ command-line application that generates waveform data from either MP3, WAV, or FLAC format audio files. Waveform data can be used to produce a visual rendering of the audio, similar in appearance to audio editing applications.

Waveform data files are saved in either binary format (.dat) or JSON (.json). Given an input waveform data file, audiowaveform can also render the audio waveform as a PNG image at a given time offset and zoom level.

The waveform data is produced from an input stereo audio signal by first combining the left and right channels to produce a mono signal. The next stage is to compute the minimum and maximum sample values over groups of N input samples (where N is controlled by the --zoom command-line option), such that each N input samples produces one pair of minimum and maximum points in the output.

https://github.com/bbcrd/audiowaveform
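To illustrate the min/max reduction described above, here is a small awk sketch wrapped in a shell function. It is not audiowaveform's code; the name minmax_groups is invented, and it assumes mono sample values as text, one per line.

```shell
# Hypothetical helper: reads one sample value per line on stdin and prints
# one "min max" pair for every group of N samples.
# Usage: minmax_groups N
minmax_groups() {
    awk -v n="$1" '
        {
            # The first sample of a group initialises both min and max
            if (count == 0 || $1 < min) min = $1
            if (count == 0 || $1 > max) max = $1
            count++
            if (count == n) { print min, max; count = 0 }
        }
        # Emit the last, possibly shorter, group
        END { if (count > 0) print min, max }'
}
```

For example, `printf '1\n5\n-3\n2\n7\n' | minmax_groups 2` prints the pairs `1 5`, `-3 2` and `7 7`. A larger N gives fewer pairs, which corresponds to a more zoomed-out waveform.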

月依秋水 2024-10-14 22:14:21

This is a standard function in SoX (a command-line tool for sound, available on Windows and Linux).
Check the 'spectrogram' function on http://sox.sourceforge.net/sox.html

"The spectrogram is rendered in a Portable Network Graphic (PNG) file, and shows time in the X-axis, frequency in the Y-axis, and audio signal magnitude in the Z-axis. Z-axis values are represented by the colour (or optionally the intensity) of the pixels in the X-Y plane. If the audio signal contains multiple channels then these are shown from top to bottom starting from channel 1 (which is the left channel for stereo audio)."

真心难拥有 2024-10-14 22:14:21

Building on the answer of qubodup:

# install stuff
apt install gnuplot
apt install sox
apt install libsox-fmt-mp3

#create plaintext file of amplitude values
sox sound.mp3 sound.dat

# run the script saved in the audio.gpi file
gnuplot audio.gpi

You can also comment out the "set output ..." line in the configuration file and do

gnuplot audio.gpi > my_sound.png

In this case the configuration file is audio.gpi, and it contains:

#!/usr/bin/env gnuplot

set datafile commentschars ";"

set terminal png #size 800,400
set output "sound.png"

unset border
unset xtics
unset ytics

set key off

plot "sound.dat" with lines

This produces a PNG image of the waveform.

I wanted no axes, no legend, and PNG output (much smaller than SVG).
