What kind of sound processing algorithm would let you make a visualization like this?

Posted 2024-08-20 09:47:12

I'm interested in making an OpenGL visualizer for MP3s as a pet project.

I stumbled upon this YouTube video, in which someone shows off a visualizer used in conjunction with augmented reality.

http://www.youtube.com/watch?v=SnshyLJSpnc#t=1m15s

Please watch the video, but ignore its augmented reality aspect. I'm only interested in making a visualizer, not augmented reality.

What kinds of algorithms were used to generate those patterns in relation to the music? If you watch, you can see what looks like several different methods of visualization. The first one has a distinct look:

The first one looked like waves moving over the rendering area:
[Image: screenshot of the first visualization]

Another "mode" seemed to have the visualization move around the center in concentric circles:
[Image: screenshot of the concentric-circle visualization]

For anyone who is well versed in audio programming: what kinds of algorithms could be used to generate similar-looking visualizations? What kind of algorithm did the first one use? Or the one with the concentric circles?

Any help in pointing me to what algorithms were used to generate these visualizations based on the music would help me greatly!

Comments (1)

不寐倦长更 2024-08-27 09:47:12

First, these all appear to be based on the FFT (Fast Fourier Transform), an algorithm that takes the sound wave for a particular time slice and separates it into an X-Y spectrum line graph, where X represents the frequency spectrum (usually on a log scale from 20 Hz to 20,000 Hz) and Y represents the amplitude, or volume, of the sound at each frequency.
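As a rough illustration of that step, here is a minimal sketch in plain Python. It is not the code from the video: the function names (`fft`, `log_spectrum`), the Hann window, the 32 bands, and the "take the loudest bin per band" rule are all my own assumptions, chosen only to show one time slice being turned into log-spaced frequency bands.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def log_spectrum(samples, sample_rate, num_bands=32, f_min=20.0, f_max=20000.0):
    """Return one amplitude per log-spaced band (the Y values of the line graph).

    Assumes len(samples) is a power of two and at least 2.
    """
    n = len(samples)
    # Hann window (an assumption) to reduce leakage between neighbouring bins.
    windowed = [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    # Only the first n/2 bins carry distinct information for real input.
    magnitudes = [abs(c) / n for c in fft(windowed)[: n // 2]]
    bin_hz = sample_rate / n
    f_max = min(f_max, sample_rate / 2.0)  # cannot exceed the Nyquist frequency
    ratio = f_max / f_min
    bands = []
    for b in range(num_bands):
        lo = f_min * ratio ** (b / num_bands)        # lower band edge in Hz
        hi = f_min * ratio ** ((b + 1) / num_bands)  # upper band edge in Hz
        first = int(lo / bin_hz)
        last = min(max(first + 1, int(hi / bin_hz)), len(magnitudes))
        first = min(first, last - 1)  # keep the slice non-empty
        bands.append(max(magnitudes[first:last]))
    return bands
```

Calling `log_spectrum(samples, 44100)` on 2048 samples gives 32 values you could draw left to right as the line graph. Note that with a slice this short, several of the lowest bands end up reading the same FFT bin, so a real visualizer would probably use a longer slice or fewer bands.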

If you look at the very first visualizations (the flat, colorless ones earlier in the video), you will see exactly this in its unadorned form. Lower notes appear as peaks and spikes on the left side, whereas higher notes appear in the middle and on the right, which is the classic Fourier transform mapping. (In fact, the biggest fault in this video is that in the second half, after the introduction, the left-to-right FFT mapping is flawed, so most of the highest and lowest notes are cut off at the left and right edges of the visualization.)

From here on, he is just adding different and progressively more complicated decorations to this one basic trick. First he adds a very basic color mapping in which the height of the waveform maps directly to its color, from red (the lowest) to dark blue/indigo (the highest), following the classic ROYGBIV pattern (red, orange, yellow, green, blue, indigo, violet). Remember that height also corresponds to volume at that frequency. As far as I can tell, he uses this same color mapping throughout without any variation.
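A color mapping of this kind can be sketched with the standard `colorsys` module. The function name `height_to_rgb` and the choice to stop at hue 0.7 (a blue leaning toward indigo) are assumptions on my part; I don't know the exact ramp used in the video.

```python
import colorsys

def height_to_rgb(height, max_height=1.0, max_hue=0.7):
    """Map a height (volume) to an RGB triple, red at the bottom, indigo-ish at the top.

    max_hue=0.7 is an assumed end point on the HSV hue wheel, not a value
    taken from the video.
    """
    # Clamp to [0, 1] so out-of-range heights still get a valid color.
    t = min(max(height / max_height, 0.0), 1.0)
    # Sweeping the hue from 0 upward passes through red, orange, yellow,
    # green, and blue in the same order as ROYGBIV.
    return colorsys.hsv_to_rgb(t * max_hue, 1.0, 1.0)
```

The returned components are floats in the 0 to 1 range, which is the form OpenGL color calls generally expect.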

All of the subsequent decorations and variations appear to be different ways of playing with progressive time mapping. Initially, he just draws the waveforms at the front of the visual area and then progressively flows them away, so that he is effectively making a continuously running 3D surface graph, with frequency running left to right, volume (and color) running bottom to top, and time running front to back. This is what you have in your first picture.
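One way to get that "flowing away" effect is to keep a fixed-length history of spectra and push each new one in at the front. The sketch below is my own guess at a data structure for this, with a hypothetical `SpectrumHistory` class; it only produces (x, y, z) points and leaves the actual OpenGL drawing out.

```python
from collections import deque

class SpectrumHistory:
    """Fixed-depth history of spectra; the newest row sits at z = 0 (the front)."""

    def __init__(self, depth):
        # With maxlen set, appendleft() drops the oldest row off the far end.
        self.rows = deque(maxlen=depth)

    def push(self, bands):
        """Add the spectrum for the latest time slice at the front."""
        self.rows.appendleft(list(bands))

    def vertices(self):
        """Return (x, y, z) points: x = frequency band, y = volume, z = age."""
        verts = []
        for z, row in enumerate(self.rows):
            for x, amplitude in enumerate(row):
                verts.append((x, amplitude, z))
        return verts
```

Pushing one spectrum per rendered frame and redrawing `vertices()` as a surface would make each row appear to slide from front to back until it falls off the end of the history.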

Everything else is just a more and more sophisticated version of this, mapping time (and only time) in more complicated ways. For instance, in the circular one that you show second, I believe he is mapping time in a very fast radial sweep pattern around the obvious pole in the middle.
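If my guess about the radial sweep is right, the only change from the flat version is swapping the straight time axis for an angle. Here is a minimal sketch under that assumption; the function `radial_vertex`, the 64 frames per turn, and the choice to put frequency on the radius are all hypothetical.

```python
import math

def radial_vertex(band, amplitude, frame, num_bands,
                  frames_per_turn=64, max_radius=1.0):
    """Place one spectrum sample on a polar grid around a central pole.

    Assumed mapping: time (frame number) becomes the sweep angle,
    frequency band becomes the distance from the center, and volume
    stays as the height.
    """
    angle = 2.0 * math.pi * (frame % frames_per_turn) / frames_per_turn
    radius = max_radius * (band + 1) / num_bands
    return (radius * math.cos(angle), amplitude, radius * math.sin(angle))
```

Each new time slice is drawn along one spoke, and the spoke advances by a small angle per frame, so the picture sweeps around the center like a radar display. A smaller `frames_per_turn` gives a faster sweep.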
