The correct approach to drawing a zoomable audio waveform
I am trying to implement a smooth, zoomable audio waveform but am puzzled about the correct approach to implementing zoom. I searched the internet, but there is very little information.
So here is what I have done:
Read the audio samples from the file and compute waveform points with samplesPerPixel = 10, 20, 40, 80, ..., 10240. Store the data points for each scale (11 in total here). The max and min are also stored along with the points for each samplesPerPixel.
When zooming, switch to the closest dataset. So if samplesPerPixel at the current width is 70, use the dataset corresponding to samplesPerPixel = 80. The correct dataset index is easily found using log2(samplesPerPixel).
Use subsampling of the dataset to draw the waveform points. So if samplesPerPixel = 41 and we are using the dataset for zoom 80, then we use the scaling factor 80/41 to subsample.
let scaleFactor = 80.0/41.0
let x = waveformPointX[Int(Double(i) * scaleFactor)]
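One possible reading of the dataset selection and subsampling steps above, sketched in Python. The names `BASE_SPP`, `NUM_LEVELS`, `level_index` and `source_index` are hypothetical, and the sketch assumes the stored ladder is exactly 10 * 2^k and that pixel i starts at sample i * samplesPerPixel. Under those assumptions the level index is ceil(log2(samplesPerPixel / 10)), and the index ratio for the example works out to 41/80.

```python
import math

BASE_SPP = 10     # finest stored level (assumed from the ladder 10, 20, ..., 10240)
NUM_LEVELS = 11   # number of stored levels

def level_index(samples_per_pixel):
    """Index of the smallest stored level with samplesPerPixel >= the request."""
    if samples_per_pixel <= BASE_SPP:
        return 0
    k = math.ceil(math.log2(samples_per_pixel / BASE_SPP))
    return min(k, NUM_LEVELS - 1)

def source_index(pixel, samples_per_pixel, level):
    """Index into the chosen level's array for a given screen pixel."""
    level_spp = BASE_SPP << level  # samples covered by one stored point
    # The pixel starts at sample pixel * samples_per_pixel.
    return (pixel * samples_per_pixel) // level_spp
```

For example, `level_index(70)` selects the samplesPerPixel = 80 dataset, and `source_index(100, 41, 3)` maps screen pixel 100 to the stored point that covers sample 4100.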
I have yet to find a better approach and am not sure whether the subsampling above is correct, but this approach certainly consumes a lot of memory and is slow to load the data at the start. How do audio editors implement zooming in a waveform? Is there an efficient approach?
EDIT: Here is the code for computing the mipmaps.
import Foundation

public class WaveformAudioSample {
    var samplesPerPixel: Int = 0
    var totalSamples: Int = 0
    var samples: [CGFloat] = []
    var sampleMax: CGFloat = 0
}

// Builds a coarser level by averaging each run of `factor` consecutive points.
private func downSample(_ waveformSample: WaveformAudioSample, factor: Int) -> WaveformAudioSample {
    NSLog("Averaging samples")
    let downSampledAudioSamples = WaveformAudioSample()
    let count = waveformSample.samples.count / factor
    downSampledAudioSamples.samples = [CGFloat](repeating: 0, count: count)
    downSampledAudioSamples.samplesPerPixel = waveformSample.samplesPerPixel * factor
    downSampledAudioSamples.totalSamples = waveformSample.totalSamples
    for i in 0..<count {
        var total: CGFloat = 0
        for j in 0..<factor {
            total += waveformSample.samples[i * factor + j]
        }
        downSampledAudioSamples.samples[i] = total / CGFloat(factor)
    }
    NSLog("Averaged samples")
    // Return the new level so the caller can store it.
    return downSampledAudioSamples
}
Comments (2)
You should use a power-of-2 size for your data
This lets you use cheap bit shifts and simple resizing, without any costly floating-point operations or integer multiplication and division.
You should build half-resolution mipmaps from the previous mipmap
Each sample is then always created from 2 samples of the previous mipmap, so there are no nested for loops or costly index computations.
Do not mix floating-point and integer computations if you can avoid it
Even if you have an FPU, conversion between int and float is usually very slow. Ideally, keep your audio data in integer format...
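The three recommendations above can be sketched roughly as follows. This is a Python illustration, not the answer's C++/VCL source: the function is only named after the answer's `mipmap_compute`, `pad_to_pow2` is a hypothetical helper, and padding with zeros is an assumed policy that affects the min/max values at the tail.

```python
def pad_to_pow2(samples, fill=0):
    """Pad with `fill` so the length becomes a power of two (assumed policy)."""
    size = 1 << (len(samples) - 1).bit_length() if samples else 1
    return list(samples) + [fill] * (size - len(samples))

def mipmap_compute(samples):
    """Return (mins, maxs), each a list of levels; level 0 is the input.

    Assumes integer samples and a power-of-two length. Every level is
    built from the previous one, two entries at a time.
    """
    mins = [list(samples)]
    maxs = [list(samples)]
    size = len(samples) >> 1          # halve with a shift, no division
    while size >= 1:
        prev_min, prev_max = mins[-1], maxs[-1]
        lo = [0] * size
        hi = [0] * size
        for i in range(size):         # single loop per level
            j = i << 1
            a, b = prev_min[j], prev_min[j + 1]
            lo[i] = a if a < b else b
            a, b = prev_max[j], prev_max[j + 1]
            hi[i] = a if a > b else b
        mins.append(lo)
        maxs.append(hi)
        size >>= 1
    return mins, maxs
```

Keeping separate min and max pyramids preserves peaks that averaging would smooth away, which is why each level stores both extremes rather than a mean.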
Here is a small C++/VCL example of these ideas:
[code listing not preserved]
Ignore the VCL window and rendering-related parts (I just wanted to pass the whole source so you can see how it is used). The only important part is the function mipmap_compute, which converts your input data into 2 mipmaps: one holds the min values and the other the max values. The dynamic allocations are not important. The only important code chunk is marked with a comment; in it, each mipmap needs only a single for loop without any expensive operations. If your platform does better with branchless code, you can compute the min and max using the built-in branchless min, max functions. Something like:
[code listing not preserved]
This can be further optimized simply by using pointers to the currently selected mipmaps, which gets rid of the [k] and [k-1] indexes and saves one memory access per element access.
Now all you need is to bilinearly interpolate between 2 mipmaps to reach your resolution. Here is a small example of this:
[code listing not preserved]
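As a rough illustration of interpolating between 2 mipmaps, the Python sketch below picks the two levels whose sizes bracket the target size `n`, samples each with linear interpolation along the position, and blends the two results by how close `n` is to each level's size. The names `sample_linear` and `resample` are hypothetical, and the sketch uses floating point for clarity rather than the integer-only arithmetic the answer recommends.

```python
def sample_linear(level, t):
    """Linearly interpolate `level` at fractional position t in [0, 1]."""
    pos = t * (len(level) - 1)
    i = int(pos)
    if i >= len(level) - 1:
        return float(level[-1])
    f = pos - i
    return level[i] * (1.0 - f) + level[i + 1] * f

def resample(levels, n):
    """Resize to n points by blending the two levels that bracket n.

    levels[0] is the full-resolution level and each following level is
    half as long (as in a mipmap pyramid).
    """
    if not 1 <= n <= len(levels[0]):
        raise ValueError("n must be between 1 and the highest mipmap resolution")
    k = 0
    while k + 1 < len(levels) and len(levels[k + 1]) >= n:
        k += 1
    fine = levels[k]
    coarse = levels[k + 1] if k + 1 < len(levels) else fine
    if len(fine) == len(coarse):
        w = 1.0
    else:
        # Weight of the fine level: 1 at its own size, 0 at the coarse size.
        w = (n - len(coarse)) / (len(fine) - len(coarse))
    out = []
    for x in range(n):
        t = x / (n - 1) if n > 1 else 0.0
        out.append(w * sample_linear(fine, t) + (1.0 - w) * sample_linear(coarse, t))
    return out
```

To draw a waveform, the same call would be made once on the min pyramid and once on the max pyramid, and a vertical line drawn between the two results for each of the `n` columns.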
Beware: the target size n must be less than or equal to the highest mipmap resolution...
This is how it looks (when I change the resolution manually with the mouse wheel):
[animated GIF not preserved]
The choppiness is caused by the GIF grabber... in reality the scaling is fast and seamless.
I had a similar problem, with 1,800,000 points of a waveform to draw on an 800-point screen. The zoom factor was 2000. If someone is interested, this is how I got awesome results:
Results:
from 453932 points to 800 points
Python code:
[code listing not preserved]
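Since the answer's own listing is unavailable, the sketch below shows one common way to reduce a long waveform to one value pair per screen column: taking the min and max of each column's slice of samples. This is an assumption about the general technique, not a reconstruction of the answer's method, and `minmax_decimate` is a hypothetical name.

```python
def minmax_decimate(samples, width):
    """Reduce `samples` to at most `width` (min, max) pairs, one per column."""
    n = len(samples)
    if width <= 0 or n == 0:
        return []
    width = min(width, n)  # never more columns than samples
    out = []
    for col in range(width):
        # Integer bucket boundaries; each bucket is non-empty because width <= n.
        start = col * n // width
        end = (col + 1) * n // width
        chunk = samples[start:end]
        out.append((min(chunk), max(chunk)))
    return out
```

Drawing a vertical line from the min to the max in each column keeps short peaks visible even when thousands of samples share one pixel.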