Does anyone have advice on programming real-time audio synthesis?

Posted 2024-10-02 09:15:43

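I'm currently working on a personal project: a library for real-time audio synthesis in Flash. In short: tools for connecting wave generators, filters, mixers and so on to one another and feeding raw (real-time) data to the sound card. Something like Max/MSP or Reaktor.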

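I already have some things working, but I'm wondering whether the basic setup I wrote is right. I don't want to run into problems later that force me to change the core of my app (although that can always happen).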

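Basically, what I do now is start at the end of the chain, the place where the (raw) sound data goes "out" (to the sound card). To do that I have to write chunks of bytes (ByteArrays) to an object, and to get such a chunk I ask whichever module is connected to my "Sound Out" module for its chunk. That module makes the same request of the module connected to its input, and this repeats until the start of the chain is reached.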

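Is this the right approach? I can imagine running into problems if there is a feedback loop, or if there is another module with no output: if I were to connect a spectrum analyzer somewhere, it would be a dead end in the chain (a module with inputs only and no outputs). In my current setup such a module wouldn't work, because calculation only starts from the sound output module.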

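Does anyone have experience programming something like this? I'd be very interested in some thoughts about the right approach. (To be clear: I'm not looking for a Flash-specific implementation, which is why I didn't tag this question under flash or actionscript.)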


一杯敬自由 2024-10-09 09:15:43

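I did something similar a while back and used the same approach as you: start at the virtual line out and trace the signal back to the top. I did this per sample rather than per buffer, though; if I were writing the same application today I would probably go per buffer, because I suspect it would perform better.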

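The spectrum analyzer was designed as an insert module: it only worked when both its input and its output were connected, and it passed its input through to the output unchanged.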
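A minimal sketch of such an insert module in a per-buffer pull chain, in Python rather than ActionScript. The `Sine` and `PeakMeter` classes and the `pull(n)` method are hypothetical names chosen for illustration, and a peak meter stands in for a real spectrum analysis:

```python
import math

class Sine:
    """Hypothetical source module: produces n float samples per pull."""
    def __init__(self, freq, sample_rate=44100):
        self.step = 2.0 * math.pi * freq / sample_rate
        self.phase = 0.0

    def pull(self, n):
        buf = [math.sin(self.phase + i * self.step) for i in range(n)]
        self.phase = (self.phase + n * self.step) % (2.0 * math.pi)
        return buf

class PeakMeter:
    """Insert module: measures the block, then passes it on unchanged."""
    def __init__(self, source):
        self.source = source
        self.peak = 0.0

    def pull(self, n):
        buf = self.source.pull(n)          # request the block from upstream
        self.peak = max((abs(s) for s in buf), default=0.0)
        return buf                         # same data goes downstream

# The output stage pulls through the meter, so the meter is never a dead end.
meter = PeakMeter(Sine(441.0))
block = meter.pull(256)
```

Because the meter sits in the signal path, it is evaluated whenever the output is, which sidesteps the "module with no output" problem at the cost of requiring both ends to be connected.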

To handle feedback I had a special helper module that introduced a one-sample delay and fetched its input only once per cycle.
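Roughly how such a helper could look, as a per-sample sketch in Python. The class names and the `pull(tick)` signature are assumptions for illustration, not my original code, and the source is deliberately a pure function of the tick so that pulling it twice in one cycle is harmless:

```python
class Impulse:
    """Hypothetical source: 1.0 on tick 0, then 0.0."""
    def pull(self, tick):
        return 1.0 if tick == 0 else 0.0

class UnitDelay:
    """Breaks a feedback loop by returning last cycle's input sample."""
    def __init__(self):
        self.input = None       # connected after the loop is built
        self.stored = 0.0       # sample captured on the previous cycle
        self.out = 0.0          # sample handed out during this cycle
        self.last_tick = None

    def pull(self, tick):
        if tick != self.last_tick:
            self.last_tick = tick                 # mark first, so re-entry stops here
            self.out = self.stored                # expose the delayed sample
            self.stored = self.input.pull(tick)   # fetch input once per cycle
        return self.out

class FeedbackAdd:
    """Computes source[n] + gain * feedback[n]."""
    def __init__(self, source, feedback, gain):
        self.source, self.feedback, self.gain = source, feedback, gain

    def pull(self, tick):
        return self.source.pull(tick) + self.gain * self.feedback.pull(tick)

# y[n] = x[n] + 0.5 * y[n-1]
delay = UnitDelay()
adder = FeedbackAdd(Impulse(), delay, 0.5)
delay.input = adder
response = [adder.pull(t) for t in range(4)]   # [1.0, 0.5, 0.25, 0.125]
```

When the delay pulls its input, the request travels round the loop and arrives back at the delay, which sees the current tick and simply returns the delayed sample instead of recursing.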

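Also, I think doing all internal processing in floats, and therefore using float arrays as buffers, would be much easier than byte arrays, and it would save you the extra work of constantly converting between integers and floats.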
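As a sketch of what that buys you: modules exchange plain float lists, and a single conversion happens at the very end of the chain. The helper name `floats_to_pcm16` and the 16-bit little-endian target format are assumptions for illustration; the format your output object actually expects may differ.

```python
import struct

def floats_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))           # clip out-of-range values
        ints.append(int(round(s * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)

# Internal processing stays in floats: mixing is just addition.
a = [0.25, 0.5, -0.5, 0.75]
b = [0.25, 0.5, -0.5, 0.75]
mixed = [x + y for x, y in zip(a, b)]        # [0.5, 1.0, -1.0, 1.5]

# Only the output stage touches bytes.
raw = floats_to_pcm16(mixed)
```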


失而复得 2024-10-09 09:15:43

In later versions you may have different packet rates in different parts of your network.

One example would be if you extend it to transfer data to or from disk. Another example would be that low data rate control variables such as one controlling echo-delay may, later, become a part of your network. You probably don't want to process control variables with the same frequency that you process audio packets, but they are still 'real time' and part of the function network. They may for example need smoothing to avoid sudden transitions.
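One common way to do that smoothing is a one-pole filter on the control value. This is only a sketch in Python; the class name `SmoothedControl` and the coefficient value are made up for illustration:

```python
class SmoothedControl:
    """One-pole smoother: each audio sample moves a fixed fraction
    of the remaining distance towards the target value."""
    def __init__(self, value, coeff=0.01):
        self.value = value
        self.target = value
        self.coeff = coeff

    def set(self, target):
        """Called at the low control rate, e.g. when a knob moves."""
        self.target = target

    def next(self):
        """Called once per audio sample by whatever uses the control."""
        self.value += self.coeff * (self.target - self.value)
        return self.value

echo_delay = SmoothedControl(0.0, coeff=0.5)
echo_delay.set(1.0)
ramp = [echo_delay.next() for _ in range(3)]   # [0.5, 0.75, 0.875]
```

The control is written rarely but read at audio rate, so the two rates meet in this one small object rather than in the audio functions.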

As long as you are calling all your functions at the same rate, and all the functions are essentially taking constant-time, your pull-the-data approach will work fine. There will be little to choose between pulling data and pushing. Pulling is somewhat more natural for playing audio, pushing is somewhat more natural for recording, but either works and ends up making the same calls to the underlying audio processing functions.

For the spectrometer you've got the issue of multiple sinks for data, but it is not a problem. Introduce a dummy link to it from the real sink. The dummy link can cause a request for data that is not honoured. As long as the dummy link knows it is a dummy and does not care about the lack of data, everything will be OK. This is a standard technique for reducing multiple sinks or sources to a single one.

With this kind of network you do not want to do the same calculation twice in one complete update. For example if you mix a high-passed and low-passed version of a signal you do not want to evaluate the original signal twice. You must do something like record a timer tick value with each buffer, and stop propagation of pulls when you see the current tick value is already present. This same mechanism will also protect you against feedback loops in evaluation.
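A sketch of the tick idea in Python. All class names here are hypothetical, and the two `Gain` nodes merely stand in for the high-pass and low-pass filters:

```python
class Node:
    """Base module: caches its output buffer together with a tick value."""
    def __init__(self, *inputs):
        self.inputs = inputs
        self.last_tick = None
        self.buffer = None
        self.evaluations = 0

    def pull(self, tick, n):
        if self.last_tick != tick:
            # Mark before pulling inputs: a node re-entered through a loop
            # then returns its previous buffer instead of recursing
            # (in this sketch that is None on the very first tick).
            self.last_tick = tick
            ins = [node.pull(tick, n) for node in self.inputs]
            self.buffer = self.process(ins, n)
            self.evaluations += 1
        return self.buffer

class Source(Node):
    def process(self, ins, n):
        return [1.0] * n

class Gain(Node):
    def __init__(self, source, gain):
        super().__init__(source)
        self.gain = gain

    def process(self, ins, n):
        return [self.gain * s for s in ins[0]]

class Mix(Node):
    def process(self, ins, n):
        return [sum(frame) for frame in zip(*ins)]

src = Source()
out = Mix(Gain(src, 0.25), Gain(src, 0.5))   # two branches share one source
block = out.pull(tick=0, n=4)                # src is evaluated once, not twice
```

The second branch's pull reaches `src`, finds the current tick already recorded, and gets the cached buffer back.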

So, those two issues of concern to you are easily addressed within your current framework.

Rate matching where there are different packet rates in different parts of the network is where the problems with the current approach will start. If you are writing audio to disk then for efficiency you'll want to write large chunks infrequently. You don't want to be blocking your servicing of the more frequent small audio input and output processing packets during those writes. A single rate pulling or pushing strategy on its own won't be enough.

Just accept that at some point you may need a more sophisticated way of updating than a single rate network. When that happens you'll need threads for the different rates that are running, or you'll write your own simple scheduler, possibly as simple as calling less frequently evaluated functions one time in n, to make the rates match. You don't need to plan ahead for this. Your audio functions are almost certainly already delegating responsibility for ensuring their input buffers are ready to other functions, and it will only be those other functions that need to change, not the audio functions themselves.
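The "one time in n" scheduler can be as small as the following Python sketch. The `Scheduler` class and the disk-writing stand-in are invented for illustration; note that this only matches rates and does nothing to stop a slow write from blocking the audio callback, which is where threads would come in.

```python
class Scheduler:
    """Calls each registered function once every `every_n` updates."""
    def __init__(self):
        self.tasks = []      # list of (every_n, function)
        self.calls = 0

    def add(self, every_n, fn):
        self.tasks.append((every_n, fn))

    def run_once(self):
        self.calls += 1
        for every_n, fn in self.tasks:
            if self.calls % every_n == 0:
                fn()

pending = []     # small audio blocks waiting to be written
written = []     # sizes of the large chunks "written to disk"

def process_audio():
    pending.append([0.0] * 64)       # stand-in for one audio packet

def flush_to_disk():
    written.append(len(pending))     # stand-in for one large write
    pending.clear()

sched = Scheduler()
sched.add(1, process_audio)          # every update
sched.add(4, flush_to_disk)          # one update in four
for _ in range(8):
    sched.run_once()
```

After eight updates the audio function has run eight times and the writer twice, each time with four packets to write.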

The one thing I would advise at this stage is to be careful to centralise audio buffer allocation, noticing that buffers are like fenceposts. They don't belong to an audio function, they lie between the audio functions. Centralising the buffer allocation will make it easy to retrospectively modify the update strategy for different rates in different parts of the network.
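To make the fencepost picture concrete, here is a small Python sketch in which a hypothetical `BufferPool` owns every buffer, keyed by the connection it sits on, and the audio functions only receive buffers as arguments:

```python
class BufferPool:
    """Single owner of all buffers; one buffer per connection (edge)."""
    def __init__(self, block_size):
        self.block_size = block_size
        self.buffers = {}

    def get(self, edge):
        if edge not in self.buffers:
            self.buffers[edge] = [0.0] * self.block_size
        return self.buffers[edge]

# Audio functions never allocate; they read and write what they are handed.
def constant(out, value):
    for i in range(len(out)):
        out[i] = value

def gain(inp, out, amount):
    for i in range(len(out)):
        out[i] = inp[i] * amount

pool = BufferPool(block_size=4)
constant(pool.get("osc->gain"), 0.5)
gain(pool.get("osc->gain"), pool.get("gain->out"), 0.5)
result = pool.get("gain->out")       # [0.25, 0.25, 0.25, 0.25]
```

If part of the network later needs a different block size or rate, only the pool and whatever drives the calls have to change; `constant` and `gain` stay as they are.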
