Creating a DSP system from scratch
I love electronic music and I am interested in how it all ticks.
I've found lots of helpful questions on Stack Overflow about libraries that can be used to play audio, apply filters, etc. But what I am really curious about is what is actually happening: how is the data passed between effects and oscillators? I have done research into the mathematical side of DSP and I've got that end of the problem sussed, but I am unsure what buffering system to use, etc. The final goal is to have a simple object hierarchy of effects and oscillators that pass the data between each other (maybe using multithreading, if I don't end up pulling out all my hair trying to implement it). It's not going to be the next Propellerhead Reason, but I am interested in how it all works, and this is more of an exercise than something that will yield an end product.
At the moment I use .NET and C#, and I have recently learnt F# (which may or may not lead to some interesting ways of handling the data), but if these are not suitable for the job I can learn another system if necessary.
The question is: what is the best way to get the large amounts of signal data through the program using buffers? For instance, would I be better off using a queue, array, linked list, etc.? Should I make the samples immutable and create a new set of data each time I apply an effect to the system, or just edit the values in the buffer? Should I have a dispatcher/thread-pool-style object that organises passing the data, or should the effect functions pass data directly between each other?
Thanks.
EDIT: another related question is how I would then use the Windows API to play this array. I don't really want to use DirectShow because Microsoft has pretty much left it to die now.
EDIT2: thanks for all the answers. After looking at all the technologies, I will use either XNA 4 (I spent a while trawling the internet and found this site, which explains how to do it) or NAudio to output the music... not sure which one yet; it depends on how advanced the system ends up being. When C# 5.0 comes out, I will use its async capabilities to create an effects architecture on top of that. I've pretty much used everybody's answer equally, so now I have a conundrum of who to give the bounty to...
Have you looked at VST.NET (http://vstnet.codeplex.com/)? It's a library for writing VSTs in C#, and it comes with some examples. You could also consider writing a VST, so that your code can be used from any host application (but even if you don't want to, looking at their code can be useful).
Signal data is usually big and requires a lot of processing. Do not use a linked list! Most libraries I know simply use an array to hold all the audio data (after all, that's what the sound card expects).
In the VST.NET samples, for instance, the audioChannel object the plugin processes is a wrapper around an unmanaged float* buffer.
You would probably store your samples in an immutable array. Then, when you want to play them, you copy the data into the output buffer (changing the frequency if you want) and apply the effects in this buffer. Note that you can use several output buffers (or channels) and sum them at the end.
Edit
I know two low-level ways to play your array: DirectSound and WaveOut from the Windows API. C# example using DirectSound. C# example with WaveOut. However, you might prefer to use an external, higher-level library like NAudio. NAudio is convenient for .NET audio manipulation; see this blog post for sending a sine wave to the audio card. You can see they are also using an array of floats, which is what I recommend (if you do your computations on bytes, you'll end up with a lot of aliasing in the sound).
F# is probably a good choice here, as it's well fitted to manipulate functions. Functions are probably good building blocks for signal creation and processing.
F# is also good at manipulating collections in general, and arrays in particular, thanks to the higher-order functions in the Array module.
These qualities make F# popular in the finance sector, and I would guess they are also useful for signal processing.
Visual F# 2010 for Technical Computing has a section dedicated to Fourier Transform, which could be relevant to what you want to do. I guess there is plenty of free information about the transform on the net, though.
Finally, to play samples, you can use XNA. I think the latest version of the API (4.0) also allows recording, but I have never used that. There is a famous music editing app for the Xbox called ezmuse+ Hamst3r Edition that uses XNA, so it's definitely possible.
With respect to buffering and asynchrony/threading/synchronization issues, I suggest you take a look at the new TPL Dataflow library. With its block primitives, concurrent data structures, dataflow networks, async message processing, and TPL's Task-based abstraction (which can be used with the async/await C# 5 features), it's a very good fit for this type of application.
I don't know if this is really what you're looking for, but this was one of my personal projects while in college. I didn't truly understand how sound and DSP worked until I implemented it myself. I was trying to get as close to the speaker as possible, so I did it using only libsndfile, to handle the file format intricacies for me.
Basically, my first project was to create a large array of doubles, fill it with a sine wave, then use sf_writef_double() to write that array to a file to create something that I could play, and see the result in a waveform editor.
Next, I added another function between the sine call and the write call, to add an effect.
This way you start playing with very low-level oscillators and effects, and you can see the results immediately. Plus, it's very little code to get something like this working.
Personally, I would start with the simplest possible solution and then slowly add to it. Try just writing out to a file and using your audio player to play it, so you don't have to deal with the audio APIs. Just use a single array to start, and modify it in place. Definitely start off single-threaded. As your project grows, you can start moving to other solutions, like pipes instead of the array, multi-threading it, or working with an audio API.
If you want to create a project you can ship, then depending on exactly what it is, you'll probably have to move to more complex libraries, like some real-time audio processing framework. But the basics you learn by doing it the simple way above will definitely help when you get to that point.
Good luck!
I've done quite a bit of real-time DSP, although not with audio. While either of your ideas (immutable buffer) vs (mutable buffer modified in place) could work, what I prefer to do is create a single permanent buffer for each link in the signal path. Most effects don't lend themselves well to modification in place, since each input sample affects multiple output samples. The buffer-for-each-link technique works especially well when you have resampling stages.
Here, when samples arrive, the first buffer is overwritten. Then the first filter reads the new data from its input buffer (the first buffer) and writes to its output (the second buffer). Then it invokes the second stage to read from the second buffer and write into the third.
This pattern completely eliminates dynamic allocation, allows each stage to keep a variable amount of history (since effects need some memory), and is very flexible as far as enabling rearranging the filters in the path.
Alright, I'll have a stab at the bounty as well then :)
I'm actually in a very similar situation. I've been making electronic music for ages, but only over the past couple of years I've started exploring actual audio processing.
You mention that you have researched the maths. I think that's crucial. I'm currently fighting my way through Ken Steiglitz's A Digital Signal Processing Primer - With Applications to Digital Audio and Computer Music. If you don't know your complex numbers and phasors, it's going to be very difficult.
I'm a Linux guy, so I've started writing LADSPA plugins in C. I think it's good to start at that basic level, to really understand what's going on. If I were on Windows, I'd download the VST SDK from Steinberg and write a quick proof-of-concept plugin that just adds noise or whatever.
Another benefit of choosing a framework like VST or LADSPA is that you can immediately use your plugins in your normal audio suite. The satisfaction of applying your first home-built plugin to an audio track is unbeatable. Plus, you will be able to share your plugins with other musicians.
There are probably ways to do this in C#/F#, but I would recommend C++ if you plan to write VST plugins, just to avoid any unnecessary overhead. That seems to be the industry standard.
In terms of buffering, I've been using circular buffers (there's a good article here: http://www.dspguide.com/ch28/2.htm). A good exercise is to implement a finite impulse response filter (what Steiglitz refers to as a feedforward filter); these rely on buffering and are quite fun to play around with.
I've got a repo on Github with a few very basic LADSPA plugins. The architectural difference aside, they could potentially be useful for someone writing VST plugins as well. https://github.com/andreasjansson/my_ladspa_plugins
Another good source of example code is the CSound project. There's tonnes of DSP code in there, and the software is aimed primarily at musicians.
Start by reading this and this.
This will give you an idea of WHAT you have to do.
Then, learn the DirectShow architecture; learn HOW not to do it, and try to create your own simplified version of it.
You could have a look at BYOND. It is an environment for programmatic audio/MIDI instrument and effect creation in C#. It is available standalone and as a VST instrument and effect.
FULL DISCLOSURE: I am the developer of BYOND.