Noise reduction and compression in streaming audio
Hope you can help. I am recording audio from a microphone and streaming it live across a network. The quality of the samples is 11025 Hz, 8-bit, mono. Although there is a small delay (1 second), it works great. What I need help with is that I am now trying to implement noise reduction and compression, to make the audio less noisy and use less bandwidth. The audio samples are stored in a C# byte[] array, which I am sending/receiving using a Socket.
Could anyone suggest how, in C#, to implement compression and noise reduction? I do not mind using a third party library as long as it is free (LGPL license, etc) and can be utilized from C#. However, I would prefer actual working source code examples. Thanks in advance for any suggestion you have.
UPDATE:
I changed the bit size from 8-bit audio to 16-bit audio and the noise problem is fixed. Apparently 8-bit audio from the mic had too low a signal-to-noise ratio. Voice sounds great at 11 kHz, 16-bit mono.
The requirements of this project have changed since I posted this, however. We are now trying to add video as well. I have a callback set up that receives live images every 100 ms from a webcam. I need to encode the audio and video, mux them, and transmit them over my socket to the server; the server re-transmits the stream to the other client, which demuxes and decodes the audio and video, displays the video in a picture box, and outputs the audio to the speaker.
I am looking at ffmpeg to help out with the (de|en)coding/[de]muxing, and I am also looking at SharpFFmpeg as a C# interop library to ffmpeg.
I cannot find any good examples of doing this. I have scoured the Internet all week, with no real luck. Any help you can provide is much appreciated!
Here's some code, including my callback function for the mic recording:
private const int AUDIO_FREQ = 11025;
private const int CHANNELS = 1;
private const int BITS = 16;
private const int BYTES_PER_SEC = AUDIO_FREQ * CHANNELS * (BITS / 8);
private const int BLOCKS_PER_SEC = 40;
private const int BUFFER_SECS = 1;
private const int BUF_SIZE = ((int)(BYTES_PER_SEC / BLOCKS_PER_SEC * BUFFER_SECS / 2)) * 2; // rounded to nearest EVEN number

private WaveLib.WaveOutPlayer m_Player;
private WaveLib.WaveInRecorder m_Recorder;
private WaveLib.FifoStream m_Fifo;
WebCam MyWebCam;

public void OnPickupHeadset()
{
    stopRingTone();
    m_Fifo = new WaveLib.FifoStream();

    WaveLib.WaveFormat fmt = new WaveLib.WaveFormat(AUDIO_FREQ, BITS, CHANNELS);
    m_Player = new WaveLib.WaveOutPlayer(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
        new WaveLib.BufferFillEventHandler(PlayerCB));
    m_Recorder = new WaveLib.WaveInRecorder(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
        new WaveLib.BufferDoneEventHandler(RecorderCB));

    MyWebCam = null;
    try
    {
        MyWebCam = new WebCam();
        MyWebCam.InitializeWebCam(ref pbMyPhoto, pbPhoto.Width, pbPhoto.Height);
        MyWebCam.Start();
    }
    catch { }
}

private byte[] m_PlayBuffer;
private void PlayerCB(IntPtr data, int size)
{
    try
    {
        if (m_PlayBuffer == null || m_PlayBuffer.Length != size)
            m_PlayBuffer = new byte[size];

        if (m_Fifo.Length >= size)
        {
            m_Fifo.Read(m_PlayBuffer, 0, size);
        }
        else
        {
            // Read what we can
            int fifoLength = (int)m_Fifo.Length;
            m_Fifo.Read(m_PlayBuffer, 0, fifoLength);

            // Zero out rest of buffer
            for (int i = fifoLength; i < m_PlayBuffer.Length; i++)
                m_PlayBuffer[i] = 0;
        }

        // Return the play buffer
        Marshal.Copy(m_PlayBuffer, 0, data, size);
    }
    catch { }
}

private byte[] m_RecBuffer;
private void RecorderCB(IntPtr data, int size)
{
    try
    {
        if (m_RecBuffer == null || m_RecBuffer.Length != size)
            m_RecBuffer = new byte[size];
        Marshal.Copy(data, m_RecBuffer, 0, size);

        // HERE'S WHERE I WOULD ENCODE THE AUDIO IF I KNEW HOW

        // Send data to server
        if (theForm.CallClient != null)
        {
            SocketAsyncEventArgs args = new SocketAsyncEventArgs();
            args.SetBuffer(m_RecBuffer, 0, m_RecBuffer.Length);
            theForm.CallClient.SendAsync(args);
        }
    }
    catch { }
}

// Called from network stack when data received from server (other client)
public void PlayBuffer(byte[] buffer, int length)
{
    try
    {
        // HERE'S WHERE I WOULD DECODE THE AUDIO IF I KNEW HOW
        m_Fifo.Write(buffer, 0, length);
    }
    catch { }
}
So where should I go from here?
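One zero-dependency option for the encode/decode placeholders in the code above is G.711 μ-law companding, the classic telephony voice codec: it halves the payload (each 16-bit sample becomes one byte) while keeping far better perceived quality than linear 8-bit PCM. A minimal sketch, assuming 16-bit little-endian mono PCM as in the recorder above; the helper names are mine, not from WaveLib:

```csharp
using System;

public static class MuLawCodec
{
    private const int BIAS = 0x84;   // 132, standard G.711 bias
    private const int CLIP = 32635;

    // Encode one 16-bit PCM sample into one mu-law byte.
    public static byte LinearToMuLaw(short pcm)
    {
        int sample = pcm;
        int sign = (sample >> 8) & 0x80;          // keep sign bit
        if (sign != 0) sample = -sample;
        if (sample > CLIP) sample = CLIP;         // also tames short.MinValue
        sample += BIAS;

        // Find the segment (position of the highest set bit).
        int exponent = 7;
        for (int mask = 0x4000; (sample & mask) == 0 && exponent > 0; mask >>= 1)
            exponent--;

        int mantissa = (sample >> (exponent + 3)) & 0x0F;
        return (byte)~(sign | (exponent << 4) | mantissa);
    }

    // Decode one mu-law byte back to a 16-bit PCM sample.
    public static short MuLawToLinear(byte mulaw)
    {
        int b = (byte)~mulaw;
        int sign = b & 0x80;
        int exponent = (b >> 4) & 0x07;
        int mantissa = b & 0x0F;
        int sample = (((mantissa << 3) + BIAS) << exponent) - BIAS;
        return (short)(sign != 0 ? -sample : sample);
    }

    // Encode a buffer of 16-bit little-endian PCM into half-size mu-law bytes.
    public static byte[] EncodeBuffer(byte[] pcm, int length)
    {
        byte[] encoded = new byte[length / 2];
        for (int i = 0; i < encoded.Length; i++)
            encoded[i] = LinearToMuLaw(BitConverter.ToInt16(pcm, i * 2));
        return encoded;
    }
}
```

EncodeBuffer would slot into RecorderCB before SendAsync, and the inverse loop into PlayBuffer before writing to the FIFO. This is only a 2:1 reduction; a real speech codec (Speex, for instance) compresses much further, but this needs no third-party code at all.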
Your goals here are kind of mutually exclusive. The reason your 11025Hz/8bit/Mono WAV files sound noisy (with a tremendous amount of "hiss") is because of their low sample rate and bit resolution (44100Hz/16bit/Stereo is the standard for CD-quality audio).
If you continue recording and streaming at that rate, you are going to have noisy audio - period. The only way to eliminate (or actually just attenuate) this noise would be to up-sample the audio to 44100Hz/16bit and then perform a noise reduction algorithm of some sort on it. This upsampling would have to be performed by the client application, since doing it on the server before streaming means you'd then be streaming audio 8X larger than your original (doing it on the server would also be utterly pointless, since you'd be better off just recording in the denser format in the first place).
What you want to do is to record your original audio in a CD-quality format and then compress it to a standard format like MP3 or Ogg Vorbis. See this earlier question:
What's the best audio compression library for .NET?
Update: I haven't used this, but:
http://www.ohloh.net/p/OggVorbisDecoder
I think you need an encoder, but I couldn't find one for Ogg Vorbis. I think you could try encoding to the WMV format, as well:
http://www.discussweb.com/c-programming/1728-encoding-wmv-file-c-net.html
Update 2: Sorry, my knowledge level of streaming is pretty low. If I were doing something like what you're doing, I would first create an (uncompressed) AVI file from the audio and the still images (using avifil32.dll methods via PInvoke), then compress it to MPEG (or whatever format is standard - YouTube has a page where they talk about their preferred formats, and it's probably good to use one of those).

I'm not sure if this will do what you need, but this link:
http://csharpmagics.blogspot.com/
using this free player:
http://www.videolan.org/
might work.
If you only want to compress the data to limit bandwidth usage you can try using a GZipStream.
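A sketch of what that could look like for the byte[] buffers already in the question (note that GZip gains relatively little on raw PCM, which looks noise-like to a general-purpose compressor; a voice codec does far better):

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    // Compress the first `length` bytes of a buffer with GZip.
    public static byte[] Compress(byte[] data, int length)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(data, 0, length);
            return output.ToArray();
        }
    }

    // Inflate a GZip-compressed buffer back to the original bytes.
    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);   // Stream.CopyTo needs .NET 4.0+; loop Read/Write on older frameworks
            return output.ToArray();
        }
    }
}
```

Compress would go in RecorderCB before SendAsync, and Decompress in PlayBuffer; each network packet must then carry one complete compressed block.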