Echo cancellation in Java

Posted on 2024-09-13 10:45:41

I'm implementing a VOIP application that uses pure Java. There is an echo problem that occurs when users do not use headsets (mostly on laptops with built-in microphones).

What currently happens

The nuts and bolts of the VOIP application are just the plain datalines of Java's media framework. Essentially, I'd like to perform some digital signal processing on audio data before I write it to the speaker for output.

  public synchronized void addAudioData(byte[] ayAudioData)
  {
    m_oBuffer.enqueue(ayAudioData);
    this.notify();
  }

As you can see, the audio data arrives and is enqueued in a buffer. This is to cater for dodgy connections and to allow for different packet sizes. It also means I have access to as much audio data as I need for any fancy DSP operations before I play the audio data to the speaker line.

I've managed to build one echo canceller that does work; however, it requires a lot of interactive user input, and I'd like to have an automatic echo canceller.

Manual echo canceller

  public static byte[] removeEcho(int iDelaySamples, float fDecay, byte[] aySamples)
  {
    m_awDelayBuffer = new short[iDelaySamples];
    m_aySamples = new byte[aySamples.length];
    m_fDecay = (float) fDecay;
    System.out.println("Removing echo");
    m_iDelayIndex = 0;

    System.out.println("Sample length:\t" + aySamples.length);
    for (int i = 0; i < aySamples.length; i += 2)
    {
      // update the sample
      short wOldSample = getSample(aySamples, i);

      // remove the echo
      short wNewSample = (short) (wOldSample - fDecay * m_awDelayBuffer[m_iDelayIndex]);
      setSample(m_aySamples, i, wNewSample);

      // update the delay buffer
      m_awDelayBuffer[m_iDelayIndex] = wNewSample;
      m_iDelayIndex++;

      if (m_iDelayIndex == m_awDelayBuffer.length)
      {
        m_iDelayIndex = 0;
      }
    }

    return m_aySamples;
  }
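
The getSample and setSample helpers are not shown above. A minimal sketch of what they might look like, assuming 16-bit signed little-endian PCM (low byte first); the class name PcmSamples is made up for illustration and is not part of the original code:

```java
// Hypothetical helpers: the question does not show getSample/setSample.
// Assumption: 16-bit signed PCM, little-endian (low byte stored first).
final class PcmSamples {
    private PcmSamples() {}

    /** Reads the 16-bit sample whose low byte sits at byteIndex. */
    static short getSample(byte[] buffer, int byteIndex) {
        int low = buffer[byteIndex] & 0xFF;   // mask to stop sign extension of the low byte
        int high = buffer[byteIndex + 1];     // high byte keeps its sign
        return (short) ((high << 8) | low);
    }

    /** Writes a 16-bit sample as two bytes starting at byteIndex. */
    static void setSample(byte[] buffer, int byteIndex, short sample) {
        buffer[byteIndex] = (byte) (sample & 0xFF);
        buffer[byteIndex + 1] = (byte) ((sample >> 8) & 0xFF);
    }
}
```

If the line's AudioFormat reports isBigEndian() as true, the two byte positions would need to be swapped.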

Adaptive filters

I've read that adaptive filters are the way to go. Specifically, a Least Mean Squares filter. However, I'm stuck. Most sample code for the above is in C and C++, and it doesn't translate well into Java.

Does anyone have advice on how to implement them in Java? Any other ideas would also be greatly appreciated. Thanks in advance.

Comments (4)

上课铃就是安魂曲 2024-09-20 10:45:42

It's been ages! Hope this is even the right class, but there you go:

/**
 * This filter performs a pre-whitening Normalised Least Means Square on an
 * array of bytes. This does the actual echo cancelling.
 * 
 * Echo cancellation occurs with the following formula:
 * 
 * e = d - X' * W
 * 
 * e represents the echo-free signal. d represents the actual microphone signal
 * with the echo. X' is the transpose of the loudspeaker signal. W is an array
 * of adaptive weights.
 * 
 */
public class cNormalisedLeastMeansSquareFilter
  implements IFilter
{
  private byte[] m_ayEchoFreeSignal;// e
  private byte[] m_ayEchoSignal;// d
  private byte[] m_ayTransposeOfSpeakerSignal;// X'
  private double[] m_adWeights;// W

  /**
   * The transpose and the weights need to be updated before applying the filter
   * to an echo signal again.
   * 
   * @param ayEchoSignal
   * @param ayTransposeOfSpeakerSignal
   * @param adWeights
   */
  public cNormalisedLeastMeansSquareFilter(byte[] ayEchoSignal, byte[] ayTransposeOfSpeakerSignal, double[] adWeights)
  {
    m_ayEchoSignal = ayEchoSignal;
    m_ayTransposeOfSpeakerSignal = ayTransposeOfSpeakerSignal;
    m_adWeights = adWeights;
  }

  @Override
  public byte[] applyFilter(byte[] ayAudioBytes)
  {
    // e = d - X' * W
    m_ayEchoFreeSignal = new byte[ayAudioBytes.length];
    for (int i = 0; i < m_ayEchoFreeSignal.length; ++i)
    {
      m_ayEchoFreeSignal[i] = (byte) (m_ayEchoSignal[i] - m_ayTransposeOfSpeakerSignal[i] * m_adWeights[i]);
    }
    return m_ayEchoFreeSignal;
  }
}

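
The class above only shows the subtraction step e = d - X' * W; the weight update is not included. For reference, here is a rough sketch of a plain sample-by-sample NLMS loop working on double-valued samples. It is not the poster's code: the class name, the tap count and the EPSILON regulariser are assumptions, and it leaves out pre-whitening and double-talk handling.

```java
/**
 * Sketch of a basic NLMS echo canceller (no pre-whitening, no double-talk logic).
 * All names here are illustrative and not taken from the answer above.
 */
final class NlmsEchoCanceller {
    private static final double EPSILON = 1e-6; // assumed regulariser, avoids division by zero

    private final double[] weights;  // W: current estimate of the echo path
    private final double[] history;  // X: recent loudspeaker samples, newest first
    private final double stepSize;   // convergence rate mu, typically 0 < mu < 2

    NlmsEchoCanceller(int taps, double stepSize) {
        this.weights = new double[taps];
        this.history = new double[taps];
        this.stepSize = stepSize;
    }

    /**
     * @param speaker sample being sent to the loudspeaker
     * @param mic     microphone sample d (near-end speech plus echo)
     * @return e = d - X' * W, the echo-reduced sample
     */
    double process(double speaker, double mic) {
        // Shift the loudspeaker history by one and store the newest sample first.
        System.arraycopy(history, 0, history, 1, history.length - 1);
        history[0] = speaker;

        double estimate = 0.0;      // X' * W
        double energy = EPSILON;    // X' * X plus regulariser
        for (int k = 0; k < weights.length; k++) {
            estimate += weights[k] * history[k];
            energy += history[k] * history[k];
        }
        double error = mic - estimate;

        // Normalised update: W <- W + mu * e * X / (X' * X + eps)
        double gain = stepSize * error / energy;
        for (int k = 0; k < weights.length; k++) {
            weights[k] += gain * history[k];
        }
        return error;
    }
}
```

One instance would be kept per call and fed one loudspeaker/microphone sample pair at a time, after converting the PCM bytes to numeric samples. The number of taps has to cover the longest echo delay you expect, measured in samples.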
恋你朝朝暮暮 2024-09-20 10:45:42

In case anyone is interested, I managed to build a fair, working echo canceller by basically converting the Acoustic Echo Cancellation method mentioned by Paul R that uses a Normalised Least Means Square algorithm and a few filters from C into Java. The JNI route is probably still a better way to go, but I like sticking to pure Java if at all possible. By seeing how their filters work and reading up a great deal on filters on DSP Tutor, I managed to gain some control over how much noise gets removed and how to remove high frequencies, etc.

Some tips:

  1. Keep in mind what you remove from where. I had to switch this around a few times.
  2. The most important variable of this method is the convergence rate. This is the variable called Stepsize in the above link's code.
  3. I took the individual components one at a time, figured out what they did, built them and tested them separately. For example, I took the Double Talk Detector and tested it to ensure that it worked. Then I took the filters one by one and tested them on audio files to ensure that they worked, then I took the normalised least means square part and tested it before putting it all together.

Hope this helps someone else!
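
The post does not say which double-talk detector was used. As an illustration only, here is a sketch of the classic Geigel detector, which flags double talk when the microphone level exceeds a fraction of the recent loudspeaker peak. The class name, window length and threshold are assumptions, not values from the linked code.

```java
/**
 * Sketch of a Geigel-style double-talk detector.
 * Illustrative only; not the detector from the code referenced above.
 */
final class GeigelDoubleTalkDetector {
    private final double[] recentSpeakerLevels; // circular buffer of |x| values
    private final double threshold;             // commonly around 0.5 (assumed here)
    private int writeIndex;

    GeigelDoubleTalkDetector(int windowLength, double threshold) {
        this.recentSpeakerLevels = new double[windowLength];
        this.threshold = threshold;
    }

    /** Returns true when the near-end talker appears to be speaking over the echo. */
    boolean isDoubleTalk(double speaker, double mic) {
        // Record the newest loudspeaker magnitude, overwriting the oldest one.
        recentSpeakerLevels[writeIndex] = Math.abs(speaker);
        writeIndex = (writeIndex + 1) % recentSpeakerLevels.length;

        // Peak loudspeaker magnitude over the window.
        double peak = 0.0;
        for (double level : recentSpeakerLevels) {
            peak = Math.max(peak, level);
        }
        return Math.abs(mic) > threshold * peak;
    }
}
```

A common way to use a detector like this is to keep subtracting the echo estimate but skip the weight update while double talk is flagged, so near-end speech does not pull the filter away from the echo path.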

白馒头 2024-09-20 10:45:42

Use the Speex AEC. It's open source, it's written in C (use it with JNI), and it works.
I've successfully used it in 2 different VoIP applications and it gets most of the echo cancelled.

失眠症患者 2024-09-20 10:45:42

This is a very complex area, and to get a usable AEC solution working you'll need to do quite a bit of R&D. All the good AECs are proprietary, and there's a lot more to echo cancellation than just implementing an adaptive filter such as LMS. I suggest you develop your echo cancellation algorithm initially using MATLAB (or Octave). When you have something that appears to work reasonably well with "real world" telecoms, you can implement the algorithm in C and test/evaluate it in real time. Once this is working, you can use JNI to call the C implementation from Java.
