Java - Reading, manipulating and writing WAV files

Published 2024-09-10 10:20:39

In a Java program, what is the best way to read an audio file (WAV file) to an array of numbers (float[], short[], ...), and to write a WAV file from an array of numbers?

Comments (9)

情愿 2024-09-17 10:20:39

I read WAV files via an AudioInputStream. The following snippet from the Java Sound Tutorials works well.

int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
  AudioInputStream audioInputStream = 
    AudioSystem.getAudioInputStream(fileIn);
  int bytesPerFrame = 
    audioInputStream.getFormat().getFrameSize();
  if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
    // some audio formats may have unspecified frame size
    // in that case we may read any amount of bytes
    bytesPerFrame = 1;
  } 
  // Set an arbitrary buffer size of 1024 frames.
  int numBytes = 1024 * bytesPerFrame; 
  byte[] audioBytes = new byte[numBytes];
  try {
    int numBytesRead = 0;
    int numFramesRead = 0;
    // Try to read numBytes bytes from the file.
    while ((numBytesRead = 
      audioInputStream.read(audioBytes)) != -1) {
      // Calculate the number of frames actually read.
      numFramesRead = numBytesRead / bytesPerFrame;
      totalFramesRead += numFramesRead;
      // Here, do something useful with the audio data that's 
      // now in the audioBytes array...
    }
  } catch (Exception ex) { 
    // Handle the error...
  }
} catch (Exception e) {
  // Handle the error...
}
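The "do something useful" step typically means converting the raw bytes into numeric samples, which is what the question asks for. A minimal sketch, assuming 16-bit signed little-endian PCM (the class and method names here are made up; check the stream's AudioFormat for sample size, signedness, and endianness before relying on this layout):

```java
public class PcmDecode {
    // Interpret validBytes bytes of the buffer as 16-bit signed
    // little-endian PCM samples, the most common WAV layout.
    public static short[] toShorts(byte[] audioBytes, int validBytes) {
        short[] samples = new short[validBytes / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = audioBytes[2 * i] & 0xFF;   // low byte is unsigned
            int hi = audioBytes[2 * i + 1];      // high byte carries the sign
            samples[i] = (short) ((hi << 8) | lo);
        }
        return samples;
    }
}
```

Calling this with `audioBytes` and `numBytesRead` inside the read loop yields one `short[]` per buffer.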

Writing a WAV I found quite tricky. On the surface it seems like a circular problem: the command that writes relies on an AudioInputStream as a parameter.

But how do you write bytes to an AudioInputStream? Shouldn't there be an AudioOutputStream?

What I found was that one can define an object that has access to the raw audio byte data and implements TargetDataLine.

This requires that a lot of methods be implemented, but most can stay in dummy form, as they are not needed for writing data to a file. The key method to implement is read(byte[] buffer, int bufferoffset, int numberofbytestoread).

As this method will probably be called multiple times, there should also be an instance variable that indicates how far through the data one has progressed; it should be updated as part of the above read method.

When you have implemented this method, your object can be used to create a new AudioInputStream, which in turn can be used with:

AudioSystem.write(yourAudioInputStream, AudioFileFormat.Type.WAVE, yourFileDestination)

As a reminder, an AudioInputStream can be created with a TargetDataLine as a source.

As for directly manipulating the data, I have had good success acting on the data in the buffer (audioBytes) in the innermost loop of the snippet above.

While you are in that inner loop, you can convert the bytes to integers or floats, multiply by a volume value (ranging from 0.0 to 1.0), and then convert them back to little-endian bytes.

I believe since you have access to a series of samples in that buffer you can also engage various forms of DSP filtering algorithms at that stage. In my experience I have found that it is better to do volume changes directly on data in this buffer because then you can make the smallest possible increment: one delta per sample, minimizing the chance of clicks due to volume-induced discontinuities.

I find the volume "control lines" provided by Java tend to lead to situations where jumps in volume cause clicks, and I believe this is because the deltas are only applied at the granularity of a single buffer read (often on the order of one change per 1024 samples) rather than dividing the change into smaller pieces and adding one per sample. But I'm not privy to how the volume controls were implemented, so please take that conjecture with a grain of salt.
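The per-sample approach described above can be sketched as a linear gain ramp over one buffer, assuming the 16-bit little-endian layout used elsewhere in this answer (class name hypothetical):

```java
public class VolumeRamp {
    // Ramp gain linearly from startVol to endVol across the buffer,
    // one small delta per 16-bit little-endian sample, to avoid clicks.
    public static void apply(byte[] audioBytes, int validBytes,
                             double startVol, double endVol) {
        int n = validBytes / 2;
        for (int i = 0; i < n; i++) {
            int lo = audioBytes[2 * i] & 0xFF;
            int hi = audioBytes[2 * i + 1];      // sign-extended high byte
            int sample = (hi << 8) | lo;
            double vol = startVol + (endVol - startVol) * i / Math.max(1, n - 1);
            sample = (int) Math.round(sample * vol);
            audioBytes[2 * i] = (byte) sample;            // low byte first
            audioBytes[2 * i + 1] = (byte) (sample >> 8); // then high byte
        }
    }
}
```

Splitting a volume change across every sample of the buffer like this keeps each step tiny, which is the point made above about minimizing discontinuities.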

All in all, Java.Sound has been a real headache to figure out. I fault the Tutorial for not including an explicit example of writing a file directly from bytes. I fault the Tutorial for burying the best example of playing a file in the "How to Convert..." section. However, there's a LOT of valuable FREE info in that tutorial.


EDIT: 12/13/17

I've since used the following code to write audio from a PCM file in my own projects. Instead of implementing TargetDataLine one can extend InputStream and use that as a parameter to the AudioSystem.write method.

public class StereoPcmInputStream extends InputStream
{
    private float[] dataFrames;
    private int framesCounter;
    private int cursor;
    private int[] pcmOut = new int[2];
    private int[] frameBytes = new int[4];
    private int idx;
    
    private int framesToRead;

    public void setDataFrames(float[] dataFrames)
    {
        this.dataFrames = dataFrames;
        framesToRead = dataFrames.length / 2;
    }
    
    @Override
    public int read() throws IOException
    {
        while(available() > 0)
        {
            idx &= 3; 
            if (idx == 0) // set up next frame's worth of data
            {
                framesCounter++; // count elapsing frames

                // scale to 16 bits
                pcmOut[0] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
                pcmOut[1] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
            
                // output as unsigned bytes, in range [0..255]
                frameBytes[0] = (char)pcmOut[0];
                frameBytes[1] = (char)(pcmOut[0] >> 8);
                frameBytes[2] = (char)pcmOut[1];
                frameBytes[3] = (char)(pcmOut[1] >> 8);
            
            }
            return frameBytes[idx++]; 
        }
        return -1;
    }

    @Override 
    public int available()
    {
        // NOTE: not concurrency safe.
        // Total bytes in the stream minus the bytes already delivered:
        // 4 bytes per frame, with idx bytes of the current frame consumed.
        int bytesDelivered = (framesCounter == 0) ? 0 : 4 * (framesCounter - 1) + idx;
        return 4 * framesToRead - bytesDelivered;
    }    

    @Override
    public void reset()
    {
        cursor = 0;
        framesCounter = 0;
        idx = 0;
    }

    @Override
    public void close()
    {
        System.out.println(
            "StereoPcmInputStream stopped after reading frames:" 
                + framesCounter);
    }
}

The source data to be exported here is in the form of stereo floats ranging from -1 to 1. The format of the resulting stream is 16-bit, stereo, little-endian.
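A hypothetical end-to-end driver for this approach might look as follows. The stream class is condensed here so the sketch is self-contained, and the 440 Hz test tone and demo.wav destination are invented for the demo:

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;
import java.io.InputStream;

public class StereoPcmDemo {
    // Condensed variant of the stream above: interleaved stereo floats in,
    // 16-bit little-endian bytes out.
    static class StereoPcmStream extends InputStream {
        private final float[] frames;               // interleaved L/R in [-1, 1]
        private int cursor;                          // next float to consume
        private int idx = 4;                         // byte index within frame
        private final int[] frameBytes = new int[4];

        StereoPcmStream(float[] frames) { this.frames = frames; }

        @Override
        public int read() {
            if (idx == 4) {                          // current frame exhausted
                if (cursor >= frames.length - 1) return -1;
                int l = (int) (frames[cursor++] * Short.MAX_VALUE);
                int r = (int) (frames[cursor++] * Short.MAX_VALUE);
                frameBytes[0] = l & 0xFF;  frameBytes[1] = (l >> 8) & 0xFF;
                frameBytes[2] = r & 0xFF;  frameBytes[3] = (r >> 8) & 0xFF;
                idx = 0;
            }
            return frameBytes[idx++];
        }
    }

    public static void main(String[] args) throws Exception {
        float sampleRate = 44100f;
        float[] frames = new float[2 * 4410];        // 0.1 s of stereo
        for (int i = 0; i < frames.length; i += 2) {
            float v = (float) (0.5 * Math.sin(2 * Math.PI * 440 * (i / 2) / sampleRate));
            frames[i] = v;                           // left
            frames[i + 1] = v;                       // right
        }
        AudioFormat fmt = new AudioFormat(sampleRate, 16, 2, true, false);
        AudioInputStream ais = new AudioInputStream(
                new StereoPcmStream(frames), fmt, frames.length / 2);
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File("demo.wav"));
    }
}
```

The AudioFormat arguments (44100 Hz, 16 bits, 2 channels, signed, little-endian) must match what the stream actually emits; AudioSystem.write takes the rest from there.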

I omitted skip and markSupported methods for my particular application. But it shouldn't be difficult to add them if they are needed.

許願樹丅啲祈禱 2024-09-17 10:20:39

This is the source code to write directly to a wav file.
You just need to know the mathematics and sound engineering to produce the sound you want.
In this example the equation calculates a binaural beat.

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

public class Program {
    public static void main(String[] args) throws IOException {
        final double sampleRate = 44100.0;
        final double frequency = 440;
        final double frequency2 = 90;
        final double amplitude = 1.0;
        final double seconds = 2.0;
        final double twoPiF = 2 * Math.PI * frequency;
        final double piF = Math.PI * frequency2;

        float[] buffer = new float[(int)(seconds * sampleRate)];

        for (int sample = 0; sample < buffer.length; sample++) {
            double time = sample / sampleRate;
            buffer[sample] = (float)(amplitude * Math.cos(piF * time) * Math.sin(twoPiF * time));
        }

        final byte[] byteBuffer = new byte[buffer.length * 2];

        int bufferIndex = 0;
        for (int i = 0; i < byteBuffer.length; i++) {
            final int x = (int)(buffer[bufferIndex++] * 32767.0);

            byteBuffer[i++] = (byte)x;
            byteBuffer[i] = (byte)(x >>> 8);
        }

        File out = new File("out10.wav");

        final boolean bigEndian = false;
        final boolean signed = true;

        final int bits = 16;
        final int channels = 1;

        AudioFormat format = new AudioFormat((float)sampleRate, bits, channels, signed, bigEndian);
        ByteArrayInputStream bais = new ByteArrayInputStream(byteBuffer);
        AudioInputStream audioInputStream = new AudioInputStream(bais, format, buffer.length);
        AudioSystem.write(audioInputStream, AudioFileFormat.Type.WAVE, out);
        audioInputStream.close();
    }
}

一抹苦笑 2024-09-17 10:20:39

Some more detail on what you'd like to achieve would be helpful. If raw WAV data is okay for you, simply use a FileInputStream and probably a Scanner to turn it into numbers. But let me try to give you some meaningful sample code to get you started:

There is a class called com.sun.media.sound.WaveFileWriter for this purpose.

InputStream in = ...;
OutputStream out = ...;

AudioInputStream audioIn = AudioSystem.getAudioInputStream(in);

WaveFileWriter writer = new WaveFileWriter();
writer.write(audioIn, AudioFileFormat.Type.WAVE, out);

You could implement your own AudioInputStream that does whatever voodoo to turn your number arrays into audio data.

writer.write(new VoodooAudioInputStream(numbers), AudioFileFormat.Type.WAVE, out);

As @stacker mentioned, you should get yourself familiar with the API of course.
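Sticking to the public API avoids the com.sun internal class entirely: AudioSystem.write accepts any AudioInputStream, so the number-array "voodoo" can be as simple as packing the array into a ByteArrayInputStream (the class and method names in this sketch are made up):

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

public class WriteShortsAsWav {
    // Write mono 16-bit samples as a little-endian WAV file.
    public static void write(short[] samples, float sampleRate, File dest)
            throws IOException {
        byte[] bytes = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            bytes[2 * i] = (byte) samples[i];             // low byte first
            bytes[2 * i + 1] = (byte) (samples[i] >> 8);  // then high byte
        }
        AudioFormat fmt = new AudioFormat(sampleRate, 16, 1, true, false);
        AudioInputStream ais = new AudioInputStream(
                new ByteArrayInputStream(bytes), fmt, samples.length);
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, dest);
    }
}
```

The same pattern extends to stereo by interleaving the channels and passing 2 as the channel count.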

阳光的暖冬 2024-09-17 10:20:39

The javax.sound.sampled package is not suitable for processing WAV files if you need to have access to the actual sample values. The package lets you change volume, sample rate, etc., but if you want other effects (say, adding an echo), you are on your own. (The Java tutorial hints that it should be possible to process the sample values directly, but the tech writer overpromised.)

This site has a simple class for processing WAV files: http://www.labbookpages.co.uk/audio/javaWavFiles.html

我乃一代侩神 2024-09-17 10:20:39

First of all, you may need to know the headers and data positions of a WAVE structure, you can find the spec here.
Be aware that the data are little endian.

There's an API which may help you to achieve your goal.
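For files laid out with the canonical 44-byte header, a hand-rolled parser is short. This sketch (names invented) pulls out the fields most programs need; note it assumes no extra chunks sit between "fmt " and "data", which real files sometimes have:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class WavHeader {
    public int channels, sampleRate, bitsPerSample, dataSize;

    public static WavHeader parse(InputStream in) throws IOException {
        byte[] header = new byte[44];
        new DataInputStream(in).readFully(header);
        // All multi-byte fields in a WAV header are little-endian.
        ByteBuffer b = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        WavHeader h = new WavHeader();
        b.position(22);                  // skip RIFF id/size, WAVE, "fmt " preamble
        h.channels = b.getShort();       // offset 22
        h.sampleRate = b.getInt();       // offset 24
        b.position(34);
        h.bitsPerSample = b.getShort();  // offset 34
        b.position(40);
        h.dataSize = b.getInt();         // offset 40: size of the "data" chunk
        return h;
    }
}
```

After parse returns, the stream is positioned at the first sample byte, ready for little-endian decoding.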

简单 2024-09-17 10:20:39

Wave files are supported by the javax.sound.sampled package.

Since it isn't a trivial API, you should read an article / tutorial which introduces the API, like

Java Sound, An Introduction

儭儭莪哋寶赑 2024-09-17 10:20:39

I use FileInputStream with some magic:

    byte[] byteInput = new byte[(int) file.length() - 44];
    short[] input = new short[byteInput.length / 2];

    try {
        FileInputStream fis = new FileInputStream(file);
        fis.skip(44); // skip the canonical 44-byte WAV header
        fis.read(byteInput);
        fis.close();
        ByteBuffer.wrap(byteInput).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(input);
    } catch (Exception e) {
        e.printStackTrace();
    }

Your sample values are in short[] input!

好久不见√ 2024-09-17 10:20:39

If anyone still needs it, there is an audio framework I'm working on that aims to solve this and similar issues. It's in Kotlin. You can find it on GitHub: https://github.com/WaveBeans/wavebeans

It would look like this:

wave("file:///path/to/file.wav")
    .map { it.asInt() } // it is of type Sample here; convert to the desired type
    .asSequence(44100.0f) // framework processes everything as sequence/stream
    .toList() // read fully
    .toTypedArray() // convert to array

And it's not dependent on Java Audio.
