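Contents:

- Using the Java standard library (simplest; works for WAV/AU/AIFF)
- Parsing the WAV header by hand (more flexible, no dependency on a specific API)
- Decoding MP3 (with a third-party library)
- Calling the FFmpeg command line (most general; suited to production)
- Feature extraction example: computing RMS (root-mean-square volume)
- Summary and recommendations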
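Parsing audio in Java usually involves reading the binary data of an audio file, decoding the audio format (WAV, MP3, FLAC, and so on), extracting properties such as sample rate, bit depth, and channel count, or processing the raw PCM data.
The sections below cover several common approaches, from basic to more advanced, including a runnable WAV parsing example.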
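To make the relationship between sample rate, bit depth, and channel count concrete, here is a small sketch of the arithmetic for uncompressed PCM. The class and method names (`PcmMath`, `frameSize`, `byteRate`, `durationSeconds`) are made up for illustration and do not belong to any library.

```java
// Illustrative helper (hypothetical names): basic arithmetic for uncompressed PCM audio.
class PcmMath {
    /** Bytes per frame: one sample for every channel. */
    static int frameSize(int bitsPerSample, int channels) {
        return (bitsPerSample / 8) * channels;
    }

    /** Bytes needed for one second of audio. */
    static int byteRate(int sampleRate, int bitsPerSample, int channels) {
        return sampleRate * frameSize(bitsPerSample, channels);
    }

    /** Duration in seconds of a PCM payload of the given size. */
    static double durationSeconds(long dataBytes, int sampleRate, int bitsPerSample, int channels) {
        return (double) dataBytes / byteRate(sampleRate, bitsPerSample, channels);
    }

    public static void main(String[] args) {
        // CD-quality audio: 44.1 kHz, 16-bit, stereo
        System.out.println(frameSize(16, 2));                          // 4
        System.out.println(byteRate(44100, 16, 2));                    // 176400
        System.out.println(durationSeconds(1_764_000L, 44100, 16, 2)); // 10.0
    }
}
```

These are the same quantities that appear later as the frame size, byte rate, and duration of a WAV file.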
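Using the Java standard library (simplest; works for WAV/AU/AIFF)
The javax.sound.sampled package in the Java standard library provides basic audio file reading.
Example: read a WAV file and print its basic properties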
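import javax.sound.sampled.*;
import java.io.File;
import java.io.IOException;

public class WavParserBasic {
    public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
        File audioFile = new File("your_audio.wav");
        // try-with-resources closes the stream even if reading fails
        try (AudioInputStream audioStream = AudioSystem.getAudioInputStream(audioFile)) {
            AudioFormat format = audioStream.getFormat();
            System.out.println("=== WAV file info ===");
            System.out.println("Sample rate: " + format.getSampleRate() + " Hz");
            System.out.println("Sample size: " + format.getSampleSizeInBits() + " bit");
            System.out.println("Channels: " + format.getChannels());
            System.out.println("Frame size: " + format.getFrameSize() + " bytes");
            System.out.println("Frame rate: " + format.getFrameRate() + " frames/s");
            System.out.println("Encoding: " + format.getEncoding());
            System.out.println("Big-endian: " + format.isBigEndian());

            // Read the complete PCM data (only sensible for small files)
            long frames = audioStream.getFrameLength();
            if (frames == AudioSystem.NOT_SPECIFIED || frames * format.getFrameSize() > Integer.MAX_VALUE) {
                throw new IOException("Frame length unknown or too large to load into memory");
            }
            byte[] audioBytes = new byte[(int) (frames * format.getFrameSize())];
            // A single read() may return fewer bytes than requested, so loop until full or EOF
            int offset = 0;
            while (offset < audioBytes.length) {
                int n = audioStream.read(audioBytes, offset, audioBytes.length - offset);
                if (n <= 0) break;
                offset += n;
            }
            System.out.println("Total frames: " + frames);
            System.out.println("Duration: " + (frames / format.getFrameRate()) + " s");

            // Extract the left channel (16-bit little-endian stereo PCM only)
            if (format.getSampleSizeInBits() == 16 && format.getChannels() == 2 && !format.isBigEndian()) {
                short[] leftChannel = new short[(int) frames];
                for (int i = 0; i < leftChannel.length; i++) {
                    int byteIndex = i * 4; // 16 bit * 2 channels = 4 bytes per frame
                    leftChannel[i] = (short) ((audioBytes[byteIndex + 1] << 8) | (audioBytes[byteIndex] & 0xFF));
                }
                System.out.println("Left-channel samples: " + leftChannel.length);
            }
        }
    }
}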
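The left-channel loop above is a special case of de-interleaving. As a rough sketch, the same idea can be generalized to any channel count, assuming 16-bit little-endian PCM. `PcmDeinterleaver` and `deinterleave` are hypothetical names chosen for this example.

```java
// Illustrative helper (hypothetical names): split interleaved 16-bit little-endian PCM into channels.
class PcmDeinterleaver {
    /**
     * @param pcm      interleaved 16-bit little-endian PCM bytes
     * @param channels number of channels (1 = mono, 2 = stereo, ...)
     * @return result[c][f] is the sample of channel c in frame f
     */
    static short[][] deinterleave(byte[] pcm, int channels) {
        int frameSize = 2 * channels;        // 2 bytes per 16-bit sample
        int frames = pcm.length / frameSize; // a trailing partial frame is ignored
        short[][] out = new short[channels][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channels; c++) {
                int i = f * frameSize + c * 2;
                // High byte keeps its sign, low byte is masked to avoid sign extension
                out[c][f] = (short) ((pcm[i + 1] << 8) | (pcm[i] & 0xFF));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two stereo frames: (L=1, R=-1) and (L=256, R=2)
        byte[] pcm = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF, 0x00, 0x01, 0x02, 0x00};
        short[][] ch = deinterleave(pcm, 2);
        System.out.println(java.util.Arrays.toString(ch[0])); // [1, 256]
        System.out.println(java.util.Arrays.toString(ch[1])); // [-1, 2]
    }
}
```

Big-endian data (common in AIFF) would need the two bytes swapped, which is why checking `format.isBigEndian()` first matters.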
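Parsing the WAV header by hand (more flexible, no dependency on a specific API)
If you cannot rely on javax.sound (for example on Android, which does not ship the package, or in other restricted environments), you can parse the RIFF header of the WAV file yourself.
Example: parsing WAV from a raw byte stream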
import java.io.*;
import java.nio.charset.StandardCharsets;

public class WavParserManual {
    public static class WavInfo {
        public int audioFormat; // 1 = PCM
        public int sampleRate;
        public int bitsPerSample;
        public int channels;
        public long dataSize;
        public byte[] data;
    }

    public static WavInfo parseWav(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            WavInfo info = new WavInfo();
            // Verify the RIFF header
            if (!readTag(raf).equals("RIFF")) {
                throw new IOException("Not a valid WAV file");
            }
            raf.skipBytes(4); // skip the RIFF chunk size (file size minus 8 bytes)
            // Verify the WAVE identifier
            if (!readTag(raf).equals("WAVE")) {
                throw new IOException("Not in WAVE format");
            }
            // Walk the chunks until "data"; readFully throws EOFException if it never appears
            while (true) {
                String id = readTag(raf);
                long chunkSize = readLittleEndianInt(raf) & 0xFFFFFFFFL; // unsigned 32-bit size
                if (id.equals("fmt ")) {
                    if (chunkSize < 16) {
                        throw new IOException("fmt chunk too short");
                    }
                    // Parse the format chunk
                    info.audioFormat = readLittleEndianShort(raf);
                    info.channels = readLittleEndianShort(raf);
                    info.sampleRate = readLittleEndianInt(raf);
                    raf.skipBytes(4); // byte rate (sampleRate * channels * bitsPerSample / 8)
                    raf.skipBytes(2); // block align
                    info.bitsPerSample = readLittleEndianShort(raf);
                    // Skip any extension bytes, plus the pad byte of an odd-sized chunk
                    skip(raf, chunkSize - 16 + (chunkSize & 1));
                } else if (id.equals("data")) {
                    if (chunkSize > Integer.MAX_VALUE - 8) {
                        throw new IOException("data chunk too large to load into memory");
                    }
                    info.dataSize = chunkSize;
                    info.data = new byte[(int) chunkSize];
                    raf.readFully(info.data); // unlike read(), fills the array or throws
                    break;
                } else {
                    // Skip other chunks; chunks are padded to an even length
                    skip(raf, chunkSize + (chunkSize & 1));
                }
            }
            return info;
        }
    }

    // Reads a 4-character chunk identifier such as "RIFF" or "fmt "
    private static String readTag(RandomAccessFile raf) throws IOException {
        byte[] tag = new byte[4];
        raf.readFully(tag);
        return new String(tag, StandardCharsets.US_ASCII);
    }

    private static void skip(RandomAccessFile raf, long n) throws IOException {
        raf.seek(raf.getFilePointer() + n);
    }

    private static int readLittleEndianInt(RandomAccessFile raf) throws IOException {
        byte[] bytes = new byte[4];
        raf.readFully(bytes);
        return (bytes[3] & 0xFF) << 24 | (bytes[2] & 0xFF) << 16 |
               (bytes[1] & 0xFF) << 8 | (bytes[0] & 0xFF);
    }

    // Returns the unsigned 16-bit value as an int
    private static int readLittleEndianShort(RandomAccessFile raf) throws IOException {
        byte[] bytes = new byte[2];
        raf.readFully(bytes);
        return (bytes[1] & 0xFF) << 8 | (bytes[0] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        WavInfo info = parseWav(new File("test.wav"));
        System.out.println("Sample rate: " + info.sampleRate);
        System.out.println("Bit depth: " + info.bitsPerSample);
        System.out.println("Channels: " + info.channels);
        System.out.println("Data size: " + info.dataSize + " bytes");
    }
}
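The manual bit shifting above can also be delegated to `java.nio.ByteBuffer`, which supports little-endian reads directly. The following is a minimal sketch under the assumption that the 16-byte PCM "fmt " payload is already in memory. `FmtChunkReader` and `readFmt` are hypothetical names, not part of the parser above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch (hypothetical names): decode the 16-byte PCM "fmt " payload with ByteBuffer.
class FmtChunkReader {
    /** Returns {audioFormat, channels, sampleRate, byteRate, blockAlign, bitsPerSample}. */
    static int[] readFmt(byte[] fmtPayload) {
        ByteBuffer buf = ByteBuffer.wrap(fmtPayload).order(ByteOrder.LITTLE_ENDIAN);
        int audioFormat   = buf.getShort() & 0xFFFF; // 1 = PCM
        int channels      = buf.getShort() & 0xFFFF;
        int sampleRate    = buf.getInt();
        int byteRate      = buf.getInt();
        int blockAlign    = buf.getShort() & 0xFFFF;
        int bitsPerSample = buf.getShort() & 0xFFFF;
        return new int[] {audioFormat, channels, sampleRate, byteRate, blockAlign, bitsPerSample};
    }

    public static void main(String[] args) {
        // Build a payload for 44.1 kHz, 16-bit stereo PCM, then decode it again
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 1).putShort((short) 2).putInt(44100)
           .putInt(176400).putShort((short) 4).putShort((short) 16);
        int[] f = readFmt(buf.array());
        System.out.println(java.util.Arrays.toString(f)); // [1, 2, 44100, 176400, 4, 16]
    }
}
```

Masking each `getShort()` with `0xFFFF` keeps the 16-bit fields unsigned, which matters for format codes above 0x7FFF.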
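Decoding MP3 (with a third-party library)
The Java standard library has no built-in MP3 decoder, so the usual options are:
- JLayer (JavaZoom): a lightweight, pure-Java MP3 decoding library
- FFmpeg: called through JNI bindings
- JAudioTagger: reads metadata (tags) only; it does not decode audio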
Decoding MP3 with JLayer
<!-- Maven dependency -->
<dependency>
    <groupId>javazoom</groupId>
    <artifactId>jlayer</artifactId>
    <version>1.0.1</version>
</dependency>
import javazoom.jl.decoder.*;
import java.io.*;

public class MP3Decoder {
    public static void decode(String filePath) {
        // try-with-resources closes the file even on an early return or exception
        try (InputStream in = new BufferedInputStream(new FileInputStream(filePath))) {
            Bitstream bitstream = new Bitstream(in);
            Decoder decoder = new Decoder();
            Header header = bitstream.readFrame();
            if (header == null) return; // no MP3 frame found
            System.out.println("Sample rate: " + header.frequency() + " Hz");
            System.out.println("Bitrate: " + header.bitrate() + " bps");
            System.out.println("Channel mode: " + header.mode()); // int code, e.g. Header.STEREO

            // Decode frame by frame, starting with the frame whose header was just read
            int frameCount = 0;
            while (header != null && frameCount < 10) { // only 10 frames as a demo
                SampleBuffer output = (SampleBuffer) decoder.decodeFrame(header, bitstream);
                // output.getBuffer() holds the decoded 16-bit PCM samples;
                // only the first output.getBufferLength() entries are valid.
                // Process the audio data here.
                bitstream.closeFrame();
                frameCount++;
                header = bitstream.readFrame();
            }
            bitstream.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
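To show where values such as the sample rate and bitrate come from, here is a rough sketch that decodes the 32-bit header of a single MPEG-1 Layer III frame by hand. It is independent of JLayer and handles only that one version/layer combination. `Mp3HeaderSketch` and `parse` are hypothetical names, and a real decoder covers many more cases (MPEG-2/2.5, other layers, free-format bitrates).

```java
// Illustrative sketch (hypothetical names): decode the fields of an MPEG-1 Layer III frame header.
class Mp3HeaderSketch {
    // MPEG-1 Layer III bitrates in kbps; index 0 means "free format", index 15 is invalid
    private static final int[] BITRATES_KBPS =
        {0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, -1};
    // MPEG-1 sample rates in Hz; index 3 is reserved
    private static final int[] SAMPLE_RATES = {44100, 48000, 32000, -1};
    static final String[] MODES = {"stereo", "joint stereo", "dual channel", "mono"};

    /** Returns {bitrate in kbps, sample rate in Hz, channel mode index}. */
    static int[] parse(int header) {
        if ((header >>> 21) != 0x7FF) {
            throw new IllegalArgumentException("no frame sync");
        }
        int version = (header >>> 19) & 0x3; // 3 = MPEG-1
        int layer   = (header >>> 17) & 0x3; // 1 = Layer III
        if (version != 3 || layer != 1) {
            throw new IllegalArgumentException("only MPEG-1 Layer III is handled in this sketch");
        }
        int bitrate    = BITRATES_KBPS[(header >>> 12) & 0xF];
        int sampleRate = SAMPLE_RATES[(header >>> 10) & 0x3];
        int mode       = (header >>> 6) & 0x3;
        return new int[] {bitrate, sampleRate, mode};
    }

    public static void main(String[] args) {
        int[] h = parse(0xFFFB9064); // a typical 128 kbps, 44.1 kHz header
        System.out.println(h[0] + " kbps, " + h[1] + " Hz, " + MODES[h[2]]);
        // prints: 128 kbps, 44100 Hz, joint stereo
    }
}
```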
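Calling the FFmpeg command line (most general; suited to production)
For more complex formats (FLAC, OGG, AAC, and so on), the most reliable approach is to transcode with FFmpeg. The code below launches the ffmpeg and ffprobe executables by name, so both must be installed and on the PATH.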
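import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class FFmpegAudioParser {
    public static void parseWithFFmpeg(String inputFile, String outputWav)
            throws IOException, InterruptedException {
        // 1. Query the audio information as JSON
        ProcessBuilder pbInfo = new ProcessBuilder(
            "ffprobe", "-v", "quiet", "-print_format", "json",
            "-show_format", "-show_streams", inputFile
        );
        Process infoProcess = pbInfo.start();
        StringBuilder json = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(infoProcess.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line).append('\n');
            }
        }
        if (infoProcess.waitFor() != 0) {
            throw new IOException("ffprobe failed for " + inputFile);
        }
        System.out.println("Audio info JSON: " + json);

        // 2. Convert to 16-bit PCM WAV
        ProcessBuilder pbConvert = new ProcessBuilder(
            "ffmpeg", "-y",          // overwrite the output instead of prompting
            "-i", inputFile,
            "-acodec", "pcm_s16le",  // 16-bit little-endian
            "-ar", "44100",          // sample rate
            "-ac", "2",              // stereo
            outputWav
        );
        // ffmpeg logs to stderr; forward its output so a full pipe buffer cannot stall the process
        pbConvert.redirectErrorStream(true).redirectOutput(ProcessBuilder.Redirect.INHERIT);
        Process convertProcess = pbConvert.start();
        if (convertProcess.waitFor() != 0) {
            throw new IOException("ffmpeg conversion failed for " + inputFile);
        }
        System.out.println("Conversion finished: " + outputWav);
        // 3. The output WAV can now be parsed with javax.sound
    }
}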
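When the target sample rate and channel count vary, it can help to build the argument list in one place. Below is a minimal sketch that uses only the ffmpeg flags already shown above, plus `-y`, which overwrites an existing output file without prompting. `FfmpegCommandSketch` and `toPcmWav` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (hypothetical names): build the ffmpeg argument list for a PCM WAV conversion.
class FfmpegCommandSketch {
    static List<String> toPcmWav(String input, String output, int sampleRate, int channels) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ffmpeg");
        cmd.add("-y");                 // overwrite the output file without prompting
        cmd.add("-i");
        cmd.add(input);
        cmd.add("-acodec");
        cmd.add("pcm_s16le");          // 16-bit little-endian PCM
        cmd.add("-ar");
        cmd.add(Integer.toString(sampleRate));
        cmd.add("-ac");
        cmd.add(Integer.toString(channels));
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", toPcmWav("in.flac", "out.wav", 16000, 1)));
        // prints: ffmpeg -y -i in.flac -acodec pcm_s16le -ar 16000 -ac 1 out.wav
    }
}
```

The resulting list can be handed to `new ProcessBuilder(cmd)`. Passing each argument as a separate element, rather than as one shell string, means file names with spaces need no quoting.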
Feature extraction example: computing RMS (root-mean-square volume)
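public class AudioFeatureExtractor {
    /**
     * Computes the RMS (volume) of a block of audio.
     * @param pcmData 16-bit little-endian PCM byte array
     * @return normalized RMS value (0 to 1)
     */
    public static double calculateRMS(byte[] pcmData) {
        int sampleCount = pcmData.length / 2; // 16 bit = 2 bytes
        if (sampleCount == 0) {
            return 0.0; // avoid 0/0 = NaN on empty input
        }
        double sum = 0;
        // Iterate over whole samples only, ignoring a trailing odd byte
        for (int i = 0; i < sampleCount * 2; i += 2) {
            // Little-endian 16-bit conversion
            short sample = (short) ((pcmData[i + 1] << 8) | (pcmData[i] & 0xFF));
            sum += (double) sample * sample;
        }
        double rms = Math.sqrt(sum / sampleCount);
        // Normalize to 0..1 (the full-scale magnitude of 16-bit audio is 32768)
        return rms / 32768.0;
    }

    public static void main(String[] args) {
        // Placeholder: replace with your own PCM data
        byte[] audioBlock = new byte[]{/* your PCM data */};
        double volume = calculateRMS(audioBlock);
        System.out.println("Volume level: " + (volume * 100) + "%");
    }
}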
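Volume is often reported in decibels relative to full scale (dBFS) rather than as a percentage. As a small sketch, a normalized RMS value in the range 0 to 1 can be converted with 20 * log10(rms). `DecibelSketch` and `toDbfs` are hypothetical names.

```java
import java.util.Locale;

// Illustrative sketch (hypothetical names): convert a normalized RMS value (0..1) to dBFS.
class DecibelSketch {
    static double toDbfs(double normalizedRms) {
        if (normalizedRms <= 0.0) {
            return Double.NEGATIVE_INFINITY; // digital silence has no finite dB value
        }
        return 20.0 * Math.log10(normalizedRms);
    }

    public static void main(String[] args) {
        System.out.println(toDbfs(1.0));                                 // 0.0 (full scale)
        System.out.printf(Locale.ROOT, "%.2f dBFS%n", toDbfs(0.5));      // -6.02 dBFS
        System.out.println(toDbfs(0.0));                                 // -Infinity
    }
}
```

Halving the amplitude lowers the level by roughly 6 dB, which is why 0.5 maps to about -6.02 dBFS.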
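Summary and recommendations
| Scenario | Recommended approach |
|---|---|
| Simple WAV/AU parsing | javax.sound.sampled |
| Dependency-free WAV parsing | Parse the WAV header by hand (as in the example above) |
| Decoding MP3/AAC/FLAC | JLayer for MP3; FFmpeg for AAC, FLAC, and other formats |
| Batch processing or transcoding | Integrate FFmpeg (via the command line or JavaCPP) |
| Real-time audio processing | TargetDataLine from javax.sound.sampled |
| Production use with demanding analysis needs | TarsosDSP (a dedicated audio processing library) |

A final reminder: when handling real audio files, watch memory use. Do not read a large file into memory in one go; process it as a stream instead, reading and decoding block by block.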
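The block-by-block approach can be sketched as follows: the RMS of a whole stream is accumulated while only one small buffer is held in memory. This assumes 16-bit little-endian PCM. `StreamingRmsSketch` and `rms` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch (hypothetical names): RMS of a 16-bit little-endian PCM stream, read in blocks.
class StreamingRmsSketch {
    /** Returns the normalized RMS (0..1); blockBytes must be greater than zero. */
    static double rms(InputStream pcm, int blockBytes) throws IOException {
        byte[] block = new byte[blockBytes];
        double sumSquares = 0;
        long samples = 0;
        int pendingLow = -1; // low byte of a sample split across two reads, or -1 if none
        int n;
        while ((n = pcm.read(block)) > 0) {
            for (int i = 0; i < n; i++) {
                if (pendingLow < 0) {
                    pendingLow = block[i] & 0xFF;
                } else {
                    short s = (short) ((block[i] << 8) | pendingLow);
                    sumSquares += (double) s * s;
                    samples++;
                    pendingLow = -1;
                }
            }
        }
        return samples == 0 ? 0.0 : Math.sqrt(sumSquares / samples) / 32768.0;
    }

    public static void main(String[] args) throws IOException {
        // Two samples, +16384 and -16384, read with a deliberately awkward block size of 3
        byte[] pcm = {0x00, 0x40, 0x00, (byte) 0xC0};
        System.out.println(rms(new ByteArrayInputStream(pcm), 3)); // 0.5
    }
}
```

An `AudioInputStream` can be passed in the same way. In that case the block size should be a multiple of the frame size, because `AudioInputStream` only returns whole frames per read.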