Let's say you have a 96 kbit/s MP3 and you transcode the file into a 320 kbit/s MP3. How could you programmatically detect the original bit rate or quality? Generation loss occurs because each time a lossy algorithm is applied, more information is deemed "unnecessary" and discarded. How could an algorithm use this property to detect that audio has been transcoded?
128 kbps LAME mp3 transcoded to 320 kbps LAME mp3 (I Feel You, Depeche Mode) 10.8 MB.
This image was taken from the bottom of this site. The 2 tracks above look nearly identical, but the difference is enough to support this argument.
One way to do it is to analyze the spectrum of the signal. I'm not sure it's possible to determine the exact original rate, but you can definitely tell a real 320 kbps MP3 from a 96 -> 320 kbps transcode. The 96 kbps MP3 will have its higher frequencies cut off at 15 kHz or so. A real 320 kbps file should have non-zero energy at around 18-20 kHz or even higher (that depends on the encoder).
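The spectrum check described above could be sketched roughly as follows. This is only an illustration under several assumptions: the MP3 has already been decoded to mono PCM samples by some other tool, the function names `fft` and `estimate_cutoff_hz` are made up for this example, and the -60 dB floor and 1024-sample frame are arbitrary choices rather than tuned values. The exact low-pass frequency also varies by encoder and settings, so the result is a hint, not proof.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def estimate_cutoff_hz(samples, sample_rate, frame_size=1024, floor_db=-60.0):
    """Return the highest frequency whose summed power is within
    floor_db of the strongest frequency bin (hypothetical helper)."""
    n_frames = len(samples) // frame_size
    if n_frames == 0:
        raise ValueError("need at least frame_size samples")
    # Hann window to limit spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size)
              for i in range(frame_size)]
    half = frame_size // 2
    power = [0.0] * (half + 1)
    # Accumulate the power spectrum over all complete frames.
    for f in range(n_frames):
        frame = samples[f * frame_size:(f + 1) * frame_size]
        spectrum = fft([s * w for s, w in zip(frame, window)])
        for k in range(half + 1):
            power[k] += abs(spectrum[k]) ** 2
    peak = max(power)
    if peak == 0.0:
        raise ValueError("signal is silent")
    threshold = peak * 10 ** (floor_db / 10)
    # Scan downward from the Nyquist bin to the first bin above the floor.
    for k in range(half, -1, -1):
        if power[k] > threshold:
            return k * sample_rate / frame_size
    return 0.0
```

If a file whose header claims 320 kbps comes back with an estimated cutoff near 15-16 kHz, that would be consistent with a low-bitrate source, though a quiet or naturally dull recording could produce the same reading.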
The bit rate is stored in the MPEG frame header, but that is the bit rate of the current encoding. Unless you stored the original bit rate somewhere, for example in an ID3 tag, there is no easy way to recover it.
EDIT: Updated the answer; it looks like I misunderstood the original question.
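To illustrate what the frame header does and does not tell you, here is a minimal sketch that decodes the bit rate and sample rate from a 4-byte MPEG-1 Layer III frame header. The function name `parse_mpeg1_layer3_header` is made up for this example, and the sketch assumes you have already located a frame header (it does not skip ID3 tags or handle MPEG-2/2.5 or other layers). Note that for a transcoded file this reports 320, not the original 96.

```python
# Bitrate table for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
# Index 0 means "free format" and 15 is invalid; both are rejected here.
MPEG1_L3_BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                          112, 128, 160, 192, 224, 256, 320, None]
# Sample rates for MPEG-1, indexed by the 2-bit field (3 is reserved).
MPEG1_SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def parse_mpeg1_layer3_header(header):
    """Return (bitrate_kbps, sample_rate_hz) from a 4-byte frame header."""
    if len(header) != 4:
        raise ValueError("header must be exactly 4 bytes")
    word = int.from_bytes(header, "big")
    if (word >> 21) & 0x7FF != 0x7FF:       # 11 sync bits, all ones
        raise ValueError("missing frame sync")
    version_bits = (word >> 19) & 0x3       # 0b11 = MPEG-1
    layer_bits = (word >> 17) & 0x3         # 0b01 = Layer III
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("not an MPEG-1 Layer III header")
    bitrate = MPEG1_L3_BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = MPEG1_SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate is None or sample_rate is None:
        raise ValueError("free-format, invalid, or reserved field value")
    return bitrate, sample_rate
```

For example, passing the bytes `FF FB 90 00` should yield 128 kbps at 44100 Hz. A VBR file will have different values from frame to frame, so a single header is not the whole story.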
If you're transcoding by converting the original MP3 to an uncompressed format (like WAV) and then re-encoding to MP3 at the higher bitrate, then it would be impossible to determine the original file's bitrate given only the converted file. I suppose this process might produce some incredibly subtle audio artifacts that could be analyzed statistically, but this would be a pretty herculean effort, in my opinion, and unlikely to succeed.
I'm not sure if it's even possible to up-rate an MP3 without decoding and re-encoding, but even if it is possible, the process still would not preserve the original bitrate in the new file. Again, this process may produce some kind of weird, measurable artifacts that might hint at the original bitrate, but I doubt it.
Update: now that I think about it, it might be possible somehow to detect this, although I have no idea how to do it programmatically. The human ear can make distinctions like this (some ears, anyway): I can clearly tell the difference between 128k MP3s and 192k MP3s, so discriminating between 96k and 320k would be a piece of cake. A 96k MP3 that had been upcoded would still have all the audio artifacts present in the 96k version (plus new ones, unfortunately).
I don't know how you would go about determining this with code, however. If I had to make this work, I'd train pigeons to do it (and I'm not kidding about that).
The difference that you see in the spectral display is probably mostly due to quantization error. If you max out the bit depth (resolution) on the lower bitrate audio file, and keep that bit depth when you upconvert (oversample) it, the spectral displays should match more closely. The encoder also probably used some dithering to avoid audio artifacts due to the quantization errors.
If the bit depth were already maxed out at the lower bitrate, then added points will be obvious and you'll see some jagged edges in the waveform. Otherwise, given sufficient bit depth, you won't be able to determine which points were original and which were added. This is especially true of higher end upconverters that will use curves to project the new points instead of simply plotting the new points evenly between the existing ones.
By definition, the sample rate determines the possible frequency range (nothing above half the sample rate can be represented), so looking at the frequency content is going to be your best bet in determining the original bitrate, as Igor suggested.