Create a WAV file containing hidden binary data and read it back (Java)

Posted 2024-11-16 13:42:56, 134 words, 4 views, 0 comments

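What I would like to do is convert a text string into a WAV file at high frequencies (18,500 Hz and above): this would be the encoder. I also want to create an engine that decodes this text string from a WAV-format recording. The engine needs to support error control, because I obviously won't be reading back the same file, but a recording of this sound.

Thanks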

笑忘罢 2024-11-23 13:42:56

An important consideration is whether you want to hide the string in an existing audio file (so it sounds like a normal file but carries an encoded message; that is called steganography), or whether you will just create a file that sounds like gibberish, purely to encode data. I'm assuming the latter, since you didn't ask to hide a message in an existing file.

So I assume you are not looking for low-level details on writing WAV files (I am sure you can find documentation on how to read and write individual samples). Obviously, the simplest approach would be to take each byte of the source string and store it as a sample in the WAV file. That assumes an 8-bit mono recording; a 16-bit recording can store two bytes per sample, and a 16-bit stereo recording four bytes per sample frame. Then you can just read the WAV file back in and read the samples back as bytes. That's the simple approach, but as you say, you want to be able to make a (presumably analog) recording of the sound, read it back into a WAV file, and still be able to read the data.
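To make the one-byte-per-sample idea concrete, here is a minimal sketch using the standard `javax.sound.sampled` API. The class name `NaiveWavCodec` and the 8000 Hz sample rate are my own choices for illustration, and the WAV file is kept in memory rather than written to disk. It only survives a bit-exact copy of the file, which is exactly the limitation described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

final class NaiveWavCodec {
    // 8-bit unsigned mono PCM, so one frame is exactly one byte.
    // The 8000 Hz rate is an arbitrary assumption; it only affects how the file sounds.
    private static final AudioFormat FORMAT = new AudioFormat(8000f, 8, 1, false, false);

    /** Stores each UTF-8 byte of the text as one sample of an in-memory WAV file. */
    static byte[] encode(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        AudioInputStream pcm =
                new AudioInputStream(new ByteArrayInputStream(data), FORMAT, data.length);
        ByteArrayOutputStream wav = new ByteArrayOutputStream();
        try {
            AudioSystem.write(pcm, AudioFileFormat.Type.WAVE, wav);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return wav.toByteArray();
    }

    /** Reads the samples of a WAV file back and interprets them as UTF-8 bytes. */
    static String decode(byte[] wav) {
        try (AudioInputStream pcm =
                AudioSystem.getAudioInputStream(new ByteArrayInputStream(wav))) {
            ByteArrayOutputStream data = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = pcm.read(buffer)) != -1; ) {
                data.write(buffer, 0, n);
            }
            return new String(data.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException | UnsupportedAudioFileException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `NaiveWavCodec.decode(NaiveWavCodec.encode(text))` should give back the original text, but only because nothing touched the samples in between.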

With the approach above, if the analog recording is not exactly perfect (and how could it be), you would lose bytes of the message. This means you need to store the message in such a way that missing bytes, or bytes that have a slight error, are not going to be a problem. How you do this will depend highly upon exactly what sort of "damage" will be happening to the sound file. I would expect two major forms of damage:

"Vertical" damage: A sample (byte) would have a slightly higher or lower value than it originally had.
"Horizontal" damage: Samples may be averaged, stretched or squashed horizontally. From a byte perspective, this means some samples may be repeated, while others may be missing.

To combat this, you need some redundancy in the message. More redundancy means the message will take up more space (be longer), but will be more reliable.
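To see why redundancy is needed, the two kinds of damage can be simulated on a message stored one byte per sample. This is a deliberately crude sketch: `DamageDemo` and its two methods are hypothetical names, and real recording damage is noisier and less regular than a fixed offset or a periodic drop.

```java
import java.io.ByteArrayOutputStream;

final class DamageDemo {
    /** "Vertical" damage: every sample ends up slightly off its original value. */
    static byte[] shiftValues(byte[] samples, int offset) {
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = (byte) (samples[i] + offset);
        }
        return out;
    }

    /** "Horizontal" damage: every nth sample goes missing, so the signal gets shorter. */
    static byte[] dropEveryNth(byte[] samples, int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < samples.length; i++) {
            if ((i + 1) % n != 0) {
                out.write(samples[i]);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = "hello world".getBytes();
        System.out.println(new String(shiftValues(message, 1)));  // prints "ifmmp!xpsme"
        System.out.println(new String(dropEveryNth(message, 4))); // prints "helo wrld"
    }
}
```

An error of just one unit per sample already turns every character into a different one, and a dropped sample shifts everything after it.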

I would recommend thinking about how old (pre-mobile) touch-tone telephone dialing (DTMF) worked: each key generated a distinctive tone (strictly, a pair of tones) and sent it across the wire. The tones are long enough, and far enough apart pitch-wise, that they can be distinguished even given the above forms of damage. So, choose two parameters: a) length and b) frequency-delta. For each byte of data, select a frequency, spacing the 256 possible byte values frequency-delta hertz apart, starting from some base frequency (18,500 Hz in your case). Then, generate a sine wave of that frequency lasting length milliseconds. This encodes a lot more redundancy than the one-byte-per-sample approach above, since each byte takes up many samples, and if you lose some samples, it doesn't matter.
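A minimal sketch of such an encoder might look like the following. All the constants are assumptions on my part: the 18,500 Hz base comes from the question, while the 44.1 kHz sample rate, 10 Hz frequency-delta, 100 ms length, and half-scale amplitude are just starting values to experiment with. `ToneEncoder` is a hypothetical name, and the output is raw 16-bit PCM samples that you would still need to wrap in a WAV file.

```java
final class ToneEncoder {
    static final int SAMPLE_RATE = 44_100;          // assumed; Nyquist limit is 22,050 Hz
    static final double BASE_HZ = 18_500.0;         // lowest tone, from the question
    static final double DELTA_HZ = 10.0;            // assumed frequency-delta
    static final int TONE_MILLIS = 100;             // assumed length of one tone
    static final double AMPLITUDE = 0.5 * Short.MAX_VALUE;

    /** Maps a byte value (treated as 0..255) to its tone frequency. */
    static double frequencyFor(int byteValue) {
        return BASE_HZ + (byteValue & 0xFF) * DELTA_HZ;
    }

    /** Produces one sine burst per byte, as 16-bit signed mono PCM samples. */
    static short[] encode(byte[] data) {
        int samplesPerTone = SAMPLE_RATE * TONE_MILLIS / 1000;
        short[] pcm = new short[data.length * samplesPerTone];
        for (int i = 0; i < data.length; i++) {
            double step = 2 * Math.PI * frequencyFor(data[i]) / SAMPLE_RATE;
            for (int n = 0; n < samplesPerTone; n++) {
                pcm[i * samplesPerTone + n] = (short) Math.round(AMPLITUDE * Math.sin(step * n));
            }
        }
        return pcm;
    }
}
```

With these values the highest tone is 18,500 + 255 × 10 = 21,050 Hz, which still fits under the 22,050 Hz limit of 44.1 kHz audio. Each tone restarts at phase zero, so there is a small click at every byte boundary; smoothing that out is left aside here.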

When you read them back in, split the audio into windows of length milliseconds and estimate the frequency of the sine wave in each window. Map this onto the byte value with the nearest frequency.
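One way to do the frequency estimate is to measure the signal power at each of the 256 candidate frequencies and pick the strongest, which in practice amounts to choosing the nearest frequency. The sketch below uses the Goertzel algorithm for that measurement; this is my choice of technique, not something prescribed above. The constants are the same assumed values as before (44.1 kHz, 18,500 Hz base, 10 Hz delta, 100 ms tones), `ToneDecoder` is a hypothetical name, and the code assumes the recording is already aligned to the start of the first tone, which a real recording will not be.

```java
final class ToneDecoder {
    static final int SAMPLE_RATE = 44_100;   // assumed, must match the encoder
    static final double BASE_HZ = 18_500.0;  // lowest tone, from the question
    static final double DELTA_HZ = 10.0;     // assumed frequency-delta
    static final int TONE_MILLIS = 100;      // assumed length of one tone
    static final int SAMPLES_PER_TONE = SAMPLE_RATE * TONE_MILLIS / 1000;

    /** Goertzel filter: signal power at one frequency over one window. */
    static double power(short[] pcm, int start, int length, double hz) {
        double coeff = 2 * Math.cos(2 * Math.PI * hz / SAMPLE_RATE);
        double s1 = 0;
        double s2 = 0;
        for (int n = start; n < start + length; n++) {
            double s0 = pcm[n] + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    /** Decodes one byte per window by picking the candidate tone with the most power. */
    static byte[] decode(short[] pcm) {
        byte[] data = new byte[pcm.length / SAMPLES_PER_TONE];
        for (int i = 0; i < data.length; i++) {
            int best = 0;
            double bestPower = Double.NEGATIVE_INFINITY;
            for (int value = 0; value < 256; value++) {
                double p = power(pcm, i * SAMPLES_PER_TONE, SAMPLES_PER_TONE,
                        BASE_HZ + value * DELTA_HZ);
                if (p > bestPower) {
                    bestPower = p;
                    best = value;
                }
            }
            data[i] = (byte) best;
        }
        return data;
    }

    /** Helper for trying the decoder out: one clean sine burst per byte. */
    static short[] synth(byte[] data) {
        short[] pcm = new short[data.length * SAMPLES_PER_TONE];
        for (int i = 0; i < data.length; i++) {
            double step = 2 * Math.PI * (BASE_HZ + (data[i] & 0xFF) * DELTA_HZ) / SAMPLE_RATE;
            for (int n = 0; n < SAMPLES_PER_TONE; n++) {
                pcm[i * SAMPLES_PER_TONE + n] = (short) Math.round(16_000 * Math.sin(step * n));
            }
        }
        return pcm;
    }
}
```

On clean synthetic input, `decode(synth(bytes))` should return the original bytes. A real recording adds noise, echo, and an unknown start offset, so you would still need some way to find the first tone and some error control on top of this.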

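Obviously, a longer length and a larger frequency-delta will make the signal more reliable, but they require the sound to be longer and to reach higher frequencies, respectively. The highest tone also has to stay below half the sample rate (22,050 Hz for 44.1 kHz audio), which leaves limited room above 18,500 Hz. So you will have to play around with these values to see what works.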
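The trade-off is easy to put into numbers before generating any audio. The helper below is only a back-of-the-envelope sketch: `ToneBudget` and its method names are hypothetical, and the "smallest useful delta is about 1 / window length" rule is a general rule of thumb for frequency resolution, not a hard guarantee for a noisy recording.

```java
final class ToneBudget {
    /** Total playing time of a message, at one tone per byte. */
    static double durationSeconds(int messageBytes, int toneMillis) {
        return messageBytes * toneMillis / 1000.0;
    }

    /** Frequency of the tone for byte value 255, the highest one used. */
    static double highestHz(double baseHz, double deltaHz) {
        return baseHz + 255 * deltaHz;
    }

    /** True if every tone stays below half the sample rate. */
    static boolean fits(double baseHz, double deltaHz, int sampleRate) {
        return highestHz(baseHz, deltaHz) < sampleRate / 2.0;
    }

    /** Rule of thumb: a window of T seconds separates tones about 1/T Hz apart. */
    static double roughMinDeltaHz(int toneMillis) {
        return 1000.0 / toneMillis;
    }
}
```

For example, a 100-byte message with 100 ms tones takes 10 seconds to play. Starting at 18,500 Hz, a 10 Hz delta tops out at 21,050 Hz and fits into 44.1 kHz audio, while a 20 Hz delta would need 23,600 Hz and does not.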

Some last thoughts, since your title says "hidden" binary data:

If you really want the data to be "hidden", consider encrypting it before encoding it to audio.
If you want to take the steganography approach, you will have to read up on audio steganography (I imagine you can use the above techniques, but you will have to insert them as extremely low-volume signals on top of the existing sound).
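For the encryption suggestion, a minimal sketch with the standard `javax.crypto` API could look like this. AES-GCM is my own pick here, `PayloadCrypto` is a hypothetical name, and how you create and share the key is left out entirely.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class PayloadCrypto {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    /** Returns a fresh random IV followed by the ciphertext and its tag. */
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = Arrays.copyOf(iv, IV_BYTES + body.length);
            System.arraycopy(body, 0, out, IV_BYTES, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); fails if even one bit of the message was altered. */
    static byte[] decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that this adds 28 bytes to every message (12 for the IV, 16 for the tag), and that a single wrongly decoded byte makes decryption fail outright. Any error correction therefore has to be applied to the encrypted bytes, before you try to decrypt.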

About the author: 梨涡
