Finding patterns in a binary file

Posted 2024-10-20 17:45:56

I'm working on a small project in C where I have to parse a binary file in an undocumented format. As I'm quite new to C, I have two questions for more experienced programmers.

The first seems to be an easy one. How do I extract all the strings from the binary file and put them into an array? Basically, I am looking for a simple implementation of the strings program in C.

When I open the binary file in any text editor I get a lot of rubbish with some readable strings mixed in. I can extract these strings by running strings on the command line. Now I'd like to do something similar in C, as in the pseudocode below:

i = 0
while (!EOF) {
     if (string found) {
          put it into array[i]
          i++
     }
}
return i;
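A minimal sketch of that loop in C, under a few assumptions: the file contents are already in memory (for example, read with fopen(path, "rb") and fread), a "string" means a run of at least 4 printable ASCII bytes (4 is the default minimum used by GNU strings), and the caller supplies a fixed-size array. The names extract_strings and MIN_LEN are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MIN_LEN 4  /* assumed threshold: shortest run that counts as a string */

/* Scan buf[0..len) for runs of at least MIN_LEN printable ASCII bytes.
 * Each run is copied into a malloc'd, NUL-terminated string stored in out[].
 * At most max_out strings are stored. Returns the number stored;
 * the caller is responsible for freeing each out[i]. */
size_t extract_strings(const unsigned char *buf, size_t len,
                       char **out, size_t max_out)
{
    size_t count = 0, start = 0;
    assert(buf != NULL && out != NULL);

    for (size_t i = 0; i <= len && count < max_out; ++i) {
        /* Still inside a printable run: keep going. */
        if (i < len && isprint(buf[i]))
            continue;

        /* A run ends at a non-printable byte or at the end of the buffer. */
        size_t run = i - start;
        if (run >= MIN_LEN) {
            char *s = malloc(run + 1);
            if (s == NULL)
                break;               /* out of memory: return what we have */
            memcpy(s, buf + start, run);
            s[run] = '\0';
            out[count++] = s;
        }
        start = i + 1;               /* next run starts after this byte */
    }
    return count;
}
```

Working on a buffer rather than reading byte by byte keeps the scanning logic separate from file I/O. For large files the same loop could be driven by fgetc instead, at the cost of tracking a partial run across reads.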

The second problem is a little more complicated and is, I believe, the proper way of achieving the same thing. When I look at the file in a hex editor, it's easy to notice some patterns. For example, before each string there is a byte of value 02 (0x02), followed by the length of the string and the string itself. For example, 02 18 52 4F 4F 54 4B 69 57 69 4B 61 4B 69 is one such record; the string part is 52 4F 4F 54 4B 69 57 69 4B 61 4B 69, which reads ROOTKiWiKaKi in ASCII.

Now the function I'm trying to create would work like this:

while (!EOF) {
     for (i = 0; i < buffer_size; ++i) {
          if (buffer[i] == 0x02) {
               int n = read the next byte;
               string = read the next n bytes as char;
               put string into array;
          }
     }
}
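One way this could look in C, as a sketch rather than a definitive parser. It assumes the whole file is in memory, that the byte after 0x02 is an unsigned length counted in bytes, and that a real string contains only printable ASCII. That last check is a heuristic added here to skip 0x02 bytes that merely happen to occur in other data. Note that in the sample record the length byte is 0x18 (24) while only 12 string bytes are shown, so the actual length encoding may differ from what is assumed. The names extract_tagged and MARKER are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MARKER 0x02  /* tag byte observed before each string */

/* Walk buf[0..len) looking for records of the assumed form
 *   MARKER, n, <n bytes of printable ASCII>
 * Each payload is copied into a malloc'd, NUL-terminated string in out[].
 * Returns the number of strings stored (at most max_out);
 * the caller frees each out[i]. */
size_t extract_tagged(const unsigned char *buf, size_t len,
                      char **out, size_t max_out)
{
    size_t count = 0, i = 0;
    assert(buf != NULL && out != NULL);

    while (i + 1 < len && count < max_out) {
        size_t n = buf[i + 1];       /* candidate length byte */
        int ok = (buf[i] == MARKER) && n > 0 && i + 2 + n <= len;

        /* Heuristic (assumption): reject payloads with non-printable bytes. */
        for (size_t k = 0; ok && k < n; ++k)
            ok = isprint(buf[i + 2 + k]) != 0;

        if (!ok) {
            ++i;                     /* no record here: retry one byte later */
            continue;
        }

        char *s = malloc(n + 1);
        if (s == NULL)
            break;                   /* out of memory: return what we have */
        memcpy(s, buf + i + 2, n);
        s[n] = '\0';
        out[count++] = s;
        i += 2 + n;                  /* skip the whole record */
    }
    return count;
}
```

Skipping past each accepted record (i += 2 + n) avoids re-scanning the payload, which the pseudocode above would do. If the format turns out to use a wider or differently defined length field, only the line that reads n and the skip amount need to change.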

Thanks for any pointers. :)
