Using grep to search for hex strings in a file

Posted on 2024-11-15 07:10:11

Does anyone know how to get grep, or a similar tool, to retrieve offsets of hex strings in a file?

I have a bunch of hexdumps (from GDB) that I need to check for strings and then run again and check if the value has changed.

I have tried hexdump and dd, but because the output is a stream, I lose the file offsets.

Someone must have had this problem and a workaround. What can I do?

To clarify:

I have a series of memory regions dumped from GDB (typically several hundred MB each).
I am trying to narrow down where a number is stored by searching for every location that holds it, then repeating the search after the value changes and checking whether the new value is stored at the same memory location.
I cannot get grep to do anything useful because I am looking for hex values; every time I have tried (a bazillion times, roughly) it has not given me the correct output.
The hex dumps are just complete binary files; the patterns are float values at the largest, so 8 bytes?
The patterns do not wrap across lines, as far as I am aware. I know what the value changes to, so I can run the same process again and compare the lists to see which offsets match.

Perl COULD be an option, but at this point I assume my lack of knowledge of bash and its tools is the main culprit.

Desired output format

It's a little hard to explain the output I am getting since I really am not getting any output.

I am anticipating (and expecting) something along the lines of:

<offset>:<searched value>

This is pretty much the standard output I would normally get with grep -URbFo <searchterm> . > <output>

What I tried:

A. When I try to search for hex values, grep is simply not searching for the hex values. If I search for 00 I should get something like a million hits, because that is always the blank space, but instead it searches for 00 as text, i.e. 3030 in hex.
Any ideas?

B. I CAN force it through hexdump or something of the like, but because it's a stream it will not give me the offsets and filename that it found a match in.

C. Using the grep -b option doesn't seem to work either. I tried all the flags that seemed useful for my situation, and nothing worked.

D. Using xxd -u /usr/bin/xxd as an example, I get output that would be useful, but I cannot use that for searching.

0004760: 73CC 6446 161E 266A 3140 5E79 4D37 FDC6  s.dF..&j1@^yM7..
0004770: BF04 0E34 A44E 5BE7 229F 9EEF 5F4F DFFA  ...4.N[."..._O..
0004780: FADE 0C01 0000 000C 0000 0000 0000 0000  ................

Nice output, just what I want to see, but it just doesn't work for me in this situation..

E. Here are some of the things I've tried since posting this:

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

他夏了夏天 2024-11-22 07:10:11

This seems to work for me:

LANG=C grep --only-matching --byte-offset --binary --text --perl-regexp "<\x-hex pattern>" <file>

short form:

LANG=C grep -obUaP "<\x-hex pattern>" <file>

Example:

LANG=C grep -obUaP "\x01\x02" /bin/grep

Output (cygwin binary):

153: <\x01\x02>
33210: <\x01\x02>
53453: <\x01\x02>

So you can grep this again to extract offsets. But don't forget to use binary mode again.

Note: LANG=C is needed to avoid utf8 encoding issues.
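
A minimal sketch of how the offsets could then be extracted and compared between two dumps, assuming GNU grep built with -P support. The helper name hex_offsets and the two tiny dump files are hypothetical, made up for illustration. LC_ALL=C is used instead of LANG=C because LC_ALL overrides LANG when both are set.

```shell
# hex_offsets FILE PATTERN: print the byte offset of each match, one per line.
# Limitation: if the matched bytes contain a newline (\x0a), the grep output
# spans several lines and the cut step would need extra filtering.
hex_offsets() {
    LC_ALL=C grep -obUaP -- "$2" "$1" | LC_ALL=C cut -d: -f1
}

workdir=$(mktemp -d)

# Hypothetical dumps: the first run stores 01 02 at offsets 1 and 5;
# in the second run the value at offset 5 has changed to 03 04.
printf '\000\001\002\000\000\001\002\377' > "$workdir/dump1.bin"
printf '\000\001\002\000\000\003\004\377' > "$workdir/dump2.bin"

# comm needs lexically sorted input, hence the plain sort.
hex_offsets "$workdir/dump1.bin" '\x01\x02' | sort > "$workdir/old.txt"
hex_offsets "$workdir/dump2.bin" '\x03\x04' | sort > "$workdir/new.txt"

# Offsets that held the old value before and hold the new value now.
candidates=$(comm -12 "$workdir/old.txt" "$workdir/new.txt")
echo "$candidates"

rm -rf "$workdir"
```

Only offsets present in both lists are printed, which is the narrowing-down step the question describes.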

绾颜 2024-11-22 07:10:11

There's also a pretty handy tool called binwalk, written in Python, which provides binary pattern matching (and quite a lot more besides). Here's how you would search for a binary string, with the offset printed in decimal and hex (from the docs):

$ binwalk -R "\x00\x01\x02\x03\x04" firmware.bin
DECIMAL     HEX         DESCRIPTION
--------------------------------------------------------------------------
377654      0x5C336     Raw string signature

入画浅相思 2024-11-22 07:10:11

We tried several things before arriving at an acceptable solution:

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....


root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

Then found we could get usable results with

xxd -u /usr/bin/xxd > /tmp/xxd.hex ; grep -H 'DF' /tmp/xxd.hex

Note that using a simple search target like 'DF' will incorrectly match characters that span across byte boundaries, i.e.

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....
--------------------^^

So we use an ORed regexp to search for ' DF' OR 'DF ' (the searchTarget preceded or followed by a space char).

The final result seems to be

xxd -u -ps -c 10000000000 DumpFile > DumpFile.hex
egrep ' DF|DF ' DumpFile.hex

0001020: 0089 0424 8D95 D8F5 FFFF 89F0 E8DF F6FF  ...$............
-----------------------------------------^^
0001220: 0C24 E871 0B00 0083 F8FF 89C3 0F84 DF03  .$.q............
--------------------------------------------^^
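
The lines above give the row of the dump, not the exact byte offset. A rough sketch of one way to get byte offsets from a text dump follows. It swaps xxd for od (an assumption, chosen because od -tx1 separates every byte with a space), and the helper name byte_offsets is hypothetical. Each byte becomes a three-character " xx" unit, so a match at character position p corresponds to byte p / 3, and a pattern cannot match across a byte boundary.

```shell
# byte_offsets FILE "HEX BYTES": print the byte offset of each match.
# HEX BYTES must be lowercase and space-separated, e.g. "df" or "ff df".
byte_offsets() {
    # Prepend one space so every byte is exactly " xx", then join all od
    # lines into a single line and squeeze runs of spaces.
    { printf ' '; od -An -v -tx1 "$1"; } |
        tr '\n' ' ' | tr -s ' ' |
        LC_ALL=C grep -obF -- " $2" |
        while IFS=: read -r off rest; do
            echo $((off / 3))
        done
}

sample=$(mktemp)
# Bytes: 0d ff df 00 df. A naive search for "df" in packed hex would also
# hit the 0d/ff boundary; here only the real df bytes match.
printf '\015\377\337\000\337' > "$sample"
byte_offsets "$sample" "df"      # prints 2 and 4
rm -f "$sample"
```

This builds one very long line, so for dumps of several hundred MB the grep -P approach on the binary file is probably the more practical choice.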

满意归宿 2024-11-22 07:10:11

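grep has a -P switch that allows Perl regexp syntax.
Perl regexps let you look at bytes using the \x.. syntax.

So you can look for a given hex string in a file with: grep -aP "\xdf"

But the output won't be very useful; it is indeed better to run a regexp on the hexdump output.

However, grep -P can be useful for just finding the files that match a given binary pattern,
or for doing a binary query of a pattern that actually occurs in text
(see for example How to regexp CJK ideographs (in utf-8)).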
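
A small sketch of the file-finding use, assuming GNU grep with -P and recursive search (-r). The helper name files_with_bytes and the demo files are hypothetical. The C locale is forced because in a UTF-8 locale \xdf may be interpreted as the character U+00DF and not as the raw byte.

```shell
# files_with_bytes DIR PATTERN: list files under DIR that contain the bytes.
files_with_bytes() {
    LC_ALL=C grep -rlaP -- "$2" "$1"
}

demo=$(mktemp -d)
printf 'plain text\n' > "$demo/a.txt"      # no 0xdf byte
printf '\000\337\000' > "$demo/b.bin"      # contains 0xdf
matches=$(files_with_bytes "$demo" '\xdf')
echo "$matches"                            # path of b.bin only
rm -rf "$demo"
```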

不必了 2024-11-22 07:10:11

I just used this:

grep -c $'\x0c' filename

to search for and count a page control character in the file.

So to include an offset in the output:

grep -b -o $'\x0c' filename | less

I am just piping the result to less because the character I am grepping for does not print well and less displays the results cleanly.

Output example:

21:^L
23:^L
2005:^L
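
A short sketch of the same search for shells that lack $'...' quoting, where printf can build the form-feed byte instead. The sample file and variable names are made up for illustration. It also shows that grep -c counts matching lines, while grep -o piped to wc -l counts individual occurrences.

```shell
ff=$(printf '\f')          # the form-feed byte (0x0c), built without $'...'

sample=$(mktemp)
# Two lines: the first holds two form feeds, the second holds one.
printf 'a\fb\f\nc\fd\n' > "$sample"

line_count=$(grep -c "$ff" "$sample")                      # matching lines
hit_count=$(grep -o "$ff" "$sample" | wc -l | tr -d ' ')   # occurrences
hit_offsets=$(grep -b -o "$ff" "$sample" | cut -d: -f1)    # byte offsets

echo "lines=$line_count hits=$hit_count"   # lines=2 hits=3
echo "$hit_offsets"                        # 1, 3 and 6, one per line
rm -f "$sample"
```

Note that a NUL byte cannot be passed this way, since shell arguments cannot contain NUL; that case needs a pattern-based form such as grep -P '\x00'.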
