Using grep to search for hex strings in a file

Posted on 2024-11-15 07:10:11

Does anyone know how to get grep, or a similar tool, to retrieve the offsets of hex strings in a file?

I have a bunch of hexdumps (from GDB) that I need to check for strings, and then run again and check whether the value has changed.

I have tried hexdump and dd, but because their output is a stream, I lose the offsets into the files.

Someone must have had this problem and a workaround. What can I do?

To clarify:

  • I have a series of memory regions dumped from GDB (typically several hundred MB).
  • I am trying to narrow down a number by searching for all the places where it is stored, then doing it again and checking whether the new value is stored at the same memory location.
  • I cannot get grep to do anything because I am looking for hex values, so every time I have tried (roughly a bazillion times) it has not given me the correct output.
  • The hex dumps are just complete binary files; the patterns are float values at the largest, so 8 bytes?
  • The patterns do not wrap across lines, as far as I am aware. I know what the value changes to, so I can run the same process again and compare the lists to see which offsets match.
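
The narrowing-down procedure described in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: it needs GNU grep built with -P support, the helper names offsets_of and stable_offsets are made up for this example, and the values are passed as plain hex strings such as 0102.

```shell
# offsets_of HEX FILE
# Print the decimal byte offset of every occurrence of HEX in FILE.
# (hypothetical helper; assumes GNU grep with -P support)
offsets_of() {
    # Turn a plain hex string like "0102" into "\x01\x02" and search for it.
    LC_ALL=C grep -obUaP "$(printf '%s' "$1" | sed 's/../\\x&/g')" "$2" |
        LC_ALL=C cut -d: -f1
}

# stable_offsets OLD_HEX DUMP1 NEW_HEX DUMP2
# Print the offsets that hold OLD_HEX in DUMP1 and NEW_HEX in DUMP2.
stable_offsets() {
    {
        offsets_of "$1" "$2"
        echo '--'                 # separator between the two offset lists
        offsets_of "$3" "$4"
    } | awk '$0 == "--" { second = 1; next }   # switch to the second list
             !second    { seen[$0] = 1; next } # remember first-list offsets
             $0 in seen'                       # print offsets found in both
}
```

For example, stable_offsets 0102 dump1.bin 0304 dump2.bin would list every offset where the first dump holds 01 02 and the second holds 03 04. Because grep works line by line, a pattern containing the byte 0a cannot be found this way.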

Perl COULD be an option, but at this point I assume my lack of knowledge of bash and its tools is the main culprit.

Desired output format

It's a little hard to explain the output I am getting since I really am not getting any output.

I am anticipating (and expecting) something along the lines of:

<offset>:<searched value>

This is pretty much the standard output I would normally get with grep -URbFo <searchterm> . > <output>

What I tried:

A. When I try to search for hex values, grep simply does not search for them as hex. If I search for 00 I should get something like a million hits, because that is always the blank space, but instead it searches for 00 as text, which in hex is 3030.
Any ideas?

B. I CAN force it through hexdump or something of the like, but because it's a stream it will not give me the offsets and the name of the file in which it found a match.

C. Using the grep -b option doesn't seem to work either. I tried all the flags that seemed useful for my situation, and nothing worked.

D. Using xxd -u /usr/bin/xxd as an example, I get output that would be useful, but I cannot use it for searching.

0004760: 73CC 6446 161E 266A 3140 5E79 4D37 FDC6  s.dF..&j1@^yM7..
0004770: BF04 0E34 A44E 5BE7 229F 9EEF 5F4F DFFA  ...4.N[."..._O..
0004780: FADE 0C01 0000 000C 0000 0000 0000 0000  ................

Nice output, just what I want to see, but it doesn't work for me in this situation.

E. Here are some of the things I've tried since posting this:

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

Comments (6)

他夏了夏天 2024-11-22 07:10:11

This seems to work for me:

LANG=C grep --only-matching --byte-offset --binary --text --perl-regexp "<\x-hex pattern>" <file>

Short form:

LANG=C grep -obUaP "<\x-hex pattern>" <file>

Example:

LANG=C grep -obUaP "\x01\x02" /bin/grep

Output (cygwin binary):

153: <\x01\x02>
33210: <\x01\x02>
53453: <\x01\x02>

So you can grep this again to extract offsets. But don't forget to use binary mode again.

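Note: LANG=C is needed to avoid UTF-8 encoding issues. If LC_ALL is set in your environment, use LC_ALL=C instead, since LC_ALL takes precedence over LANG.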
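
To go from that output to bare offsets, a small wrapper along these lines might help. It is a sketch, not a definitive implementation: it assumes GNU grep with -P support plus the usual sed, cut and awk, and hex_offsets is a hypothetical name chosen for this example. It takes a plain hex string, builds the \x pattern, and prints each offset in decimal and in hex so it can be compared against xxd or GDB addresses.

```shell
# hex_offsets HEX FILE
# Print "<decimal offset> <hex offset>" for every occurrence of HEX in FILE.
# (hypothetical helper; assumes GNU grep with -P support)
hex_offsets() {
    # "0102" becomes "\x01\x02", the form grep -P expects.
    LC_ALL=C grep -obUaP "$(printf '%s' "$1" | sed 's/../\\x&/g')" "$2" |
        LC_ALL=C cut -d: -f1 |                  # keep only the offset field
        awk '{ printf "%d 0x%x\n", $1, $1 }'    # show decimal and hex
}
```

Calling hex_offsets 0102 /bin/grep would print one line per match. Dropping the matched bytes with cut before awk keeps raw binary data away from the later stages of the pipeline.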

绾颜 2024-11-22 07:10:11

There's also a pretty handy tool called binwalk, written in Python, which provides binary pattern matching (and quite a lot more besides). Here's how you would search for a binary string; the output gives the offset in decimal and hex (from the docs):

$ binwalk -R "\x00\x01\x02\x03\x04" firmware.bin
DECIMAL     HEX         DESCRIPTION
--------------------------------------------------------------------------
377654      0x5C336     Raw string signature

入画浅相思 2024-11-22 07:10:11

We tried several things before arriving at an acceptable solution:

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....


root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....

We then found we could get usable results with

xxd -u /usr/bin/xxd > /tmp/xxd.hex ; grep -H 'DF' /tmp/xxd.hex

Note that a simple search target like 'DF' will also incorrectly match hex digits that straddle a byte boundary, e.g.

xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003  @.........S.....
--------------------^^

So we use an ORed regexp to search for ' DF' OR 'DF ' (the searchTarget preceded or followed by a space char).

The final result seems to be

xxd -u DumpFile > DumpFile.hex
grep -E ' DF|DF ' DumpFile.hex

0001020: 0089 0424 8D95 D8F5 FFFF 89F0 E8DF F6FF  ...$............
-----------------------------------------^^
0001220: 0C24 E871 0B00 0083 F8FF 89C3 0F84 DF03  .$.q............
--------------------------------------------^^
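
Another way to sidestep the byte-boundary problem is to print one byte per line, so a text match can never straddle two bytes. The following is a minimal sketch under assumptions: it relies on GNU od (the -w1 option is a GNU extension), byte_offsets is a hypothetical helper name, and it only handles a single-byte target.

```shell
# byte_offsets HH FILE
# Print the decimal offset of every byte in FILE equal to hex value HH.
# (hypothetical helper; assumes GNU od)
byte_offsets() {
    # od prints lowercase hex, so normalise the target first.
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    # -An: no address column, -v: never collapse repeated lines,
    # -tx1: one-byte hex units, -w1: one byte per output line.
    od -An -v -tx1 -w1 "$2" |
        awk -v want="$want" '$1 == want { print NR - 1 }'
}
```

With one byte per line, the line number minus one is the file offset, so byte_offsets DF DumpFile reports real DF bytes only and never the tail of 0D followed by the head of FF. On dumps of several hundred MB this will be noticeably slower than grep.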

满意归宿 2024-11-22 07:10:11

grep has a -P switch that enables Perl regexp syntax, and Perl regexps can match individual bytes using the \x.. syntax.

So you can look for a given hex string in a file with: grep -aP "\xdf"

But the output won't be very useful; it is indeed better to run the regexp on the hexdump output.

grep -P can still be useful, however, to just find files matching a given binary pattern, or to do a binary query for a pattern that actually occurs in text
(see, for example, How to regexp CJK ideographs (in utf-8)).
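
The "find files matching a given binary pattern" use could look something like this. It is a sketch assuming GNU grep with -P support; files_with_bytes is a made-up name for this example, and the pattern is passed in the same \x-escaped form as above.

```shell
# files_with_bytes PATTERN DIR
# List the files under DIR that contain the \x-escaped byte PATTERN.
# (hypothetical helper; assumes GNU grep with -P support)
files_with_bytes() {
    # -r: recurse into DIR, -l: print only the names of matching files,
    # -a: treat binary files as text, -P: Perl regexp so \x.. means a byte.
    LC_ALL=C grep -rlaP "$1" "$2"
}
```

For instance, files_with_bytes '\xde\xad' ./dumps would print the path of each dump that contains the two bytes DE AD somewhere, without printing any of the binary content itself.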

不必了 2024-11-22 07:10:11

I just used this:

grep -c $'\x0c' filename

to search for and count a page control character (form feed) in the file.

To include an offset in the output:

grep -b -o $'\x0c' filename | less

I am piping the result to less because the character I am grepping for does not print well, and less displays the results cleanly.
Output example:

21:^L
23:^L
2005:^L

少女的英雄梦 2024-11-22 07:10:11

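If you want to search for printable strings, you can use:

strings -ao filename | grep string

strings will output all printable strings from a binary along with their offsets, and grep will search within that output. Note that with GNU strings, -o prints the offsets in octal; use -t d or -t x instead for decimal or hex offsets.

If you want to search for any binary string, here is your friend: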
