Using grep to search for hex strings in a file
Does anyone know how to get grep, or similar tool, to retrieve offsets of hex strings in a file?
I have a bunch of hexdumps (from GDB) that I need to check for strings and then run again and check if the value has changed.
I have tried hexdump and dd, but because the data goes through a stream, I lose the offsets into the files.
Someone must have had this problem and a workaround. What can I do?
To clarify:
- I have a series of dumped memory regions from GDB (typically several hundred MB)
- I am trying to narrow down a number by searching for all the places the number is stored, then doing it again and checking if the new value is stored at the same memory location.
- I cannot get grep to do anything useful because I am looking for hex values, so every time I have tried (roughly a bazillion times) it has not given me the correct output.
- The hex dumps are just complete binary files. The patterns are float values at the largest, so 8 bytes?
- The patterns do not wrap across lines, as far as I am aware. I know what the value changes to, so I can run the same process again and compare the lists to see which offsets match.
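The compare-the-lists step from the last bullet could be sketched roughly as follows. This is a minimal sketch, assuming GNU grep built with `-P` (PCRE) support; the helper names `offsets_of` and `common_offsets` are hypothetical, and the byte patterns are passed as `\x..` escapes.

```shell
#!/usr/bin/env bash
# offsets_of PATTERN FILE (hypothetical helper)
# Print the byte offset of every match of a \x-escaped byte pattern.
offsets_of() {
  LC_ALL=C grep -obUaP "$1" "$2" | cut -d: -f1 | LC_ALL=C sort
}

# common_offsets OLD_PATTERN OLD_DUMP NEW_PATTERN NEW_DUMP (hypothetical helper)
# Print offsets where OLD_DUMP holds the old value and NEW_DUMP holds the new one.
common_offsets() {
  local old_list new_list
  old_list=$(mktemp)
  new_list=$(mktemp)
  offsets_of "$1" "$2" > "$old_list"
  offsets_of "$3" "$4" > "$new_list"
  # comm -12 keeps only lines present in both sorted lists.
  LC_ALL=C comm -12 "$old_list" "$new_list" | sort -n
  rm -f "$old_list" "$new_list"
}
```

For example, assuming little-endian 32-bit floats and placeholder file names, `common_offsets '\x00\x00\x80\x3f' dump1.bin '\x00\x00\x00\x40' dump2.bin` would list offsets where 1.0 was replaced by 2.0.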
Perl COULD be an option, but at this point I assume my lack of knowledge of bash and its tools is the main culprit.
Desired output format
It's a little hard to explain the output I am getting since I really am not getting any output.
I am anticipating (and expecting) something along the lines of:
<offset>:<searched value>
Which is pretty much the standard output I would normally get with grep -URbFo <searchterm> . > <output>
What I tried:
A. The problem is that when I try to search for hex values, grep does not actually search for the hex values. If I search for 00 I should get about a million hits, because that is always the blank space, but instead it searches for 00 as text, which in hex is 3030.
Any ideas?
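The difference described in A can be reproduced on a tiny sample. The following is only a sketch, assuming GNU grep with `-P` support; the sample file and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Build a 6-byte sample: 'A', two NUL bytes, 'B', then the ASCII text "00".
sample=$(mktemp)
printf 'A\000\000B00' > "$sample"

# Dump the raw bytes: the text "00" is stored as 30 30, not as 00.
od -An -v -tx1 "$sample"

# Fixed-string search: this only finds the ASCII text "00".
text_hits=$(LC_ALL=C grep -obUaF '00' "$sample" | cut -d: -f1 | tr '\n' ' ')

# Byte search: \x00 in a Perl-style regex matches the NUL byte itself.
byte_hits=$(LC_ALL=C grep -obUaP '\x00' "$sample" | cut -d: -f1 | tr '\n' ' ')

echo "text \"00\" found at offset(s): $text_hits"
echo "byte 0x00 found at offset(s): $byte_hits"
rm -f "$sample"
```

The two searches report different offsets because the first looks for the characters `0` `0` and the second for a zero byte.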
B. I CAN force it through hexdump or something of the like, but because it is a stream it will not give me the offsets or the filename in which it found a match.
C. Using the grep -b option doesn't seem to work either. I tried all the flags that seemed useful for my situation, and nothing worked.
D. Using xxd -u /usr/bin/xxd as an example, I get output that would be useful, but I cannot use it for searching.
0004760: 73CC 6446 161E 266A 3140 5E79 4D37 FDC6 s.dF..&j1@^yM7..
0004770: BF04 0E34 A44E 5BE7 229F 9EEF 5F4F DFFA ...4.N[."..._O..
0004780: FADE 0C01 0000 000C 0000 0000 0000 0000 ................
Nice output, just what I want to see, but it doesn't work for me in this situation.
E. Here are some of the things I've tried since posting this:
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
This seems to work for me:
short form:
Example:
Output (cygwin binary):
So you can grep this again to extract offsets. But don't forget to use binary mode again.
Note: LANG=C is needed to avoid UTF-8 encoding issues.
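The commands from this answer are not shown above. As a rough sketch of the "grep again to extract offsets" step, assuming GNU grep with `-P` support, the offsets can be cut out of the `offset:match` lines and printed in decimal and hex. The helper name `hex_offsets` is hypothetical, and `LC_ALL=C` is used here instead of `LANG=C` because `LC_ALL`, when set, overrides `LANG`.

```shell
#!/usr/bin/env bash
# hex_offsets PATTERN FILE (hypothetical helper)
# PATTERN is a Perl-style byte pattern such as '\xfa\xde'.
hex_offsets() {
  # -o: print only the match, -b: prefix it with its byte offset,
  # -U / -a: treat the file as binary but process it like text,
  # -P: enable Perl-style \x.. byte escapes.
  LC_ALL=C grep -obUaP "$1" "$2" |
    cut -d: -f1 |
    while read -r off; do
      # Print each offset in decimal and in xxd-style hex.
      printf '%d 0x%07x\n' "$off" "$off"
    done
}
```

A call such as `hex_offsets '\xfa\xde' /usr/bin/xxd` prints one line per hit. Since grep is line-oriented, a pattern that contains the byte 0x0A will not match this way.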
There's also a pretty handy tool called binwalk, written in Python, which provides binary pattern matching (and quite a lot more besides). Here's how you would search for a binary string, with the offset printed in decimal and hex (from the docs):
We tried several things before arriving at an acceptable solution:
Then found we could get usable results with
Note that using a simple search target like 'DF' will incorrectly match characters that span across byte boundaries, i.e.
So we use an ORed regexp to search for ' DF' OR 'DF ' (the searchTarget preceded or followed by a space char).
The final result seems to be
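The answer's own commands did not survive in this copy. As a related variant rather than the author's command, the byte-boundary problem can also be avoided by giving every byte its own space-prefixed slot, so a match always starts on a byte. This sketch swaps in `od` for `xxd`; the helper name `od_offsets` is hypothetical, and the whole dump is flattened to a single line of text, which may be slow or memory-hungry on dumps of several hundred MB.

```shell
#!/usr/bin/env bash
# od_offsets HEXBYTES FILE (hypothetical helper)
# HEXBYTES is space-separated hex, e.g. "DF" or "fa de".
od_offsets() {
  # od prints every byte as two hex digits; tr flattens the dump into one
  # line of the form " 41 42 df ...", i.e. exactly three characters per byte.
  od -An -v -tx1 "$2" |
    tr -s ' \n' '  ' |
    grep -iob " $1" |
    # A text offset of 3*N corresponds to byte offset N.
    awk -F: '{ printf "%d\n", $1 / 3 }'
}
```

Because the pattern starts with a space, `DF` can no longer match the tail of one byte plus the head of the next. Overlapping occurrences of a multi-byte pattern are reported only once, since `grep -o` does not return overlapping matches.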
grep has a -P switch that enables Perl regexp syntax.
Perl regexes can match individual bytes using the \x.. syntax,
so you can look for a given hex string in a file with:
grep -aP "\xdf"
but the output won't be very useful; it is indeed better to run the regexp on the hexdump output.
grep -P can still be useful, however, just to find files matching a given binary pattern,
or to do a binary query for a pattern that actually occurs in text
(see for example How to regexp CJK ideographs (in utf-8) )
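The "find files matching a given binary pattern" use can be sketched as below, assuming GNU grep with `-P` support. The helper name `files_with_bytes` is hypothetical; `-r` and `-l` are added here to recurse into a directory and print only file names.

```shell
#!/usr/bin/env bash
# files_with_bytes PATTERN DIR (hypothetical helper)
# List the files under DIR that contain the given \x-escaped byte pattern.
files_with_bytes() {
  # LC_ALL=C makes \xdf match the raw byte rather than a UTF-8 character.
  LC_ALL=C grep -rlaP "$1" "$2" | LC_ALL=C sort
}
```

For instance, `files_with_bytes '\xdf' ./dumps` would print the names of the dumps in a (placeholder) `./dumps` directory that contain the byte 0xDF.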
I just used this:
\x0c' filename
to search for and count a page control character in the file.
So to include an offset in the output:
I am just piping the result to less because the character I am grepping for does not print well, and less displays the results cleanly.
Output example:
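Only a fragment of the command is left above. A minimal sketch of what such a search might look like, assuming bash (for the `$'...'` ANSI-C quoting that the `\x0c'` fragment suggests) and hypothetical helper names:

```shell
#!/usr/bin/env bash
# count_ff_lines FILE (hypothetical helper)
# Count the lines that contain a form feed (page control character, 0x0C).
count_ff_lines() {
  # $'\x0c' makes bash pass the literal byte 0x0C to grep.
  grep -a -c $'\x0c' "$1"
}

# ff_offsets FILE (hypothetical helper)
# Print the byte offset of every form feed in the file.
ff_offsets() {
  grep -a -b -o $'\x0c' "$1" | cut -d: -f1
}
```

`grep -c` counts matching lines rather than individual occurrences, so a line with two form feeds counts once. This quoting cannot express a NUL byte, because shell strings end at the first NUL.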
If you want to search for printable strings, you can use:
strings will output all printable strings from a binary along with their offsets, and grep will search within them.
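The command itself is missing from this copy of the answer. A sketch of the strings-plus-grep approach, assuming the `strings` utility (for example from binutils) is installed and using a hypothetical helper name:

```shell
#!/usr/bin/env bash
# string_offsets TEXT FILE (hypothetical helper)
# Print "offset string" for every printable string in FILE that contains TEXT.
string_offsets() {
  # -a scans the whole file; -t d prefixes each string with its decimal offset
  # (-t x would print the offset in hex instead).
  strings -a -t d "$2" | grep -F -- "$1"
}
```

The offset printed is where the printable string starts, not where the searched text starts within it, and strings shorter than the default minimum length of 4 characters are not listed.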
If you want to search for any binary string, here is your friend: