"Tail" a binary file based on string location using bash?
I've got a bunch of binary files, each containing an embedded string near the end of the file, but at a different position in each one (the string occurs only once per file). I need to extract the part of the file from the location of the string to the end of the file and dump it into a new file.
E.g., if the file's contents are "AWREDEDEDEXXXERESSDSDS" and the string of interest is "XXX", then the part of the file I need is "XXXERESSDSDS".
What's the easiest way to do this in bash?
Perl has a built-in variable ($') that refers specifically to the part of the string after the matched regular expression. That is the method I would use. It goes beyond Bash and the standard utilities, but Perl is so commonly installed that you should be OK.
The following is a small shell hack that is not very performant, but it works.
Write the script file tail.sh as follows:
Call it as: tail.sh INPUTNAME OUTPUTNAME PATTERN
P.S.: Sorry, I forgot one grep option in the first version of this post.
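The body of tail.sh did not survive in this copy of the answer. As a stand-in, here is a minimal sketch of a script with the same calling convention, assuming GNU grep (for the -a, -b and -o options) and a tail that accepts -c +N. It is not the original author's script, and the function name tail_from_pattern is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch with the same arguments as tail.sh: INPUTNAME OUTPUTNAME PATTERN.
# tail_from_pattern is a hypothetical name; GNU grep is assumed.
tail_from_pattern() {
    local input=$1 output=$2 pattern=$3
    local offset
    # -a: treat binary data as text, -b: print the byte offset,
    # -o: report the match itself (so -b gives the match offset),
    # -F: fixed string, not a regex. Output looks like "10:XXX".
    offset=$(LC_ALL=C grep -a -b -o -F -e "$pattern" "$input" | head -n 1 | cut -d: -f1)
    if [ -z "$offset" ]; then
        echo "pattern not found in $input" >&2
        return 1
    fi
    # grep offsets are 0-based, while tail -c +N starts at byte N counted from 1.
    tail -c "+$((offset + 1))" "$input" > "$output"
}
```

With the example from the question, grep reports offset 10 for "XXX", so tail starts copying at byte 11 and the output file contains "XXXERESSDSDS".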
Would strings and grep do what you want? E.g.:
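The example that followed this answer is missing here. One way the strings and grep idea might be used, sketched under the assumption that GNU binutils strings is installed: strings -t d prints each printable run with its decimal byte offset, but that offset marks where the run starts, not where the marker sits inside it. The hypothetical helper below uses awk in place of a plain grep so that it can add the marker's position within the run.

```shell
#!/usr/bin/env bash
# Sketch: find the byte offset of a marker using strings output.
# marker_offset_via_strings and MARKER are illustrative names (assumptions).
marker_offset_via_strings() {
    local marker=$1 infile=$2
    # -a: scan the whole file, -t d: prefix each run with its decimal offset,
    # -n: minimum run length, set to the marker length so short runs still show.
    strings -a -t d -n "${#marker}" "$infile" |
        MARKER=$marker LC_ALL=C awk '
            {
                match($0, /^ *[0-9]+ /)        # offset column plus one space
                run = substr($0, RLENGTH + 1)  # the printable run itself
                pos = index(run, ENVIRON["MARKER"])
                # Offset of the run plus the 0-based position inside the run.
                if (pos > 0) { print $1 + pos - 1; exit }
            }'
}
```

The printed offset is 0-based, so the tail of the file can then be copied with `tail -c +$((offset + 1)) file > newfile`.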
I came up with this solution:
ls -1 *.bin
    Print only the filenames with the extension "bin", one per line.
xargs strings -n4 --radix=d -f
    List all the strings in each file along with their positions, and include the filename in the output.
grep "string"
    Print the lines containing "string" (it occurs only once in each file).
awk '{sub(/:/, ""); print $2 " " $1 " " $1".";}'
    Remove the colon that strings adds after the filename, then print the position of the string, the filename, and the filename followed by a period (this line is used as the arguments for the split command).
xargs -l1 split -b
    Run the split command once per line, using the awk output as the remaining arguments.
rm *.aa
    Delete the first part of each split file. "aa" is the default suffix for the first part.
There are probably better, faster, or safer ways of doing this, but it's fine for my purposes.
Try this:
Since you have binary data, you might want to redirect the output to a file.
Or pipe it through hexdump or similar for testing: