How do I perform a recursive directory search for strings within files in a UNIX Tru64 environment?
Unfortunately, due to the limitations of our Unix Tru64 environment, I am unable to use the grep -r switch to search for strings within files across multiple directories and subdirectories.
Ideally, I would like to pass two parameters. The first is the directory where the search should start. The second is a file containing a list of all the strings to search for. This list will consist of various directory path names and will include special characters:
e.g.:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc.
The purpose of this exercise is to identify all shell scripts that may contain the specific hard-coded path names in my list.
I found one example during my investigations that perhaps comes close, but I am not sure how to customize it to accept a file of string arguments:
e.g.: find etb -exec grep test {} \;
where 'etb' is the directory and 'test' is a hard-coded string to search for.
Comments (2)
This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir is the directory from which searching will commence
strings.txt is the file of strings to match, one per line
-F means treat search strings as literal rather than regular expressions
-f strings.txt means use the strings in strings.txt for matching
You can add -l to the grep switches if you just want the names of the files that match.
Footnote: Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
which is perhaps a little more robust/efficient in some cases (it needs a find that supports -print0 and an xargs that supports -0, which older Unix systems may lack).
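The question asks for something that takes two parameters. A minimal sketch of such a wrapper around the find/grep command above might look like this. The function name search_strings is made up for illustration, and the sketch assumes the local grep accepts -l, -F and -f together, which has not been checked on Tru64.

```shell
# search_strings DIR STRINGS_FILE   (hypothetical helper name)
# Prints the name of every regular file under DIR that contains at least
# one of the literal strings listed in STRINGS_FILE (one string per line).
search_strings() {
    dir=$1
    strings_file=$2
    # -l: print only the names of matching files
    # -F: treat each line of STRINGS_FILE as a literal string, not a regexp
    # -f: read the strings to search for from STRINGS_FILE
    find "$dir" -type f -exec grep -l -F -f "$strings_file" {} \;
}
```

It would be called as search_strings /some/start/dir strings.txt. Keep strings.txt outside the directory being searched, otherwise it matches itself. To see the matching lines rather than just the file names, drop -l and add /dev/null after {} so that grep prefixes each line with the file name.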
From reading the question, I assume we cannot use GNU coreutils and that egrep is not available.
I also assume that (for some reason) the system is broken and escapes do not work as expected.
Under normal circumstances,
grep -rf patternfile.txt /some/dir/
is the way to go.
Assumptions: GNU coreutils are not available, grep -r does not work, and handling of special characters is broken.
Do you have a working awk? No? It would make life so much easier, but let's be on the safe side.
Assume: a working sed, and one of od, hexdump, or xxd (from the vim package), is available.
Let's call the list of strings patternfile.txt.
1. Convert the list into a regexp that grep likes
Suppose patternfile.txt contains a list of paths (the example does not print the special characters, but they are there). We must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming the echo -en command is not broken, and xxd, od, or hexdump is available:
Using hexdump
cat patternfile.txt | hexdump -ve '1/1 "%02x \n"' | tr -d '\n'
Using od
cat patternfile.txt | od -A n -t x1 | tr -d '\n'
and pipe the output into (common to both hexdump and od)
| sed 's:[ ]*0a[ ]*$::g' | sed 's: 0a:\\|:g' | sed 's:^[ ]*::g' | sed 's:^: :g' | sed 's: :\\x:g'
then pipe the result into
| sed 's:^:\\(:g' | sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
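For comparison, here is a simpler sketch of the same idea that skips the hex round trip: escape the regexp metacharacters on each line with sed, then join the lines with | using paste. This is a different technique from the hexdump pipeline above, not a reproduction of it. The function name build_alternation is hypothetical, and the result is an extended regexp, so it assumes grep -E (or egrep) turns out to be usable after all.

```shell
# build_alternation PATTERN_FILE   (hypothetical helper name)
# Prints one extended regexp of the form (a|b|c) that matches any line of
# PATTERN_FILE literally.
build_alternation() {
    # 1. drop empty lines (an empty alternative would match everything)
    # 2. put a backslash in front of every ERE metacharacter
    # 3. join all lines with '|'
    # 4. wrap the result in parentheses
    sed -e '/^$/d' -e 's/[][(){}.*+?^$|\\]/\\&/g' "$1" |
        paste -s -d '|' - |
        sed -e 's/^/(/' -e 's/$/)/'
}
```

With a patternfile.txt holding /foo/, /bar/doe/ and /root/ on separate lines, this prints (/foo/|/bar/doe/|/root/), which could then be used as grep -E "$(build_alternation patternfile.txt)" somefile.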
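2. Feed the escaped pattern into the broken regexp
Assuming the bare minimum of shell escaping is available, we use
grep "$(echo -en "ESCAPED_PATTERN")"
to do our job.
3. To sum it up
Building the escaped regexp pattern by chaining the commands above (using hexdump as the example) will hex-escape all characters and enclose them in ( | ) brackets so that a regexp OR match is performed. Note that the \| alternation in a basic regexp is a GNU grep extension, so this step may not work with a strict POSIX grep.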
4. Recursive directory lookup
Under normal circumstances, even when grep -r is broken,
find /dir/ -exec grep REGEXP_PATTERN {} \;
should work.
Some may prefer xargs instead (unless you happen to have a buggy xargs). We would prefer the
find /somedir/ -type f -print0 | xargs -0 grep -f 'patternfile.txt'
approach, but since this is not available (for whatever valid reason), we need to exec grep for each file, and this is normally the wrong way. But let's do it.
Assume: find -type f works.
Assume: xargs is broken or not available.
First, if you have a buggy pipe, it might not handle a large number of files. So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
If your shell handles large lists nicely,
for file in `cat list-of-all-file-to-search-for.txt` ; do grep REGEXP_PATTERN "$file" ; done ;
is a nice way to get by. Unfortunately, some systems do not like that, and in that case you may need
cat list-of-all-file-to-search-for.txt | split -a 4 -d -l 2000 - file-smaller-chunk.part.
to turn it into smaller chunks (-d, numeric suffixes, is a GNU split option and can be dropped elsewhere). Now, this is for a seriously broken system.
Then a
for file in file-smaller-chunk.part.* ; do for single_line in `cat "$file"` ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt | while read file ; do grep REGEXP_PATTERN "$file" ; done ;
may be used as a workaround on some systems.
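As a rough sketch, the while-read workaround can be wrapped in a small function that reads the saved file list and greps each entry in turn. The name search_file_list is made up for illustration, and grep -l -F -f is assumed to work on the target system. File names containing newlines are not handled.

```shell
# search_file_list LIST_FILE PATTERN_FILE   (hypothetical helper name)
# LIST_FILE holds one file name per line, e.g. the output of find -type f.
# Prints the names of the listed files that contain any literal string
# from PATTERN_FILE.
search_file_list() {
    list=$1
    patterns=$2
    # IFS= and -r keep leading blanks and backslashes in file names intact
    while IFS= read -r file; do
        grep -l -F -f "$patterns" "$file"
    done < "$list"
}
```

Redirecting the list into the loop instead of piping it from cat avoids one extra process. Quoting "$file" is what lets names with spaces through, which the unquoted for loops above cannot do.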
What if my shell does not handle quotes?
You may have to escape the file list beforehand.
It can be done much more nicely in awk, perl, whatever, but since we restrict ourselves to sed, let's do it.
We assume 0x27, the ' character, will actually work.
cat list-of-all-file-to-search-for.txt | sed 's@['\'']@'\''\\'\'\''@g' | sed 's:^:'\'':g' | sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
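The sed quoting step is easier to read when written with double quotes around the sed scripts. The sketch below wraps each input line in single quotes and rewrites every embedded ' as '\'' so the line can be pasted back into a Bourne-style shell as one word. The function name quote_lines is hypothetical.

```shell
# quote_lines   (hypothetical helper name)
# Reads lines on stdin and prints each one single-quoted for the shell.
quote_lines() {
    # first expression: replace every ' with '\''
    # second and third: add the opening and closing quote
    sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/"
}
```

For example, a line reading it's here comes out as 'it'\''s here', which a shell parses back into the original text.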
What if my shell does not handle that?
xargs fails, grep -r fails, and the shell's for loop fails. Do we have other options? Yes.
Escape all input suitably for your shell, and generate a script.
But you know what, I got bored, and writing automated scripts for csh just seems wrong. So I am going to stop here.
Take-home note
Use the right tool for the job. Writing an interpreter in bc is perfectly possible, but it is just plain wrong. Install coreutils, perl, a better grep, whatever. It makes life better.