How can I do a recursive directory search for strings within files in a UNIX Tru64 environment?

Posted 2024-10-01 02:33:49

Unfortunately, due to the limitations of our Unix Tru64 environment, I am unable to use the grep -r switch to search for strings within files across multiple directories and subdirectories.

Ideally, I would like to pass two parameters. The first is the directory where I want the search to start. The second is a file containing a list of all the strings to be searched for. This list will consist of various directory path names and will include special characters:

e.g.:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc.

The purpose of this exercise is to identify all shell scripts that may contain any of the hard-coded path names in my list.

There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize it to accept a file of string arguments:

e.g.: find etb -exec grep test {} \;

where 'etb' is the directory and 'test' is a hard-coded string to be searched for.


Comments (2)

墨离汐 2024-10-08 02:33:49

This should do it:

find dir -type f -exec grep -F -f strings.txt {} \;

dir is the directory from which searching will commence

strings.txt is the file of strings to match, one per line

-F means treat search strings as literal rather than regular expressions

-f strings.txt means use the strings in strings.txt for matching

You can add -l to the grep switches if you just want filenames that match.
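
As a minimal sketch of the two-parameter interface the question asks for, the command above could be wrapped in a shell function. The name search_paths and its argument checks are hypothetical, not part of the original answer. Because grep is handed one file per invocation here, it does not prefix matching lines with the file name, so the sketch uses -l to report names only.

```shell
#!/bin/sh
# search_paths START_DIR STRINGS_FILE   (hypothetical helper name)
# Prints the name of every regular file under START_DIR that contains at
# least one of the literal strings listed, one per line, in STRINGS_FILE.
search_paths() {
    start_dir=$1
    strings_file=$2
    if [ ! -d "$start_dir" ] || [ ! -r "$strings_file" ]; then
        echo "usage: search_paths START_DIR STRINGS_FILE" >&2
        return 2
    fi
    # -F: treat each line as a literal string, not a regular expression
    # -f: read the strings from a file
    # -l: print only the names of files that match
    find "$start_dir" -type f -exec grep -F -l -f "$strings_file" {} \;
}
```

A call such as search_paths etb strings.txt would then list the matching files. Two things to watch for: an empty line in the strings file matches every line of every file, and a strings file stored under the start directory will be reported as matching itself.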

Footnote:

Some people prefer a solution involving xargs, e.g.

find dir -type f -print0 | xargs -0 grep -F -f strings.txt

which is perhaps a little more robust/efficient in some cases, provided your find and xargs support -print0 and -0 (extensions that an older system may lack).
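
Where -print0 and -0 are missing, a plain find | xargs pipeline is one possible fallback. The sketch below is an assumption-laden variant, not the answer's own command: the function name search_paths_xargs is hypothetical, and plain xargs splits on whitespace, so it is only safe when no file name under the start directory contains spaces, quotes, or newlines. The extra /dev/null operand makes sure grep always sees at least two files and therefore prefixes each matching line with its file name.

```shell
#!/bin/sh
# search_paths_xargs START_DIR STRINGS_FILE   (hypothetical helper name)
# Prints "file:matching line" for every match under START_DIR.
# Assumes file names contain no whitespace, quotes, or newlines.
search_paths_xargs() {
    # /dev/null never matches; it only forces grep to print file names
    # even when xargs passes a single file in a batch.
    find "$1" -type f -print | xargs grep -F -f "$2" /dev/null
}
```

Compared with -exec ... \; this starts far fewer grep processes, which is where the efficiency gain comes from.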

如歌彻婉言 2024-10-08 02:33:49

From reading the question, I assume we cannot use GNU coreutils, and that egrep is not available.
I also assume (for some reason) the system is broken, and escapes do not work as expected.

Under normal situations, grep -rf patternfile.txt /some/dir/ is the way to go.

"a file containing a list of all the strings to be searched"

Assumptions: GNU coreutils is not available. grep -r does not work. Handling of special characters is broken.

Now, do you have a working awk? No? It would make life so much easier, but let's be on the safe side.

Assume: a working sed, and one of od, hexdump, or xxd (from the vim package), is available.

Let's call the file of strings patternfile.txt.


1. Convert list into a regexp that grep likes

Example patternfile.txt contains

/foo/

/bar/doe/

/root/

(The example does not show any special characters, but assume they are there.) We must turn it into something like

(/foo/|/bar/doe/|/root/)

Assuming the echo -en command is not broken, and xxd, od, or hexdump is available,

Using hexdump

cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'

Using od

cat patternfile.txt |od -A n -t x1|tr -d '\n'

and pipe it into (common to both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe the result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have an escaped regexp pattern.
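
For comparison, here is a shorter sketch of the same idea (turning the list into one OR pattern) that skips the hex round trip. It rests on assumptions the answer above deliberately does not make: that grep -E (or egrep) and paste are available and that quoting works normally. The function name build_alternation is hypothetical.

```shell
#!/bin/sh
# build_alternation PATTERN_FILE   (hypothetical helper name)
# Prints a single extended regular expression of the form (a|b|c) that
# matches any line of PATTERN_FILE literally.
build_alternation() {
    # 1. Put a backslash before every character that is special in an ERE.
    # 2. Join the lines with '|'.
    # 3. Wrap the result in parentheses.
    sed 's/[][\\.*^$+?(){}|]/\\&/g' "$1" |
        paste -s -d '|' - |
        sed 's/^/(/; s/$/)/'
}
```

It could then be used as grep -E "$(build_alternation patternfile.txt)" somefile. Unlike the hex pipeline, this version also neutralises regexp metacharacters such as . and * inside the strings.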


2. Feed the escaped pattern into the broken grep

Assuming the bare minimum of shell escaping is available,
we use grep "$(echo -en "ESCAPED_PATTERN")" to do our job.


3. To sum it up

Building an escaped regexp pattern (using hexdump as the example):

grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"

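This writes every character as a \x escape and encloses the alternatives in \( \| \), so that grep performs a regexp OR match. Note that once echo -en has decoded the bytes, characters such as . or * in the strings still act as regexp operators.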

4. Recursive directory lookup

Under normal situations, even when grep -r is broken, find /dir/ -exec grep PATTERN {} \; should work.
Some may prefer xargs instead (unless you happen to have a buggy xargs).
We would prefer the find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since
that is not available (for whatever valid reason),
we need to exec grep once for each file, which is normally the wrong way to do it.
But let's do it.

Assume: find -type f works.
Assume: xargs is broken or not available.

First, if you have a buggy pipe, it might not handle a large number of files.
So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).

find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt

If your shell handles large lists nicely,
for file in $(cat list-of-all-file-to-search-for.txt) ; do grep REGEXP_PATTERN "$file" ; done
is a nice way to get by. Unfortunately, some systems do not like that,
and in that case you may need
split -a 4 -l 2000 list-of-all-file-to-search-for.txt file-smaller-chunk.part.
to turn the list into smaller chunks. Now, this is for a seriously broken system.
Then
for file in file-smaller-chunk.part.* ; do for single_line in $(cat "$file") ; do grep REGEXP_PATTERN "$single_line" ; done ; done
should work.

Alternatively,
cat filelist.txt |while read file ; do grep REGEXP_PATTERN "$file" ; done
may be used as a workaround on some systems.
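
The read loop above can be sketched as a small function. This is only an illustration under the assumption that the shell supports read -r and that grep -F -f works; the name grep_listed_files is hypothetical. Reading line by line with IFS= and -r keeps file names containing spaces or backslashes intact, which the for ... in $(cat ...) form does not, since that form splits on any whitespace.

```shell
#!/bin/sh
# grep_listed_files LIST_FILE STRINGS_FILE   (hypothetical helper name)
# LIST_FILE holds one file name per line, e.g. the output of find -type f.
# Prints the name of each listed file containing any string in STRINGS_FILE.
grep_listed_files() {
    list_file=$1
    strings_file=$2
    while IFS= read -r file; do
        # One grep per file: slow, but needs neither xargs nor grep -r.
        grep -F -l -f "$strings_file" "$file"
    done < "$list_file"
    # The last grep may have found nothing; that is not an error here.
    return 0
}
```

File names that themselves contain a newline still cannot be represented in such a list.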

What if my shell does not handle quotes?

You may have to escape the file list beforehand.

It can be done much more nicely in awk, perl, or whatever, but since we are restricting ourselves to
sed, let's do it.
We assume 0x27, the ' character, will actually work.
cat list-of-all-file-to-search-for.txt |sed 's@['\'']@'\''\\'\'\''@g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
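
The same quoting step can be written as a single sed call, which may be easier to read. This is a sketch with a hypothetical function name, shell_quote_lines, and it assumes a sed that accepts multiple -e expressions. Each input line comes out wrapped in single quotes, with every embedded ' rewritten as '\'' so that a POSIX shell reads the line back as one word.

```shell
#!/bin/sh
# shell_quote_lines   (hypothetical helper name)
# Reads lines on stdin and prints each one single-quoted for the shell.
shell_quote_lines() {
    # 1st expression: replace each ' with '\'' (close quote, escaped
    #                 quote, reopen quote)
    # 2nd expression: add the opening quote at the start of the line
    # 3rd expression: add the closing quote at the end of the line
    sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/"
}
```

For example, a line reading it's a file is emitted as 'it'\''s a file', which eval or a generated script turns back into the original name.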

What if my shell does not handle that?

xargs fails, grep -r fails, the shell's for loop fails.

Do we have other options? Yes.

Escape all the input in a form suitable for your shell, and generate a script.

But you know what, I got bored, and writing automated scripts for csh just seems
wrong. So I am going to stop here.

Take-home note

Use the right tool for the job. Writing an interpreter in bc is perfectly
possible, but it is just plain wrong. Install coreutils, perl, a better grep,
whatever. It makes life better.
