Handling metacharacters in a search string

Posted 2024-08-19 07:23:10

I have user input that will be used in a search string and may contain metacharacters, for example C# or C++.

My grep command in a function was:

grep -E "$1|$2" test.txt

With the values substituted directly:

grep -E "C\+\+|testWord" test.txt
grep -E "C\#|testWord" test.txt

The first matched the expected lines, but the second did not.
Strangely, the # was completely ignored.
Without direct substitution, both commands match anything with c followed by testWord instead of c++ and c# respectively.

I tried escaping the input with sed:

$temp = `echo $1 | sed 's/[\#\!\&\;\`\"\'\|\*\?\~\<\>\^\(\)\[\]\{\}\$\+\\]/\\&/g'`

but it doesn't work correctly.
Is there another way to handle user input that contains metacharacters?

Thanks in advance

夜雨飘雪 2024-08-26 07:23:10

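If you are passing the input as arguments to the script:

#!/bin/bash

input1="$1"
input2="$2"
# Read the file line by line; IFS= and -r keep each line exactly as written.
while IFS= read -r line
do
    case "$line" in
        # Quoted expansions are matched literally, so *, ? and [ in the input are not globs.
        *"$input1"*|*"$input2"* ) echo "found: $line";;
    esac
done < "BooksDB.txt"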

Output:

$ cat file
this is  a line
this line has C++ and C#
this line has only C++ and that's it
this line has only C# and that's it
this is end line Caa

$ ./shell.sh C++ C#
found: this line has C++ and C#
found: this line has only C++ and that's it
found: this line has only C# and that's it

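If you are getting the input from read:

read -r -p "Enter input1: " input1
read -r -p "Enter input2: " input2
while IFS= read -r line
do
    case "$line" in
        # Both alternatives need a leading and a trailing * to match anywhere in the line.
        *"$input1"*|*"$input2"* ) echo "found: $line";;
    esac
done < "BooksDB.txt"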
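
The same loop can be sketched as a reusable function. This is only an illustration: the name match_literal and the argument order (file first, then the two terms) are assumptions, not part of the answer. The expansions inside the case pattern are quoted so that glob characters in the input, such as * or ?, are matched literally rather than as wildcards.

```shell
# Hypothetical helper: print every line of FILE ($1) that contains
# TERM1 ($2) or TERM2 ($3) as a literal substring.
match_literal() {
    while IFS= read -r line; do
        case "$line" in
            # Quoting "$2" and "$3" turns off glob interpretation inside them.
            *"$2"*|*"$3"*) printf 'found: %s\n' "$line" ;;
        esac
    done < "$1"
}
```

For example, match_literal BooksDB.txt 'C++' 'C#' would report only the lines that contain C++ or C#. No regular expression engine is involved, so nothing needs escaping.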

帅的被狗咬 2024-08-26 07:23:10

This works for me:

$ testfun1(){ echo "foo $1" | grep "$1"; }
$ testfun1 C#
foo C#
$ testfun2(){ read a; echo "bar $a" | grep "$a"; }
$ testfun2
C#
bar C#

Edit:

You might try this form without -E. In a basic regular expression + is an ordinary character, and GNU grep accepts \| for alternation:

$ testfun3(){ grep "$1\|$2" test.txt; }
$ testfun3 C++ awk
something about C++
blah awk blah
$ testfun3 C# sed
blah sed blah
the text containing C#
$ testfun3 C# C++
something about C++
the text containing C#
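
The form above handles C++ and C# because + and # are ordinary characters without -E, but characters such as ., *, [ and ^ in the input would still be read as regex syntax. If the terms are always meant literally, one option the answer does not mention is grep's -F (fixed strings) mode with one -e per term. A minimal sketch, where search_fixed is a hypothetical name and the file is assumed to be passed as the first argument:

```shell
# Hypothetical helper: print lines of FILE ($1) that contain TERM1 ($2)
# or TERM2 ($3), treating both terms as fixed strings, not regexes.
search_fixed() {
    # -F: fixed-string matching; each -e supplies one alternative.
    # "--" ends option parsing so a file name starting with "-" is safe.
    grep -F -e "$2" -e "$3" -- "$1"
}
```

Called as search_fixed test.txt 'C++' 'C#', this needs no escaping at all, at the cost of giving up regex features entirely.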

何以笙箫默 2024-08-26 07:23:10

Just quote all the grep metacharacters in $1 and $2 before adding them to your grep expression.

Something like this:

quoted1=`echo "$1" | sed -e 's/\([]\.?^${}+*[]\)/\\\\\1/g'`
quoted2=`echo "$2" | sed -e 's/\([]\.?^${}+*[]\)/\\\\\1/g'`
grep -E "$quoted1|$quoted2" test.txt

ought to work. Adjust the metacharacter list to suit. Handling | needs a little care: with -E a bare | is the alternation operator and \| matches a literal pipe, so the two quoted terms are joined with a plain |, and | (along with parentheses) should be added to the list if the input can contain them.
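
The escaping step can also be sketched as a pair of small functions. The names ere_quote and search_either are hypothetical, the file is assumed to be the first argument, and the bracket list is an assumption that adds |, ( and ) to the characters escaped above. It targets extended regular expressions only, so it should not be reused with plain grep, where backslashing + or ? can make them special.

```shell
# Hypothetical helper: backslash-escape ERE metacharacters in a string.
# Inside the bracket expression, "]" comes first and "[" last so both
# are taken literally; "&" in the replacement is the matched character.
ere_quote() {
    printf '%s\n' "$1" | sed 's/[]\.|?^$(){}+*[]/\\&/g'
}

# Hypothetical helper: print lines of FILE ($1) that contain TERM1 ($2)
# or TERM2 ($3), each escaped so grep -E matches it literally.
search_either() {
    grep -E -e "$(ere_quote "$2")|$(ere_quote "$3")" -- "$1"
}
```

With these definitions, ere_quote 'C++' prints C\+\+ and ere_quote 'C#' prints C# unchanged, so search_either test.txt 'C++' 'C#' runs grep -E with the pattern C\+\+|C#.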
