Return strings that match but are not identical

Posted 2025-01-25 01:26:15 · 227 words · 1 view · 0 comments

Is there any way to find a word that contains a given string but is not an exact match for it? For example:

# cat t.txt
first line
ind is a shortform of india

I am trying to return the word "india" because it contains the string "ind", but I do not want the exact match "ind" itself. I have tried this:

# grep -o 'ind' t.txt
ind
ind

疏忽 2025-02-01 01:26:15

Would you please try the following:

grep -Eo '[A-Za-z]+ind|ind[A-Za-z]+' t.txt

Output:

india

The regex [A-Za-z]+ind|ind[A-Za-z]+ matches ind together with the letters that immediately precede it or immediately follow it.
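Another way to get the same kind of result, sketched here on the assumption that a grep supporting `-o` (GNU or BSD) is available, is to split the job in two: first extract every run of letters that contains ind, then drop the runs that are exactly ind. The third line of the sample input is an illustrative addition to cover the mid-word case.

```shell
# Sample input; the last line is an illustrative addition, not from the question.
input='first line
ind is a shortform of india
this fooindbar is the mid-word text'

# Step 1: -o prints each run of letters containing "ind" on its own line.
# Step 2: -v -x drops lines that are exactly "ind", leaving only longer words.
result=$(printf '%s\n' "$input" |
  grep -Eo '[[:alpha:]]*ind[[:alpha:]]*' |
  grep -vx 'ind')

printf '%s\n' "$result"
```

On this sample it should print india and fooindbar, each on its own line. Keeping the "contains" test and the "is not exactly" test in separate steps makes each pattern easier to read, at the cost of a second process.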

风吹短裙飘 2025-02-01 01:26:15

$ grep -Eo '[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+' file
india
fooindbar

The above was run on this input file (note the added test case of ind appearing in the middle of a word instead of just at the start or end):

$ cat file
first line
ind is a shortform of india
this fooindbar is the mid-word text

You can do the same with GNU awk (for multi-char RS, RT, and \s shorthand for [[:space:]]) if you prefer:

$ awk -v RS='\\s+' '/[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+/' file
india
fooindbar

or:

$ awk -v RS='[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+' 'RT{print RT}' file
india
fooindbar
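If GNU awk is not available, a similar result can be sketched with POSIX awk alone by looping over whitespace-separated fields instead of changing RS. This is only a rough equivalent: it treats each field as a word, so attached punctuation would count as part of the word, and the variable name `s` is just an illustrative choice.

```shell
# Same sample input as the file shown above.
input='first line
ind is a shortform of india
this fooindbar is the mid-word text'

# index($i, s) is non-zero when field $i contains s as a plain substring;
# $i != s excludes the field that is exactly s.
result=$(printf '%s\n' "$input" |
  awk -v s='ind' '{ for (i = 1; i <= NF; i++) if (index($i, s) && $i != s) print $i }')

printf '%s\n' "$result"
```

Because index() does a plain substring search rather than a regex match, the search string needs no escaping even if it contains characters such as . or *.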

丘比特射中我 2025-02-01 01:26:15

I would use GNU AWK for this task in the following way. Let the content of file.txt be

first line
ind is a shortform of india

then

awk 'BEGIN{RS="[[:space:]]+"}match($0,/ind/)&&length>RLENGTH{print}' file.txt

output

india

Explanation: I tell GNU AWK that the record separator (RS) is one or more whitespace characters, so every word is treated as a record. Then, for every record (that is, every word), I use the match function, which returns the starting position of the match if one is found and 0 otherwise, and sets the RSTART and RLENGTH values. If a match is found, I check whether the length of the current record (that is, the word) is greater than the length of the match; if so, I print the word. Note that every word is output on its own line, so for example if the input file content were

india ind india ind india

then the output would be

india
india
india