Return strings that match but are not an exact match
Is there any way to find a word that contains a given string but is not an exact match? For example:
# cat t.txt
first line
ind is a shortform of india
I am trying to return the word "india" because it contains the string "ind", but I do not want the exact match "ind" itself. I have tried this:
# grep -o 'ind' t.txt
ind
ind
Comments (3)
Would you please try the following:
Output:
The regex
[A-Za-z]+ind|ind[A-Za-z]+
matches ind together with the letters that precede or follow it.
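As a minimal sketch of how the regex above could be applied, assuming GNU grep with the -E (extended regex) and -o (print only the match) options: the helper name find_partial is hypothetical, and the sample input is the t.txt content from the question.

```shell
# Hypothetical helper: print only the parts of each line that match
# "ind" with at least one letter before it or at least one letter after it.
find_partial() {
    # -E: extended regex, so + and | need no backslashes
    # -o: print each match on its own line instead of the whole line
    grep -Eo '[A-Za-z]+ind|ind[A-Za-z]+' "$@"
}

# Sample input taken from the question (t.txt), fed through stdin.
printf 'first line\nind is a shortform of india\n' | find_partial
# prints: india
```

The standalone word ind is skipped because neither alternative can match it without an extra letter. For a word such as windy this pattern reports only wind, since each alternative extends the match on one side only.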
The above was run on this input file (note the added test case of ind appearing in the middle of a string instead of just at the start or end):
You can do the same with GNU awk (for multi-char RS, RT, and the \s shorthand for [[:space:]]) if you prefer:
or:
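The commands for this answer are not shown here, so the following is only a sketch of the multi-character RS idea, not the author's original code. It assumes an awk that accepts a regular expression as RS (GNU awk does; POSIX leaves multi-character RS unspecified), it does not use RT, and it spells out [[:space:]] rather than the \s shorthand. The helper name partial_words and the sample input are hypothetical.

```shell
# Hypothetical helper: print every whitespace-separated word that
# contains "ind" but is not exactly "ind".
partial_words() {
    # RS is a regex matching any run of whitespace, so each word
    # becomes its own record instead of each line.
    awk -v RS='[[:space:]]+' '/ind/ && $0 != "ind"' "$@"
}

# Assumed sample input: "ind" alone, at the start of a word,
# and in the middle of a word.
printf 'ind is a shortform of india\nthe windy day\n' | partial_words
# prints:
# india
# windy
```

Because the whole word is the record, a word with ind in the middle is printed in full rather than truncated.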
I would use GNU AWK for this task in the following way. Let file.txt content be
then
output
Explanation: I inform GNU AWK that the row separator (RS) is one or more whitespace characters, so that every word is treated as a row. Then for every row (that is, every word) I use the match function, which returns the position of the match if one is found and 0 otherwise, and sets the RSTART and RLENGTH values. If a match is found, I check whether the length of the current row (that is, the word) is greater than the length of the match, and if so I print that word. Note that every word is output on its own line, so for example if the input file content were
then the output would be
(tested in gawk 4.2.1)
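The explanation above can be sketched roughly as follows. This is a reconstruction from the description, not necessarily the author's exact command: the helper name longer_than_match is hypothetical, the first input is the sample from the question, and the second input (the same word repeated on one line) is an assumed example. It assumes an awk that accepts a regex RS, as GNU awk does.

```shell
# Hypothetical helper following the described steps.
longer_than_match() {
    awk 'BEGIN { RS = "[[:space:]]+" }   # every word becomes one record
         # match() sets RSTART and RLENGTH and returns 0 when there is no match;
         # print the word only if it is longer than the matched text.
         match($0, /ind/) && length($0) > RLENGTH { print }' "$@"
}

# Sample input from the question.
printf 'first line\nind is a shortform of india\n' | longer_than_match
# prints: india

# Assumed input with several qualifying words on one line:
# each one is printed on its own line.
printf 'india india india\n' | longer_than_match
# prints india three times, one per line
```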