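Regex to capture only alphanumeric strings in shell
I am trying to write a regex to capture the alphanumeric values shown below, but it also captures the purely numeric values. What is the correct way to get the desired output?
Code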
grep -Eo '(\[[[:alnum:]]\)\w+' file > output
$ cat file
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
2022-04-29 08:46:12,454 [143] [mwalkc] [548] This is a single line
2022-04-29 08:49:12,554 [143] [skhat2] [549] This is a single line
2022-04-29 09:40:13,852 [5] [narl12] [550] This is a single line
2022-04-29 09:45:14,754 [1426] [Y23467] [550] This is a single line
Current output -
[14
[Y23467
[546
[15
[fpes
[547
[143
[mwalkc
[548
[143
[skhat2
[549
[5
[narl12
[550
[1426
[Y23467
[550
Expected output -
Y23467
fpes
mwalkc
skhat2
narl12
Y23467
Comments (5)
1st solution: With the samples shown, `awk` can do this: use the `gsub` function to strip `[` and `]` from the 4th field, then print that field.
2nd solution: GNU `grep` can also do this with a suitable regex.
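The code for the first solution is not shown here, so the following is only a minimal sketch of the `awk` approach it describes. It assumes whitespace-separated fields and uses a hypothetical temporary file holding two of the sample lines from the question.

```shell
# Hypothetical sample file with two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
EOF

# Remove every "[" and "]" from the 4th whitespace-separated field,
# then print that field.
result=$(awk '{ gsub(/\[|\]/, "", $4); print $4 }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

For these two lines the sketch prints `Y23467` and `fpes`. It relies on the wanted value always being the 4th field.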
Using `[[:alnum:]]` or `\w` means that the pattern can match any alphanumeric or word character, so digit-only values match as well.
If the value may contain digits but must contain at least one letter a-z, and `-P` for a Perl-compatible regex is supported, the pattern can be built from the following parts.
Explanation
`\[` Match `[`
`\K` Forget what is matched so far
`\d*[A-Za-z]` Match optional digits and at least a single char a-zA-Z
`[\dA-Za-z]*` Match optional chars a-zA-Z and digits
`(?=])` Assert `]` to the right
If there can be only 1 occurrence, you might also use `sed` with a capture group `\(...\)` and use the group in the replacement using `\1`.
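Joining the listed parts in order gives a single pattern. A minimal sketch, assuming GNU `grep` built with `-P` (PCRE) support and a hypothetical temporary file holding three of the sample lines:

```shell
# Hypothetical sample file with three lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
2022-04-29 08:49:12,554 [143] [skhat2] [549] This is a single line
EOF

# -o prints only the matched text. \K discards the leading "[" from the
# match, and the lookahead (?=]) requires a closing "]" without consuming it.
result=$(grep -oP '\[\K\d*[A-Za-z][\dA-Za-z]*(?=])' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Because the pattern demands at least one letter, the digit-only tokens such as `[14]` and `[546]` are skipped, leaving `Y23467`, `fpes` and `skhat2`.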
There are several parts to your problem. First I'll try to help you with your regex (but it will probably unlock more problems); next I'll show you an alternative.
The Regex
The thing to understand about `[[:alnum:]]` is that it matches any single alphanumeric character. So a run of them will match "123", and it will match "abc", as all of those characters are alphanumeric. It judges each character individually and cannot capture "only sections that have both numbers and letters" like what you want.
However, by chaining several `grep`s together, we could filter out lines which only contain numbers.
To refine this further, there look to be a couple of bugs in your regex. First, you have included `\[` inside the captured part, so you should change `(\[` to `\[(` to move the `[` outside of the captured part in parentheses `( ... )`. Note, though, that `grep -o` prints the whole match rather than just the captured group, so the `[` still has to be stripped in a separate step.
Next, your combination of `[[:alnum:]]` with `\w+` probably doesn't do what you expect. It looks for a single alphanumeric character, followed by one or more "word" characters (which is all the alphanumerics, plus some extra ones). You probably want `([[:alnum:]]+)` instead of `([[:alnum:]])\w+`.
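One way the chained-`grep` idea might look, as a sketch only: the first `grep` extracts every bracketed alphanumeric token, the second keeps only tokens that contain a letter, and `tr` strips the brackets. It assumes the wanted values always contain at least one letter; the temporary file is hypothetical.

```shell
# Hypothetical sample file with two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:46:12,454 [143] [mwalkc] [548] This is a single line
2022-04-29 09:40:13,852 [5] [narl12] [550] This is a single line
EOF

# Stage 1: print each "[token]" on its own line (brackets included).
# Stage 2: keep only the tokens that contain at least one letter.
# Stage 3: delete the "[" and "]" characters.
result=$(grep -Eo '\[[[:alnum:]]+]' "$sample" | grep '[[:alpha:]]' | tr -d '[]')
printf '%s\n' "$result"

rm -f "$sample"
```

For these two lines the pipeline prints `mwalkc` and `narl12`.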
Alternative
Why not use `cut` instead? `cut -d' ' -f4` will take the 4th field (with "space" as the delimiter between fields).
If you also want to remove the square brackets, strip them in a further pipeline step.
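A minimal sketch of the `cut` alternative. The command that removed the brackets is not shown in the answer, so the `tr -d` step is an assumption, as is the hypothetical temporary file. It also assumes fields are separated by exactly one space and that the wanted value is always the 4th field.

```shell
# Hypothetical sample file with two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:49:12,554 [143] [skhat2] [549] This is a single line
EOF

# Take the 4th space-delimited field, then delete "[" and "]".
result=$(cut -d' ' -f4 "$sample" | tr -d '[]')
printf '%s\n' "$result"

rm -f "$sample"
```

This selects by position rather than by content, so it would return a digit-only value if one ever appeared in the 4th field.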
Using `sed`
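No command accompanies this answer here, so the following is only one possible `sed` sketch and not necessarily what the answerer wrote. It assumes each line holds exactly one bracketed token containing a letter, and uses a hypothetical temporary file.

```shell
# Hypothetical sample file with two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
2022-04-29 09:45:14,754 [1426] [Y23467] [550] This is a single line
EOF

# Capture a bracketed alphanumeric token that contains at least one letter,
# replace the whole line with the captured group (\1), and print only the
# lines where the substitution succeeded (-n ... p).
result=$(sed -n 's/.*\[\([[:alnum:]]*[[:alpha:]][[:alnum:]]*\)].*/\1/p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

For these two lines the sketch prints `fpes` and `Y23467`.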
Using `FPAT` with GNU `awk`:
Setting `FPAT` as `'[[[:alnum:]]*]'`, we match a `[` char followed by zero or more alphanumeric chars followed by a `]` char.
With the `gsub()` function we remove the initial `[` and final `]` chars.
We print the field before the last field, i.e. the `$(NF-1)` field, without the `[` and `]` characters.
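The steps above might be put together as follows. This is a sketch under assumptions: GNU awk is installed under the name `gawk` (on some systems plain `awk` is already GNU awk), every line has exactly three bracketed tokens as in the sample, and the temporary file is hypothetical.

```shell
# Hypothetical sample file with two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 09:40:13,852 [5] [narl12] [550] This is a single line
EOF

# FPAT defines what a field looks like, so only the bracketed tokens
# become fields and NF is 3 for each sample line.
result=$(gawk -v FPAT='[[[:alnum:]]*]' '{
  id = $(NF-1)               # the field before the last one
  gsub(/^\[|\]$/, "", id)    # drop the leading "[" and the trailing "]"
  print id
}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

For these two lines the sketch prints `Y23467` and `narl12`. `FPAT` is a gawk extension, so other awk implementations will not split the fields this way.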