Regex to capture only alphanumeric strings in shell

Posted on 2025-01-24 23:46:48

I'm trying to write a regex that captures the given alphanumeric values, but it is also capturing the other, purely numeric values. What is the correct way to get the desired output?

Code:

grep -Eo '(\[[[:alnum:]])\w+' file > output

$ cat file
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
2022-04-29 08:46:12,454 [143] [mwalkc] [548] This is a single line
2022-04-29 08:49:12,554 [143] [skhat2] [549] This is a single line
2022-04-29 09:40:13,852 [5] [narl12] [550] This is a single line
2022-04-29 09:45:14,754 [1426] [Y23467] [550] This is a single line

Current output:

[14
[Y23467
[546
[15
[fpes
[547
[143
[mwalkc
[548
[143
[skhat2
[549
[5
[narl12
[550
[1426
[Y23467
[550

Expected output:

Y23467
fpes
mwalkc
skhat2
narl12
Y23467

尘世孤行 2025-01-31 23:46:48

1st solution: With your shown samples, try the following awk code. It uses the gsub function to strip [ and ] from the 4th field and then prints that field.

awk '{gsub(/\[|\]/,"",$4);print $4}' Input_file

2nd solution: With GNU grep, try the following.

grep -oP '^[0-9]{4}(-[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2},[0-9]{1,3} \[[0-9]+\] \[\K[^]]*' Input_file

Explanation: a detailed breakdown of the regex used with GNU grep above.

^[0-9]{4}(-[0-9]{2}){2}  ##From the start of the line, match 4 digits followed by two groups of a dash and 2 digits (the date).
 [0-9]{2}(:[0-9]{2}){2}  ##Match a space, then 2 digits followed by two groups of a colon and 2 digits (the time).
,[0-9]{1,3}              ##Match a comma followed by 1 to 3 digits.
 \[[0-9]+\] \[\K         ##Match a space, [, one or more digits, ], a space and another [,
                         ##then use \K to discard everything matched so far.
[^]]*                    ##Match everything up to the first ], which is the value we want.
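
To try the awk variant against a few of the sample lines, here is a minimal sketch. The temporary file and the variable names (sample, result) are only for illustration and are not part of the original answer. It relies on the date and time being fields 1 and 2, so the second bracketed value is field 4.

```shell
# Write three of the sample log lines from the question to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line
2022-04-29 08:45:11,764 [15] [fpes] [547] This is a single line
2022-04-29 09:40:13,852 [5] [narl12] [550] This is a single line
EOF

# Strip both brackets from the 4th whitespace-separated field, then print it.
result=$(awk '{gsub(/\[|\]/,"",$4); print $4}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

If the layout of the line ever changes (for example an extra space-separated column before the brackets), the field number would need adjusting.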

淑女气质 2025-01-31 23:46:48

Using [[:alnum:]] or \w means the pattern matches any alphanumeric or word character, digits included, so the digit-only values match as well.

If the value can contain digits but must contain at least one letter a-z or A-Z, and -P (Perl-compatible regex) is supported:

grep -oP '\[\K\d*[A-Za-z][\dA-Za-z]*(?=])' file

Explanation

\[ Match [
\K Forget what is matched so far
\d*[A-Za-z] Match optional digits and at least a single char a-zA-Z
[\dA-Za-z]* Match optional chars a-zA-Z and digits
(?=]) Assert ] to the right

Output

Y23467
fpes
mwalkc
skhat2
narl12
Y23467

If there can be only 1 occurrence, you might also use sed with a capture group \(...\) and use the group in the replacement using \1

sed 's/.*\[\([[:digit:]]*[[:alpha:]][[:alnum:]]*\)].*/\1/' file
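
If -P is not available, the same idea can probably be approximated with an extended regex plus tr. This is only a sketch: -o is not required by POSIX, although both GNU and BSD grep provide it, and the variable names line and result are made up for the example.

```shell
# One sample line from the question.
line='2022-04-29 08:49:12,554 [143] [skhat2] [549] This is a single line'

# Keep only bracketed tokens that contain at least one letter,
# then delete the surrounding brackets with tr.
result=$(printf '%s\n' "$line" |
  grep -Eo '\[[0-9]*[A-Za-z][0-9A-Za-z]*]' |
  tr -d '[]')

printf '%s\n' "$result"
```

Because there is no \K or lookahead in an extended regex, the brackets are part of the match and have to be removed in a second step.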

雄赳赳气昂昂 2025-01-31 23:46:48

There are several parts to your problem. First I'll try to help you with your regex (but it will probably unlock more problems); next I'll show you an alternative.

The Regex

The thing to understand about [[:alnum:]] is that it matches any single alphanumeric character, so a run of them matches "123" just as readily as "abc", since all of those characters are alphanumeric. It judges each character individually and cannot express "only sections that have both numbers and letters" like what you want.

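However, by chaining a couple of greps together, we can filter out the matches that contain only digits. The second grep uses -v without -o, because -o prints only matched text and a line that does not match has none.

grep -Eo '(\[[[:alnum:]])\w+' file | grep -Ev '^\[[[:digit:]]+$' > output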

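To refine this further, there look to be a couple of bugs in your regex. First, you have included \[ inside the captured part, so you should change (\[ to \[( to move the [ outside of the captured part in parentheses ( ... ). Note, though, that grep -o always prints the whole match regardless of groups, so the [ still shows up in the output unless you use a tool that can print just the group (such as sed) or strip it afterwards.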

Next, your combination of [[:alnum:]] with \w+ probably doesn't do what you expect. It looks for a single alphanumeric character, followed by one or more "word" characters (which is all the alphanumerics, plus some extra ones). You probably want ([[:alnum:]]+) instead of ([[:alnum:]])\w+
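
To see the per-character behaviour in isolation, here is a small sketch with three made-up tokens. The second pattern, which forces at least one [[:alpha:]] into the token, is one possible way to reject digit-only values and is not something the original regex does.

```shell
# Three candidate tokens: digits only, letters only, and mixed.
tokens=$(printf '123\nabc\nY23467\n')

# [[:alnum:]]+ accepts all three, because each character is tested on its own.
any_alnum=$(printf '%s\n' "$tokens" | grep -E '^[[:alnum:]]+$')

# Requiring one [[:alpha:]] somewhere in the token rejects the digits-only one.
with_letter=$(printf '%s\n' "$tokens" |
  grep -E '^[[:digit:]]*[[:alpha:]][[:alnum:]]*$')

printf 'any alnum:\n%s\n' "$any_alnum"
printf 'with a letter:\n%s\n' "$with_letter"
```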

Alternative

Why not use cut instead? cut -d' ' -f4 will take the 4th field (with "space" as the delimiter between fields)

$ cut -d' ' -f 4 file 
[Y23467]
[fpes]
[mwalkc]
[skhat2]
[narl12]
[Y23467]

If you also want to remove the square brackets, try

$ cut -d' ' -f 4 file | grep -Eo '\w+'
Y23467
fpes
mwalkc
skhat2
narl12
Y23467

猫瑾少女 2025-01-31 23:46:48

Using sed

$ sed 's/\([^[]*\[\)\{2\}\([^]]*\).*/\2/' input_file
Y23467
fpes
mwalkc
skhat2
narl12
Y23467
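
For readers who find the backslash-heavy basic regex hard to follow, an equivalent can likely be written with sed -E (extended regex, supported by GNU and BSD sed). This is a sketch, with line and result as illustrative names, and the comments walk through the pieces.

```shell
# One sample line from the question.
line='2022-04-29 08:45:11,754 [14] [Y23467] [546] This is a single line'

# ([^[]*\[){2}  skip up to and including the 2nd opening bracket
# ([^]]*)       capture everything up to the next closing bracket (group 2)
# .*            consume the rest of the line
# \2            replace the whole line with the captured value
result=$(printf '%s\n' "$line" | sed -E 's/^([^[]*\[){2}([^]]*).*/\2/')

printf '%s\n' "$result"
```

Like the original command, this picks the second bracketed value by position, so it does not check whether that value contains a letter.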

狼亦尘 2025-01-31 23:46:48

Using FPAT with GNU awk:

awk -v FPAT='[[[:alnum:]]*]' '{gsub(/^\[|\]$/, "",$(NF-1));print $(NF-1)}' file
Y23467
fpes
mwalkc
skhat2
narl12
Y23467

Setting FPAT to '[[[:alnum:]]*]' makes each field a run of zero or more [ or alphanumeric characters followed by a ] character, which on these samples is each bracketed value such as [Y23467].
With the gsub() function we remove the initial [ and final ] characters.
We print the field before the last one, i.e. the $(NF-1) field, without the [ and ] characters.
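
FPAT is specific to GNU awk. Where only a POSIX awk is available, one possible alternative is to split on the bracket characters themselves. This is a sketch with illustrative variable names, and it assumes the wanted value always sits in the second pair of brackets.

```shell
# One sample line from the question.
line='2022-04-29 09:45:14,754 [1426] [Y23467] [550] This is a single line'

# -F'[][]' sets the field separator to "either ] or [", so the text inside
# the second pair of brackets becomes field 4.
result=$(printf '%s\n' "$line" | awk -F'[][]' '{print $4}')

printf '%s\n' "$result"
```

With this separator the fields alternate between text outside and inside brackets, which is why the second bracketed value lands in $4 rather than $2.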
