Awk script help - logic problem

Posted 2024-07-07 03:07:09

I'm currently writing a simple .sh script to parse an Exim log file for strings matching "o'". Currently, output.txt contains nothing but a 0 on every line (606 lines). I'm guessing my logic is wrong, as awk does not throw any errors.

Here is my code (updated for concatenation and counter issues). Edit: I've adopted some new code from dmckee's answer, which I'm now using in place of the old code for simplicity's sake.

awk '/o'\''/ {
         line = "> ";
         for(i = 20; i <= 33; i++) {
           line = line " " $i;
         }
         print line;
    }' /var/log/exim/main.log > output.txt

Any ideas?

EDIT: For clarity's sake, I'm grepping for "o'" in email addresses, because ' is an illegal character in email addresses (and in our databases it appears only in o'-prefixed names).

EDIT 2: As per commentary request, here is a sanitized sample of some desired output:

[xxx.xxx.xxx.xxx] kathleen.o'[email protected] <kathleen.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] julie.o'[email protected] <julie.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] james.o'[email protected] <james.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] daniel_o'[email protected] <aniel_o'[email protected]> routing defer (-51): retry time not reached

The reason I'm starting at 20 in my loop is that everything before the 20th field is just standard log information that I don't need here. All I need for this solution is everything from the IP onward. (The messages for each 550 error differ between the mail servers in use out there; I'm compiling a list of common ones.)


Comments (4)

不美如何 2024-07-14 03:07:09

+ means numerical addition in awk. To concatenate, just write the constants and/or expressions next to each other, separated by spaces.

So, this

line += " " + $i

should become

line = line " " $i
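To see where the column of zeros comes from: awk converts both operands of + to numbers, and a string with no leading numeric text converts to 0. Here is a minimal sketch of the two behaviours. The sample line and the function names add_fields and join_fields are made up for illustration and are not taken from the real log or script.

```shell
#!/bin/sh
# Hypothetical one-line input; the field contents are made up for illustration.
sample='alpha beta gamma'

# Numeric addition: " " and every non-numeric field convert to 0,
# so the accumulated value stays 0.
add_fields() {
    printf '%s\n' "$sample" |
        awk '{ line = ""; for (i = 1; i <= NF; i++) line += " " + $i; print line }'
}

# Concatenation: expressions written next to each other are joined as strings.
join_fields() {
    printf '%s\n' "$sample" |
        awk '{ line = ">"; for (i = 1; i <= NF; i++) line = line " " $i; print line }'
}

add_fields    # prints 0
join_fields   # prints > alpha beta gamma
```

With the += version, every matching log line yields a single 0, which matches the 606 lines of 0 described in the question.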

EDIT: If (and only if) the fields in Exim log files (I am more into Postfix :) are separated by a single space, wouldn't the following be simpler?

grep -F o\' /var/log/exim/main.log | cut -d\  -f20-33 >output.txt

云仙小弟 2024-07-14 03:07:09

There is no real need for grep here. Let awk select the matching lines for you (and fix your concatenation bug, as per ΤΖΩΤΖΙΟΥ):

awk '/o'\''/ {
             line = "> ";
             for(i = 20; i <= 33; i++) {
               line = line " " $i;
             }
             print line;
        }' /var/log/exim/main.log > output.txt

Of course, you end up needing some weird escaping if you do it at the prompt like above. It is cleaner in a script...


Edit: On the first pass I missed the += problem...

I am also assuming that the sample line you gave above is partial, as it has only 13 or so fields (by default, fields are whitespace-delimited).
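One way to check the field-count assumption is to print NF for matching lines before hard-coding 20 and 33. The sketch below also shows how awk's default splitting differs from a cut call with a single-space delimiter. The sample line and function names are hypothetical, not taken from the real log.

```shell
#!/bin/sh
# Made-up line with TWO spaces between the first two fields.
sample="a  b c.o'd e"

# awk's default splitting treats any run of blanks as one separator,
# so this line has 4 fields.
count_fields() {
    printf '%s\n' "$sample" | awk '/o'\''/ { print NF }'
}

# cut with a single-space delimiter sees an empty field between the two spaces.
second_field_cut() {
    printf '%s\n' "$sample" | cut -d' ' -f2
}

# awk skips the repeated space and returns the second word.
second_field_awk() {
    printf '%s\n' "$sample" | awk '{ print $2 }'
}

count_fields       # prints 4
second_field_cut   # prints an empty line
second_field_awk   # prints b
```

If NF on the real log lines turns out to be smaller than 20, the loop from 20 to 33 only ever appends empty strings, so the output would be "> " followed by spaces.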

谁对谁错谁最难过 2024-07-14 03:07:09

"'" is not illegal in local parts. From RFC2821, section 4.1.2:

Local-part = Dot-string / Quoted-string

Dot-string = Atom *("." Atom)

Atom = 1*atext

RFC 2821 further references RFC 2822 for non-locally-defined elements, so:

atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"

In other words, "'" is a perfectly legal unquoted character to have in an email local part. Now, it may not be legal at your site, but that's not what you said.

Sorry for not staying directly on topic, but I wanted to correct your assertion.
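As a rough illustration of the grammar quoted above, the following sketch tests a local part against the Dot-string form only (it does not handle Quoted-string). The function name check_local_part and the sample addresses are made up, and the character class is my own transcription of the atext list, so treat it as an assumption rather than a reference implementation.

```shell
#!/bin/sh
# Prints "valid" if the argument looks like a Dot-string (Atom *("." Atom)),
# "invalid" otherwise. LC_ALL=C keeps the A-Z, a-z and 0-9 ranges ASCII-only.
check_local_part() {
    printf '%s\n' "$1" | LC_ALL=C awk '
        BEGIN {
            # One atext character; "-" is placed last so it is taken literally.
            atext = "[A-Za-z0-9!#$%&'\''*+/=?^_`{|}~-]"
            re = "^" atext "+([.]" atext "+)*$"
        }
        { print (($0 ~ re) ? "valid" : "invalid") }'
}

check_local_part "julie.o'example"   # prints valid
check_local_part "a..b"              # prints invalid (empty atom between dots)
check_local_part "a b"               # prints invalid (space is not atext)
```

The apostrophe passes because it is listed in atext, which is the point being made above.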

流年已逝 2024-07-14 03:07:09

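Off task, and simpler still: Python.

import fileinput

for line in fileinput.input():
    if "o'" in line:  # same pattern the awk script matches
        fields = line.split()  # split on runs of whitespace, like awk
        # awk's $20..$33 are indices 19..32 in a 0-based list
        print("> ", " ".join(fields[19:33]))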