Linux: count the number of mails a domain sends per day
I have some logs from a Linux SMTP server that uses the Postfix agent. I want to process the logs to find out how many mails a certain domain sends per date, without writing a script.
For example, my mail.log file has these contents:
Jan 1 14:05:31 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 78B06EC0073)
Jan 1 15:05:00 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 874BE4587C4)
Jan 1 15:05:00 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example2.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 98C484E1571)
Jan 2 10:08:15 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 4456D154E12)
Jan 2 15:07:00 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example2.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 4F54515C154)
Jan 2 14:59:11 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example2.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9856C984E16)
Feb 1 13:14:35 mail postfix/smtp[31349]: E6EC84105D: to=<[email protected]>, relay=http://mail.example.org[127.0.0.1]:25, delay=1.7, delays=0.22/0.05/0.36/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as EC1874415E8)
The output I want is:
- First the domain/address the mail is sent from
- The number of mails that specific domain sends per date (e.g. Jan 1: 2 mails sent)
So here the output should look something like this:
http://mail.example.org[127.0.0.1]:25
Jan 1 2
Jan 2 1
Feb 1 1
http://mail.example2.org[127.0.0.1]:25
Jan 1 1
Jan 2 2
For now I know of 2 commands that can do these operations separately, but I really have no idea how to combine them:
1. Count how many mails a certain domain sends in total:
[user@linux ~]$ grep -h "status=sent" mail.log | cut -d' ' -f9 | awk '{c[$0] += 1} END {for(i in c){printf "%6s %4d\n", i, c[i]}}' | sort -M
relay=http://mail.example2.org[127.0.0.1]:25, 3
relay=http://mail.example.org[127.0.0.1]:25, 4
2. Count how many mails are sent per day:
[user@linux ~]$ grep -h "status=sent" mail.log | cut -c-6 | awk '{c[$0] += 1} END {for(i in c){printf "%6s %4d\n", i, c[i]}}' | sort -k2
Feb 1 1
Jan 1 3
Jan 2 3
Does anyone know a good command that can help me with this specific operation? Any help would be appreciated, thank you!
Comments (3)
With your shown samples, please try the following `awk` code. It was written and tested in GNU `awk` and should work with any version.
Explanation: In the main program of `awk`, first globally substitute a leading `relay=` and a trailing `,` with NULL in the 7th field. Then create an array named `arr1` indexed by `$1 OFS $2 OFS $8` and increment its count by 1 each time the same index occurs, doing this for every line of Input_file. Then, in the `END` block of the `awk` code, traverse all elements of `arr1` and split each index `i` into `arr2`. Next, create a new array `arr3` indexed by the 3rd element of `arr2`, which is the http value in Input_file, and assign it the value `arr2[1] OFS arr2[2] OFS arr2[4] OFS arr1[i]`. Once `arr3` has been built over all iterations, traverse its items with a `for` loop and print each index, followed by `ORS` (a newline), followed by the value of `arr3`, which produces the required output.
Assumptions:
- only lines that contain `relay=` are of interest
- `relay=` may not always show up in the same space-delimited field
- dates appear in `mail.log` in calendar order (the order in which they are read)
Adding a couple of lines that do not include `relay=` to the sample input, one idea is to use GNU `awk` (for arrays of arrays).
NOTES:
- `for (addr in counts)` is not guaranteed to process array entries in any specific order
- `dates[++dtorder]=date` is used to keep track of the order in which dates are processed; this is then used in the `END{...}` processing to ensure we output dates in the same order; this assumes dates show up in `mail.log` in calendar order, which in turn eliminates the need to figure out how to sort `Jan`, `Feb`, `Mar`, etc. in calendar order
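
The notes above can be illustrated with a portable variant. This is a sketch under stated assumptions, not the answerer's GNU `awk` code: it replaces arrays of arrays with a two-part key `counts[addr, date]`, which POSIX `awk` supports, and it also records the order in which relays are first seen so that the output is deterministic. The function name `relay_counts_by_date` and the arrays `seen`, `addrs` and `order` are hypothetical; `counts`, `dates` and `dtorder` follow the names mentioned in the notes.

```shell
# Hypothetical helper: per-relay, per-date counts in first-seen order.
# Usage: relay_counts_by_date mail.log
relay_counts_by_date() {
  awk '
    # Only lines that were sent and contain relay=...; match() finds the
    # relay wherever it sits on the line.
    /status=sent/ && match($0, /relay=[^, ]+/) {
      addr = substr($0, RSTART + 6, RLENGTH - 6)   # text after "relay="
      date = $1 " " $2                             # e.g. "Jan 1"
      if (!(date in seen))  { seen[date] = 1;  dates[++dtorder] = date }
      if (!(addr in addrs)) { addrs[addr] = 1; order[++n] = addr }
      counts[addr, date]++
    }
    END {
      for (a = 1; a <= n; a++) {
        print order[a]
        # Replay dates in the order they were read (assumed calendar order).
        for (d = 1; d <= dtorder; d++)
          if ((order[a], dates[d]) in counts)
            print dates[d], counts[order[a], dates[d]]
      }
    }
  ' "$1"
}
```

Run against the sample `mail.log` from the question, this should list each relay followed by its `Jan 1`, `Jan 2` and `Feb 1` lines in log order, without the blank line between relays.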
Not the exact output you want but quite simple (tested with GNU and BSD `awk`, `sort` and `uniq`).
The `awk` field separator is set by the `-F'=|,?[[:space:]]+'` option to either an `=` sign, or an optional comma followed by at least one space (or tab, form feed...). According to this, the fields you are interested in are number 10 (origin), 1 (month) and 2 (day). `sort | uniq -c` sorts and prints the result, one line per unique input, preceded by the count.
But the sorting of the months is alphabetical. If you want the output to be sorted first by origin and then by increasing date, we can add `sort` options: `-k2,2M` sorts the second key as month names in calendar order, instead of alphabetically.
Finally, if you want the exact output you show, we can add a last `awk` script for the final formatting. Each time the origin changes (`$2!=p`), this last `awk` script stores the new origin in variable `p` for later comparisons, prints a newline (except for the first line, thus the `(NR!=1) ? "\n" p : p`), and prints the new origin. For each line it also prints the month (`$3`), the day (`$4`) and the count (`$1`).
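
The pipeline described above might be assembled as follows. This is a sketch built from the fragments quoted in the explanation, so several details are assumptions: the function name `per_relay_report` is hypothetical, the `-k1,1` and `-k3,3n` sort keys and the `LC_ALL=C` setting are additions to the quoted `-k2,2M`, and field 10 only holds the origin when the recipient address contains no spaces or `=` signs.

```shell
# Hypothetical helper: origin header, then "month day count" lines.
# Usage: per_relay_report mail.log
per_relay_report() {
  # 1. Split on "=" or on optional-comma-plus-whitespace, then print
  #    origin, month, day (assumed to be fields 10, 1 and 2).
  awk -F'=|,?[[:space:]]+' '/status=sent/ {print $10, $1, $2}' "$1" |
    # 2. Sort by origin, then month name in calendar order, then day number.
    LC_ALL=C sort -k1,1 -k2,2M -k3,3n |
    # 3. Collapse identical lines, prefixing each with its count.
    uniq -c |
    # 4. Print the origin once per group, with a blank line between groups.
    awk '$2 != p {p = $2; print ((NR != 1) ? "\n" p : p)}
         {print $3, $4, $1}'
}
```

The `-M` option is available in GNU and BSD `sort` but is not part of POSIX, so the month ordering depends on the `sort` implementation in use.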