Grep and extract specific data from multiple log files
I've got multiple log files in a directory and I'm trying to extract just the timestamp and a section of the log line, i.e. the value of the fulltext query param. Each query param in a request is separated by an ampersand (&) as shown below.
Input
30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&
31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&
Intended Output
30/Mar/2022:00:27:36 -> 798
31/Mar/2022:00:27:36 -> Dyson+V7
I've got this command to recursively search over all the files in the directory.
grep -rn "/libs/granite/omnisearch" ~/Downloads/ReqLogs/ > output.txt
This prints the entire log line starting with the directory name, like so
/Users/****/Downloads/ReqLogs/logfile1_2022-03-31.log:6020:31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%4
Please enlighten me: how do I manipulate this to achieve the intended output?
Comments (1)
grep can return the whole line or the string which matched. For extracting a different piece of data from the matching lines, turn to sed or Awk.
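As a rough sketch of the two approaches described in this answer: the exact commands are not preserved in the text, so the Awk program and the sed pattern below are assumptions reconstructed from the description (the \%...% address, the s/ .../\1/p substitution, and the -n flag). The temporary directory and the sample file are hypothetical stand-ins for ~/Downloads/ReqLogs/, with one non-matching line added for the demo.

```shell
# Hypothetical sample data standing in for ~/Downloads/ReqLogs/ (assumption for this demo).
logdir=$(mktemp -d)
cat > "$logdir/logfile1.log" <<'EOF'
30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&
31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&
31/Mar/2022:00:27:40 +0000 [59823] -> GET /libs/other/page?fulltext=ignored&
EOF

# Awk: on lines that contain the request path, trim a copy of the line
# down to the fulltext value and print it after the first field (the timestamp).
awk_out=$(awk 'index($0, "/libs/granite/omnisearch") && /fulltext=/ {
    s = $0
    sub(/.*fulltext=/, "", s)   # drop everything up to and including fulltext=
    sub(/&.*/, "", s)           # drop everything from the next ampersand onwards
    print $1 " -> " s
}' "$logdir"/*)

# sed: \%...% selects lines containing the path; the substitution replaces
# everything from the first space onwards with " -> " plus the captured value.
sed_out=$(sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/ -> \1/p' "$logdir"/*)

printf '%s\n' "$awk_out"
printf '%s\n' "$sed_out"
```

Against the real logs, replace "$logdir"/* with ~/Downloads/ReqLogs/* and redirect the output to output.txt as in the question.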
The sed version is more succinct, but also somewhat more oblique. The address \%...% uses the alternate delimiter % so that we can use literal slashes in our search expression. The s/ .../\1/p command then says to replace everything on the matching lines from the first space onwards, capturing anything between fulltext= and &, and replace it with the captured substring, then print the resulting line. The -n flag turns off the default printing action, so that we only print the lines where the search expression matched.

The wildcard ~/Downloads/ReqLogs/* matches all files in that directory; if you really need to traverse subdirectories too, perhaps add find to the mix, running the sed command, or similarly the Awk command, after -exec. The placeholder {} tells find where to add the name of the found file(s), and + says to put as many as possible in one go, rather than running a separate -exec for each found file. (If you want that, use \; instead of +.)
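A minimal sketch of the find variant, under the same caveat: the original command is not preserved here, so the sed pattern is an assumption pieced together from the description above. The nested temporary directory is a hypothetical stand-in for a log tree with subdirectories, and the sort is only there because find does not guarantee an order.

```shell
# Hypothetical nested log tree (assumption for this demo).
logroot=$(mktemp -d)
mkdir -p "$logroot/2022-03"
printf '%s\n' '30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&' > "$logroot/top.log"
printf '%s\n' '31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&' > "$logroot/2022-03/nested.log"

# {} marks where find puts the file names; + passes as many files as possible
# to a single sed invocation instead of starting one sed per file.
find_out=$(find "$logroot" -type f -exec sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/ -> \1/p' {} + | sort)

printf '%s\n' "$find_out"
```

Swapping + for \; would run sed once per file, which gives the same lines here but starts many more processes on a large tree.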