How do I parse log data in Kibana from data that has already been indexed?
So I work for a startup, and we're using Aptible to host our Elastic Stack. I'm not sure if that affects anything, but I figured I'd mention it. I'm the only person here with any working knowledge of the Elastic Stack, and it's not much. So bear with me, please.
Basically, the log data from a Postgres DB and a React app gets sent to the Elastic Stack. When I go to search through it in Kibana, I get results that look like this:
{
"_index": "logstash-2022.03.23",
"_type": "_doc",
"_id": "yboRtX8B3AbBxU682eag",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2022-03-23T04:38:04.706Z",
"source": "app",
"layer": "app",
"container":xxxxxxxxx
"app": "our-app",
"host": xxxxxxx,
"app_id": xxxxxxx,
"log": "172.17.0.1 - - [22/Mar/2022:23:37:56 -0500] \"GET /healthcheck HTTP/1.0\" 400 26 \"-\" \"Aptible Health Check\"\n",
"service": "xxxxxxx",
"type": "json",
"@version": "3",
"file": "/tmp/dockerlogs/f76cd328d5710e817702c5b7c15d37797828a1308f8b0e17d039a86813237f73/f76cd328d5710e817702c5b7c15d37797828a1308f8b0e17d039a86813237f73-json.log",
"offset": 46118576,
"stream": "stdout",
"time": "2022-03-23T04:37:56.192544365Z"
},
"fields": {
"@timestamp": [
"2022-03-23T04:38:04.706Z"
],
"time": [
"2022-03-23T04:37:56.192Z"
]
},
"sort": [
1648010284706
]
}
This isn't super helpful, since the data I actually find useful is all in the "log" field.
Here is that "log" value from above: "172.17.0.1 - - [22/Mar/2022:23:37:56 -0500] \"GET /healthcheck HTTP/1.0\" 400 26 \"-\" \"Aptible Health Check\"\n"
I'd like to parse that, or better yet, have Kibana (or whatever component is responsible) parse it automatically.
I hate to admit I've spent DAYS on this problem, and I just can't figure out how to do it, either automatically when the data first comes in, or afterward on the already-indexed documents; that is, take the "log" string out of this JSON and parse it so it ends up as structured fields of its own.
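To show what I'm after, here is roughly what I'd want the parsed result to look like. This is just my own sketch of the goal, not real output; the field names are borrowed from the stock COMBINEDAPACHELOG grok pattern, and I gather newer Logstash versions running in ECS mode would emit different names:

{
  "clientip": "172.17.0.1",
  "ident": "-",
  "auth": "-",
  "timestamp": "22/Mar/2022:23:37:56 -0500",
  "verb": "GET",
  "request": "/healthcheck",
  "httpversion": "1.0",
  "response": "400",
  "bytes": "26",
  "referrer": "-",
  "agent": "Aptible Health Check"
}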
So any advice on how to parse this would be great. For reference, below are the kinds of configs I've been poking at, in case they help clarify what I mean.
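For the ingest-time route, I believe something like this Logstash filter is the usual approach (an untested sketch on my part, assuming the field really is called "log" and that the stock COMBINEDAPACHELOG pattern matches the line):

filter {
  # Parse the Apache-style access line in the "log" field into separate fields
  grok {
    match => { "log" => "%{COMBINEDAPACHELOG}" }
  }
  # Optionally overwrite @timestamp with the time from the log line itself
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

For the data that's already indexed, my understanding from the docs is that an Elasticsearch ingest pipeline with a grok processor could do the same parsing (again, just a sketch; "parse-app-log" is a name I made up):

PUT _ingest/pipeline/parse-app-log
{
  "description": "Parse the Apache-style access line in the log field",
  "processors": [
    {
      "grok": {
        "field": "log",
        "patterns": ["%{COMBINEDAPACHELOG}"]
      }
    }
  ]
}

and then, if I'm reading the docs right, something like POST logstash-2022.03.23/_update_by_query?pipeline=parse-app-log should run it over the existing documents. But I haven't been able to make any of this actually work, so corrections are welcome.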