Elasticsearch DSL query to match log messages by starting and ending text

Posted 2025-01-26 07:22:14 · 504 words · 5 views · 0 comments


The log messages look like this:

The application node ABC is down
The application node BCD is down
The application node XXX  is down

I wrote the following query, but it does not work:

"query": {
   "must": {
          "match": {
               "log_message": {
                    "query": "The application node /[A-Z]*/ is down"
               }
          }
   }
   "filter":{
           "term": {
                "application": "XYZ"
            }
    }
}

How can I write a DSL query that matches these messages and also filters on the application name?


笙痞 2025-02-02 07:22:14


First, you have not shared the index mapping, so I am assuming log_message is defined as a text field without any analyzer configured. In that case Elasticsearch applies the default standard analyzer to the log_message field.

Your regex pattern /[A-Z]*/ will not work, because the standard analyzer converts all tokens to lowercase at index time (see the standard analyzer documentation). Use a lowercase pattern such as /[a-z]*/ instead.
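To make the lowercasing issue concrete, here is a rough Python sketch. It is only an approximation under stated assumptions: the helper simple_standard_analyze is hypothetical and mimics just two behaviours of the standard analyzer (splitting into word tokens and lowercasing), and Python's re.fullmatch stands in for the regex engine Elasticsearch uses, which matches a pattern against a whole indexed term.

```python
import re

def simple_standard_analyze(text):
    """Very rough stand-in for the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def terms_matching(pattern, tokens):
    """Return the tokens that the pattern matches in full (whole-term matching)."""
    return [tok for tok in tokens if re.fullmatch(pattern, tok)]

tokens = simple_standard_analyze("The application node ABC is down")
print(tokens)                             # indexed terms are all lowercase
print(terms_matching(r"[A-Z]*", tokens))  # uppercase-only pattern
print(terms_matching(r"[a-z]*", tokens))  # lowercase pattern
```

In this sketch the uppercase pattern matches none of the lowercased terms, while the lowercase pattern matches terms such as abc.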

Second, the match query does not support regex patterns. You can use the Elasticsearch query_string query instead, as shown below:

{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "log_message",
            "query": "The application node /[a-z]*/ is down",
            "default_operator": "AND"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "application": "XYZ"
          }
        }
      ]
    }
  }
}

Regex queries can hurt search performance, so use them with caution.
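If the query is sent from a script rather than typed into a console, the same request body can be assembled programmatically. The following is a minimal sketch, not a definitive implementation: the function name build_log_query and its parameters are hypothetical, and actually sending the body to a cluster is left out.

```python
import json

def build_log_query(message_query, application):
    """Build a bool query: query_string on log_message plus a term filter on application."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "default_field": "log_message",
                            "query": message_query,
                            "default_operator": "AND",
                        }
                    }
                ],
                # Filter clauses restrict the results without affecting the score.
                "filter": [{"term": {"application": application}}],
            }
        }
    }

body = build_log_query("The application node /[a-z]*/ is down", "XYZ")
print(json.dumps(body, indent=2))
```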

Best Solution:

If your use case is to query by node name and application name together with the node status (such as running or down), you can extract this information from the message field with a grok pattern in an ingest pipeline, store each piece as a separate field, and query those fields.

Below is a sample grok pattern for your log message (you can modify it to fit your other log patterns):

The application node %{WORD:node_name} is %{WORD:node_status}

The grok pattern above produces the following result:

{
  "node_name": "ABC",
  "node_status": "down"
}
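To try the extraction idea outside Elasticsearch, here is a rough Python equivalent of the grok pattern. It is a sketch under assumptions: %{WORD} is approximated by \w+, the names LOG_PATTERN and parse_log_message are hypothetical, and \s+ is used between words (instead of a single literal space) so that extra spaces are tolerated.

```python
import re

# Rough equivalent of the grok pattern; %{WORD} is approximated by \w+.
# Assumption: \s+ replaces the literal single spaces to tolerate extra whitespace.
LOG_PATTERN = re.compile(
    r"The application node\s+(?P<node_name>\w+)\s+is\s+(?P<node_status>\w+)"
)

def parse_log_message(message):
    """Return the extracted fields as a dict, or None if the message does not match."""
    match = LOG_PATTERN.search(message)
    return match.groupdict() if match else None

print(parse_log_message("The application node ABC is down"))
print(parse_log_message("The application node XXX  is down"))  # two spaces before "is"
```

One of the sample messages in the question has two spaces before "is". A pattern that uses literal single spaces would not match that line, so the grok pattern itself may also need to be made tolerant of extra whitespace.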

Sample Ingest Pipeline:

PUT _ingest/pipeline/my-pipeline
{
  "processors": [
    {
      "grok": {
        "field": "log_message",
        "patterns": [
          "The application node %{WORD:node_name} is %{WORD:node_status}"
        ]
      }
    }
  ]
}

You can apply the pipeline while indexing a document, like this:

POST index_name/_doc?pipeline=my-pipeline
{
  "log_message":"The application node XXX is down"
}

Output Document:

"hits" : [
      {
        "_index" : "index_name",
        "_type" : "_doc",
        "_id" : "bZuMkoABMUDAwut6pbnf",
        "_score" : 1.0,
        "_source" : {
          "node_status" : "down",
          "node_name" : "XXX",
          "log_message" : "The application node XXX is down"
        }
      }
    ]

You can use the query below to get data for a specific node that is down:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "node_name": "XXX"
          }
        },
        {
          "match": {
            "node_status": "down"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "application": "XYZ"
          }
        }
      ]
    }
  }
}
