How to integrate Argo CD with DataDog to query deployed resource status for automatic blue/green (B/G) promotion?

Posted on 2025-01-10 17:22:33

I'm trying to integrate Argo with DataDog so that a metric query decides whether a blue/green (B/G) deployment is promoted automatically.
The problem is that Argo fails to evaluate the DataDog query passed in through the AnalysisTemplate.

Kubernetes version: v1.20 (EKS), Argo CD version: v2.2.2, Argo Rollouts version: v1.1.1

The AnalysisTemplate I'm using:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: gateway-uat-pat
spec:
  args:
  - name: service-name
  metrics:
  - name: gateway-uat-pat
    interval: 5m
    successCondition: default(result, 0) <= 10
    failureLimit: 3
    provider:
      datadog:
        interval: 5m
        query: |
          sum:trace.http.request.errors{service:{{args.service-name}}}

The Secret object I'm creating:

apiVersion: v1
kind: Secret
metadata:
  name: datadog
type: Opaque
stringData:
  address: https://api.datadoghq.com
  api-key: '***'
  app-key: '***' 

Both the AnalysisTemplate and the Secret are created outside of Argo. I then tried deploying the application with Argo Rollouts, with the following strategy in my Rollout spec:

  strategy:
    blueGreen:
      activeService:  gateway
      previewService:  gateway-preview
      postPromotionAnalysis:
        templates:
        - templateName: gateway-uat-pat
        args:
        - name: service-name
          value: gateway-qa

The error I keep getting:

InvalidSpec: The Rollout "gateway-rollouts" is invalid: spec.strategy.blueGreen.postPromotionAnalysis.templates: Invalid value: "gateway-uat-pat": AnalysisTemplate gateway-uat-pat has metric gateway-uat-pat which runs indefinitely. Invalid value for count:

I dug into the Argo CD Analysis docs but couldn't find anything on how to evaluate DataDog queries successfully with Argo. Have I misconfigured the args in the AnalysisTemplate, or is something else wrong? Thanks.

Comments (1)

陌伤浅笑 2025-01-17 17:22:33

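I found the solution, @naveen. The `count` field should be added to the metric in the AnalysisTemplate. Without it, a metric that sets `interval` keeps being measured indefinitely, which is what the "runs indefinitely" validation error is reporting, so the analysis never completes.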

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: loq-error-rate
spec:
  metrics:
  - name: error-rate
    interval: 30s
    count: 2
    successCondition: result < 1
    failureLimit: 3
    provider:
      datadog:
        interval: 5m
        query: |
          sum:system.cpu.user
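
Applied to the template from the question, the same fix might look like the sketch below. The value `count: 5` is an assumption chosen only for illustration. It bounds the analysis to five measurements taken 5 minutes apart, and it is larger than `failureLimit: 3` so that the failure limit can actually be exceeded. Every other field is copied unchanged from the question.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: gateway-uat-pat
spec:
  args:
  - name: service-name
  metrics:
  - name: gateway-uat-pat
    interval: 5m
    # Assumed value for illustration: stop after 5 measurements
    # so the post-promotion analysis can finish.
    count: 5
    successCondition: default(result, 0) <= 10
    failureLimit: 3
    provider:
      datadog:
        interval: 5m
        query: |
          sum:trace.http.request.errors{service:{{args.service-name}}}
```

The Rollout's `postPromotionAnalysis` section should not need to change, since it refers to the template by name and passes `service-name` as before.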
