How do I get the output of a custom container and pass it to the next pipeline step in Vertex AI/Kubeflow Pipelines?

Posted 2025-02-08 16:22:37

I am having difficulty understanding how to pass a result from a container as an output artifact. I understand that we need to write the output to a file, but I need an example of how to do it.

https://www.kubeflow.org/docs/components/pipelines/sdk-v2/component-development/

This is the last part of the Python container program, where I save the URL of the model file on GCS to output.txt.

with open('./output.txt', 'w') as f:
    logging.info(f"Model path url is in {'./output.txt'}")
    f.write(model_path)

This is the component .yaml file:

name: Dummy Model Training
description: Train a dummy model and save to GCS
inputs:
  - name: input_url
    description: 'Input csv url.'
    type: String
  - name: gcs_url
    description: 'GCS bucket url.'
    type: String
outputs:
  - name: gcs_model_path
    description: 'Trained model path.'
    type: String
implementation:
    container:
        image: ${CONTAINER_REGISTRY}
        command: [
          python, ./app/trainer.py,
          --input_url, {inputValue: input_url},
          --gcs_url, {inputValue: gcs_url},
        ]

Comments (1)

空心空情空意 2025-02-15 16:22:37

First of all, your dummy component is missing a reference to the output. You need to use {outputPath: <output_name>} or {outputUri: <output_name>} to pass it into the container, so that your container code can write data to this system-generated path or URI ("gs://..."). Your component YAML could be fixed like this:

name: Dummy Model Training
description: Train a dummy model and save to GCS
inputs:
  - name: input_url
    description: 'Input csv url.'
    type: String
  - name: gcs_url
    description: 'GCS bucket url.'
    type: String
outputs:
  - name: gcs_model_path
    description: 'Trained model path.'
    type: String
implementation:
    container:
        image: ${CONTAINER_REGISTRY}
        command: [
          python, ./app/trainer.py,
          --input_url, {inputValue: input_url},
          --gcs_url, {inputValue: gcs_url},
          --output_model_path, {outputPath: gcs_model_path}
        ]

Then your code should write to this passed-in path instead of './output.txt'.
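As a rough illustration of that change, the end of trainer.py might look something like the sketch below. It assumes the script parses its flags with argparse, and the way model_path is built here is only a placeholder for whatever the real training code produces; the helper names write_output, parse_args and main are made up for this example.

```python
import argparse
import logging
from pathlib import Path


def write_output(output_path: str, value: str) -> None:
    """Write an output value to the path handed in by the pipeline system."""
    path = Path(output_path)
    # The parent directory of the system-generated path may not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(value)
    logging.info("Model path url written to %s", path)


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_url", required=True)
    parser.add_argument("--gcs_url", required=True)
    # Receives whatever the system substitutes for {outputPath: gcs_model_path}.
    parser.add_argument("--output_model_path", required=True)
    return parser.parse_args(argv)


def main(argv=None) -> None:
    args = parse_args(argv)
    # Placeholder (assumption): the real trainer would set model_path
    # to the GCS location where it actually saved the model.
    model_path = f"{args.gcs_url}/model"
    write_output(args.output_model_path, model_path)
```

Calling main() with no arguments at the bottom of the script makes it read the flags from the container command line. Creating the parent directory first is a defensive choice, since the generated path may point into a directory that does not exist inside the container.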

As for how to consume the output in a downstream component, here's a simple yet runnable example, which you can try out on Vertex Pipelines:
https://github.com/kubeflow/pipelines/blob/bf2389a66c164457b0e10a820ba484992fd7dd1a/sdk/python/test_data/pipelines/two_step_pipeline.py
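For the container side of a downstream step, a minimal sketch could look like the following. It assumes a hypothetical downstream component that declares a String input named model_path and passes it on its command line as --model_path, {inputValue: model_path}, and that the pipeline definition wires the training task's gcs_model_path output to that input. None of these names come from the linked example; they are only for illustration.

```python
import argparse
import logging


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Hypothetical flag: receives the upstream output's content as a plain string.
    parser.add_argument("--model_path", required=True)
    return parser.parse_args(argv)


def main(argv=None) -> str:
    args = parse_args(argv)
    # Strip stray whitespace in case the upstream step wrote a trailing newline.
    model_path = args.model_path.strip()
    logging.info("Received model path: %s", model_path)
    return model_path
```

With inputValue, the downstream container receives the content of the upstream output rather than a file path, so no file reading is needed in this step.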
