How is Flyte tailored to "Data and Machine Learning"?

Posted on 2025-02-08 06:46:18 | Word count: 546 | Views: 1 | Comments: 0

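Flyte is described as the workflow automation platform for complex, mission-critical data and machine learning processes at scale.

I went through quite a lot of documentation, and I fail to see why it is specifically for "Data and Machine Learning". It seems to me that it is a workflow manager on top of a container orchestrator (here Kubernetes), where "workflow manager" means that I can define a directed acyclic graph (DAG), the DAG nodes are then deployed as containers, and the DAG is run.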
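To make the model described above concrete, here is a minimal sketch using only the Python standard library. It is not Flyte code: the node names and the `run_node` helper are hypothetical, and `run_node` merely stands in for "deploy this node as a container and wait for it".

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a node, each value is the set of
# nodes it depends on.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "report": {"clean"},
    "publish": {"train", "report"},
}


def run_node(name: str) -> str:
    # Stand-in for "deploy this node as a container and wait for it to finish".
    return f"ran {name}"


def run_dag(graph: dict[str, set[str]]) -> list[str]:
    """Run every node after all of its dependencies; return the order used."""
    order = []
    for node in TopologicalSorter(graph).static_order():
        run_node(node)
        order.append(node)
    return order


execution_order = run_dag(dag)
```

A real orchestrator would run independent nodes (here `train` and `report`) in parallel; the sketch runs them one after another for simplicity.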

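Of course this is useful and important for "Data and Machine Learning", but I might as well use it for any other microservice DAG. Apart from features/details, how does this compare with https://spark.apache.org ?

From a software architecture standpoint, what should I keep in mind?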


一个人的夜不怕黑 2025-02-15 06:46:18

That is a great question. You are right about one thing: at its core, Flyte is a serverless workflow orchestrator (serverless because it brings up the infrastructure needed to run the code). And yes, it can be used in many other situations, although it may not be the best tool for some of them, such as microservice orchestration.

But what really makes it good for ML and data orchestration is a combination of:

Features (listed below)
Integrations (listed below)
The community of people using it
The roadmap

Features

Long-running tasks: Flyte is designed for extremely long-running tasks that can run for days or weeks. Even if the control plane goes down, you will not lose the work, and you can keep deploying without impacting existing work.
Versioning: allows multiple users to work independently on the same workflow, using different libraries, models, inputs, etc.
Memoization: take a pipeline with 10 steps. You can memoize the first 9 steps, and if the 10th fails, or you modify the 10th, it will reuse the results from the previous 9. This leads to drastically faster iteration.
Strong typing and ML-specific type support: Flyte understands dataframes and can translate them between spark.dataFrame -> pandas.DataFrame -> Modin -> Polars, etc., without the user having to think about how to do it efficiently. It also supports things like tensors (correctly serialized), NumPy arrays, etc. Models can also be saved and retrieved from past executions, so Flyte is in fact the model truth store.
Native support for intra-task checkpointing: this can help in recovering model training after node failures, and even across executions. New support is being added for checkpointing callbacks.
Flyte Decks: a way to visualize metrics such as ROC curves, or to automatically visualize the distribution of the data input to a task.
Extensible programming interface that can orchestrate distributed jobs or run locally, e.g. Spark, MPI, SageMaker.
Reference tasks for library isolation.
Scheduler independent of user code.
Understanding of resources such as GPUs: tasks are automatically scheduled on GPUs and/or spot machines, with smart handling of spot machines. The first n-1 retries run on spot, and the last one is automatically moved to an on-demand machine for better guarantees.
Map tasks and dynamic tasks: map tasks map over a list (e.g. a list of regions); dynamic tasks create new static graphs dynamically, based on inputs.
Multiple launch plans: e.g. schedule 2 runs of a workflow with slightly different hyperparameters, model values, etc.
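The memoization and versioning items above can be illustrated with a small standard-library sketch. This is not Flyte's API or its actual caching implementation; `run_cached`, `cache_key`, and the in-memory `_cache` are hypothetical names. The assumption is that a result is reused when the task name, task version, and inputs all match.

```python
import hashlib
import json
from typing import Any, Callable

# In-memory stand-in for a persistent result store (an assumption of this sketch).
_cache: dict[str, Any] = {}
calls: list[str] = []  # records which steps actually executed


def cache_key(name: str, version: str, inputs: dict) -> str:
    """Derive a key from task name, task version, and inputs."""
    payload = json.dumps(
        {"task": name, "version": version, "inputs": inputs}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def run_cached(name: str, version: str, fn: Callable[..., Any], **inputs: Any) -> Any:
    """Execute fn only if no result is stored for this (name, version, inputs)."""
    key = cache_key(name, version, inputs)
    if key not in _cache:
        calls.append(name)
        _cache[key] = fn(**inputs)
    return _cache[key]


def run_pipeline(last_step_version: str) -> int:
    value = 0
    # Steps 1..9 are unchanged between runs, so they share version "v1".
    for i in range(1, 10):
        value = run_cached(f"step{i}", "v1", lambda x: x + 1, x=value)
    # Step 10 is the one being iterated on.
    return run_cached("step10", last_step_version, lambda x: x * 2, x=value)


first = run_pipeline("v1")    # executes all 10 steps
executed_first = len(calls)
second = run_pipeline("v2")   # steps 1..9 come from the cache; only step10 re-runs
```

Bumping the version of the 10th step invalidates only that step's cache entry, which is why the second run executes a single step.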

For Admins

For really long-running tasks, admins can redeploy the management layer without killing the tasks.
Support for spot/ARM/GPU machines (with different versions, etc.)
Quotas and throttles per project/domain
Upgrade infrastructure without upgrading user libraries
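The spot-machine handling mentioned under Features (the first n-1 attempts on spot, the last one on an on-demand machine) can be sketched as a simple retry policy. This is only an illustration of the idea using the standard library, not how Flyte implements it; `Preempted`, `machine_for_attempt`, `run_with_retries`, and `flaky_task` are all hypothetical names.

```python
from typing import Callable


class Preempted(Exception):
    """Raised by the stand-in task when a spot machine is reclaimed."""


def machine_for_attempt(attempt: int, max_attempts: int) -> str:
    """Use spot capacity for every attempt except the last one."""
    return "on-demand" if attempt == max_attempts else "spot"


def run_with_retries(
    task: Callable[[str], str], max_attempts: int
) -> tuple[str, list[str]]:
    """Retry a task after preemption; return its result and the machines tried."""
    tried: list[str] = []
    for attempt in range(1, max_attempts + 1):
        machine = machine_for_attempt(attempt, max_attempts)
        tried.append(machine)
        try:
            return task(machine), tried
        except Preempted:
            continue
    raise RuntimeError("all attempts were preempted")


def flaky_task(machine: str) -> str:
    # Hypothetical task: always preempted on spot, succeeds on on-demand.
    if machine == "spot":
        raise Preempted()
    return f"finished on {machine}"


result, machines = run_with_retries(flaky_task, max_attempts=3)
```

The design trade-off is cost versus certainty: cheap spot capacity is tried first, and the final attempt pays for on-demand capacity so the task is not preempted again.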

Integrations

Native pandas DataFrame support
Spark
MPI jobs (gang scheduled)
Pandera / Great Expectations for data quality
SageMaker
Easy deployment of models for serving
Polars / Modin / Spark DataFrame
Tensors / checkpointing, etc.
And many others on the roadmap

Community

Focused on ML specific features

Roadmap

CD4ML, with human-in-the-loop and external-signal-based workflows. This will allow users to automate the deployment of models, perform human-in-the-loop labeling, etc.
Support for Ray/Spark/Dask cluster reuse across tasks
Integration with whylogs and other tools for monitoring
Integration with MLflow, etc.
More native Flyte Decks renderers

Hopefully this answers your questions. Please also join the Slack community and help spread this information, and feel free to ask more questions.
