Snowpark with Python - Clarification

Posted on 2025-02-10 09:55:43


I have a few questions regarding Snowpark with Python.

  1. Why do we need Snowpark when we already have the Snowflake Python connector (freely available), which can be used to connect Python/Jupyter to a Snowflake DW?

  2. If we use Snowpark and connect from a local Jupyter notebook to run an ML model, does it use our local machine's computing power or Snowflake's computing power? If it uses our local machine's computing power, how can we use Snowflake's computing power to run the ML model?


Comments (5)

静赏你的温柔 2025-02-17 09:55:43

  1. Snowpark with Python allows you to treat a Snowflake table like a Spark DataFrame. This means you can run PySpark-style code against Snowflake tables without the need to pull the data out of Snowflake, and the compute is Snowflake compute, not your local machine, which is fully elastic.
  2. As long as you are executing Spark DataFrame logic in Python, the compute will be on the Snowflake side. If you pull that data back to your machine to execute other logic (pandas, for example), then Snowpark will pull the data back to your local machine and the compute will happen there as normal.

I recommend starting here to learn more:

https://docs.snowflake.com/en/developer-guide/snowpark/index.html

夜空下最亮的亮点 2025-02-17 09:55:43


A couple of things to keep in mind: we are talking about multiple things here, and some clarification could help.

Snowpark is a library that you install through pip/conda, and it is a DataFrame library, meaning you can define a DataFrame object that points to data in Snowflake (there are also ways to get data into Snowflake using it). It does not pull the data back to the client unless you explicitly tell it to, and all computation is done on the Snowflake side.

When you do operations on a Snowpark DataFrame, you are writing Python code that generates SQL, which is executed in Snowflake using the same mechanism as if you had written your own SQL. The execution of the generated SQL is triggered by action methods such as .show(), .collect(), .save_as_table() and so on.
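The lazy, SQL-generating pattern described above can be sketched with a toy class. This is not the real snowflake.snowpark API; the class and method names below are invented purely to illustrate how transformations accumulate into SQL that only runs when an action method is called:

```python
# Toy illustration of the lazy DataFrame pattern described above.
# NOT the real snowflake.snowpark API; names are invented for illustration.

class ToyLazyFrame:
    """Builds up a SQL query; nothing executes until an action method."""

    def __init__(self, table):
        self.table = table
        self.filters = []
        self.columns = ["*"]

    def filter(self, condition):
        # Transformation: recorded, not executed.
        new = ToyLazyFrame(self.table)
        new.filters = self.filters + [condition]
        new.columns = list(self.columns)
        return new

    def select(self, *cols):
        new = ToyLazyFrame(self.table)
        new.filters = list(self.filters)
        new.columns = list(cols)
        return new

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql

    def collect(self):
        # Action: in real Snowpark, this is the point where the generated
        # SQL is sent to Snowflake and executed on its warehouses.
        print(f"Executing on server: {self.to_sql()}")


df = ToyLazyFrame("ORDERS").filter("AMOUNT > 100").select("ID", "AMOUNT")
df.collect()  # only now would any work happen on the server side
```

In real Snowpark the same shape applies: chained DataFrame calls build a query plan, and an action such as .collect() ships the generated SQL to the warehouse.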

More information here

As part of Snowflake's Python support there are also Python UDFs and Python stored procedures. You do not need Snowpark to create or use those, since you can do that with SQL using CREATE FUNCTION / CREATE PROCEDURE, but you can use Snowpark as well.

With Python UDFs and Python stored procedures you can bring Python code into Snowflake that will be executed on Snowflake compute; it will not be translated into SQL but will run in Python sandboxes on the compute nodes.

To use Python stored procedures or Python UDFs you do not have to do anything special; they are there like any other built-in feature of Snowflake.

More information about Python UDFs and information about Python Stored Procedures.

The Snowflake Python Connector allows you to write SQL that is executed on Snowflake; the result is pulled back to the client to be used there, using the client's memory, etc. If you want your manipulation to be executed in Snowflake, you need to write SQL for it.
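The connector workflow described above follows the standard Python DB-API shape (connect, cursor, execute, fetch). The sketch below uses sqlite3 purely as a local stand-in so it runs anywhere; with snowflake.connector you would call snowflake.connector.connect(...) with account credentials instead, but the pull-results-to-client pattern is the same:

```python
# DB-API sketch of the connector pattern: SQL runs on the server,
# result rows are pulled back into client memory. sqlite3 is a local
# stand-in here; snowflake.connector exposes the same DB-API-style calls.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# SQL executes on the database side (here: the in-memory database) ...
cur.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 50.0), (2, 150.0), (3, 300.0)])
cur.execute("SELECT id, amount FROM orders WHERE amount > 100")

# ... but the result rows land in client memory here; any further
# processing (pandas, ML, etc.) uses local compute and memory.
rows = cur.fetchall()
print(rows)
conn.close()
```

This is why heavy transformations should be pushed into the SQL itself when using the connector: everything after the fetch is limited by the client machine.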

不一样的天空 2025-02-17 09:55:43


Using the existing Snowflake Python Connector you bring the Snowflake data to the system that is executing the Python program, limiting you to the compute and memory of that system. With Snowpark for Python, you are bringing your Python code to Snowflake to leverage the compute and memory of the cloud platform.

九公里浅绿 2025-02-17 09:55:43


Snowpark Python provides the following benefits, which are not available with the Snowflake Python connector:

  1. Users can bring their custom Python client code into Snowflake in the form of a UDF (user-defined function) and use those functions on DataFrames. This allows data engineers, data scientists and data developers to code in a familiar way with their language of choice, and to execute pipelines, ML workflows and data applications faster and more securely, in a single platform.

  2. Users can build and work with queries using the familiar syntax of the DataFrame API (a DataFrame style of programming).

  3. Users can work with all the popular Anaconda libraries; these libraries are pre-installed, giving access to hundreds of curated, open-source Python packages.

  4. Snowpark operations are executed lazily on the server, which reduces the amount of data transferred between the client and the Snowflake database.

For more details, please refer to the documentation

顾冷 2025-02-17 09:55:43


I think that understanding Snowpark is complex. I think @Mats' answer is really good. I created a blog post that I think provides some high-level guidance: https://www.mobilize.net/blog/lost-in-the-snowpark
