Snowflake Snowpark Python - Clarification
I have a few questions regarding Snowpark with Python.
Why do we need Snowpark when we already have the Snowflake Python Connector (which is free) that we can use to connect from a Python Jupyter notebook to the Snowflake DW?
If we use Snowpark and connect from a local Jupyter notebook to run an ML model, does it use our local machine's computing power or Snowflake's computing power? If it is our local machine's computing power, how can we use Snowflake's computing power to run the ML model?
I recommend starting here to learn more:
https://docs.snowflake.com/en/developer-guide/snowpark/index.html
A couple of things to keep in mind: we are talking about multiple things here, so some clarification could help.
Snowpark is a library that you install through pip/conda, and it is a DataFrame library, meaning you can define a DataFrame object that points to data in Snowflake (there are also ways to get data into Snowflake using it). It does not pull the data back to the client unless you explicitly tell it to, and all computation is done on the Snowflake side.
When you do operations on a Snowpark DataFrame you are using Python code that generates SQL, which is executed in Snowflake using the same mechanism as if you wrote your own SQL. The execution of the generated SQL is triggered by action methods such as .show(), .collect(), .save_as_table() and so on.
More information here
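To make this concrete, here is a minimal sketch, assuming placeholder connection parameters and a hypothetical ORDERS table (neither comes from the answer), of how a Snowpark DataFrame only runs SQL in Snowflake once an action method is called:

    # Minimal Snowpark sketch: credentials and the ORDERS table are placeholders.
    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import col

    connection_parameters = {
        "account": "<account_identifier>",
        "user": "<user>",
        "password": "<password>",
        "warehouse": "<warehouse>",
        "database": "<database>",
        "schema": "<schema>",
    }
    session = Session.builder.configs(connection_parameters).create()

    # Nothing is executed yet: this only defines a DataFrame pointing at a table.
    orders = session.table("ORDERS")
    big_orders = orders.filter(col("AMOUNT") > 1000).select("ORDER_ID", "AMOUNT")

    # Action methods generate the SQL and run it inside Snowflake.
    big_orders.show()            # prints a small sample on the client
    rows = big_orders.collect()  # pulls the result rows back to the client
    big_orders.write.save_as_table("BIG_ORDERS", mode="overwrite")  # result stays in Snowflake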
As part of the Snowflake Python support there are also Python UDFs and
Python Stored Procedures. You do not need Snowpark to create or use those, since you can do that with SQL using CREATE FUNCTION/CREATE PROCEDURE, but you can use Snowpark as well.
With Python UDFs and Python Stored Procedures you bring Python code into Snowflake that is executed on Snowflake compute; it is not translated into SQL but runs in Python sandboxes on the compute nodes.
To use Python Stored Procedures or Python UDFs you do not have to set anything up; they are available like any other built-in feature of Snowflake.
More information about Python UDFs and information about Python Stored Procedures.
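As an illustration, here is a hedged sketch of registering a Python UDF through Snowpark; the add_one function, the types and the names are made up, and it assumes the session object from the sketch above. The same UDF could also be created in SQL with CREATE FUNCTION ... LANGUAGE PYTHON.

    # Hypothetical UDF registered via Snowpark; reuses the `session` created earlier.
    from snowflake.snowpark.functions import col
    from snowflake.snowpark.types import IntegerType

    def add_one(x: int) -> int:
        return x + 1

    add_one_udf = session.udf.register(
        func=add_one,
        name="add_one",
        return_type=IntegerType(),
        input_types=[IntegerType()],
        replace=True,
    )

    # The UDF runs in Snowflake's Python sandbox, e.g. inside a DataFrame query:
    session.table("ORDERS").select(add_one_udf(col("ORDER_ID"))).show()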
The Snowflake Python Connector allows you to write SQL that is executed on Snowflake, and the result is pulled back to the client to be used there, using the client's memory and so on. If you want your manipulation to be executed in Snowflake, you need to write SQL for it.
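For contrast, a minimal sketch with the Snowflake Python Connector (credentials and the query are placeholders): the SQL itself runs in Snowflake, but the fetched rows land in the client's memory, and anything you do with them afterwards uses the local machine.

    # Connector sketch: placeholder credentials and query.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="<account_identifier>",
        user="<user>",
        password="<password>",
        warehouse="<warehouse>",
        database="<database>",
        schema="<schema>",
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT ORDER_ID, AMOUNT FROM ORDERS WHERE AMOUNT > 1000")
        rows = cur.fetchall()  # the full result set now lives in client memory
        # Any further processing of `rows` uses the local machine's CPU and RAM.
    finally:
        conn.close()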
Using the existing Snowflake Python Connector you bring the Snowflake data to the system that is executing the Python program, limiting you to the compute and memory of that system. With Snowpark for Python, you are bringing your Python code to Snowflake to leverage the compute and memory of the cloud platform.
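As one way to picture "bringing the code to the data", here is a hedged sketch of wrapping hypothetical training code in a Snowpark stored procedure so it runs on Snowflake compute instead of the local Jupyter machine; the TRAINING_DATA table, the package list and the names are assumptions, and it reuses the session from the earlier sketch.

    # Hypothetical stored procedure: the body executes on a Snowflake warehouse.
    from snowflake.snowpark import Session
    from snowflake.snowpark.types import StringType

    def train_model(session: Session) -> str:
        # Runs inside Snowflake's Python sandbox, not on your laptop.
        pdf = session.table("TRAINING_DATA").to_pandas()
        # ... fit e.g. a scikit-learn model on pdf here ...
        return f"trained on {len(pdf)} rows"

    session.sproc.register(
        func=train_model,
        name="train_model",
        return_type=StringType(),
        packages=["snowflake-snowpark-python", "pandas", "scikit-learn"],
        replace=True,
    )

    print(session.call("train_model"))  # executes the training inside Snowflake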
Snowpark Python provides the following benefits, which are not available with the Snowflake Python Connector:
It allows data engineers, data scientists and data developers to code in a familiar way with their language of choice, and to execute pipelines, ML workflows and data apps faster and more securely, in a single platform.
Users can build and work with queries using the familiar syntax of the DataFrame API (a DataFrame style of programming).
Users can use the popular Anaconda libraries; these libraries are pre-installed. Users have access to hundreds of curated, open-source Python packages from Anaconda.
Snowpark operations are executed lazily on the server, which reduces the amount of data transferred between your client and the Snowflake database.
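As a small sketch of the lazy-execution point (reusing the hypothetical session and ORDERS table from the sketches above): chained operations only build a query, and nothing is executed or transferred until an action method runs.

    from snowflake.snowpark.functions import col, sum as sum_

    df = (
        session.table("ORDERS")
        .filter(col("AMOUNT") > 1000)
        .group_by("CUSTOMER_ID")
        .agg(sum_("AMOUNT").alias("TOTAL"))
    )
    df.explain()           # shows the query plan; still no data has moved
    totals = df.collect()  # only now does Snowflake execute and return the small result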
For more details, please refer to the documentation.
I think that understanding Snowpark is complex. I think @Mats' answer is really good. I created a blog post that I think provides some high-level guidance: https://www.mobilize.net/blog/lost-in-the-snowpark