创建雪花数据库副本的最快方法

发布于 2025-02-12 08:01:57 字数 407 浏览 0 评论 0原文

在雪花上释放之前,我需要制作数据库的副本以检查我的代码。

当前,我们使用克隆方法来创建数据库副本: https://docs.snowflake.com/en/sql-reference/sql/create-clone.html

但大约需要15分钟。我正在尝试找到更快的方法来创建副本(有或没有复制数据 - 没关系)。

我尝试的是:

  • 整个数据库的get_ddl函数 - 返回对象而不保存创建对象的顺序

,也许有人知道更好的选项?

I need to make a copy of the database to check my code before release in snowflake.

Currently, we use clone method for creating copy of database: https://docs.snowflake.com/en/sql-reference/sql/create-clone.html

But it's take around 15 min. I'm trying to find faster way to create a copy (with or without copying data - it doesn't matter).

What I tried:

  • get_ddl function for whole database - returns objects without saving order of creating creating objects

Maybe someone know better options for it?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

逐鹿 2025-02-19 08:01:57

您可以同时并行克隆几个模式。

编写一个存储过程,该过程将为您将在数据库中克隆的每个架构创建根任务和子任务。然后,您运行root任务。

示例代码。

在开始时,创建数据库的“任务根”:

CREATE OR REPLACE TASK root_clone_task
WAREHOUSE = 'COMPUTE_WH'
AS
CREATE OR REPLACE DATABASE CLONE_MY_DB;

然后,与您具有克隆架构一样多的子任务,或者最好是8个子任务和在它们之间的8个子任务和平衡的克隆(默认情况下,WH支持8个线程):

CREATE OR REPLACE TASK sub_clone_task1
WAREHOUSE = 'COMPUTE_WH'
AFTER root_clone_task
AS
CREATE OR REPLACE SCHEMA CLONE_MY_DB.SCHEMA1 CLONE MY_DB.SCHEMA1;

CREATE OR REPLACE TASK sub_clone_task2
WAREHOUSE = 'COMPUTE_WH'
AFTER root_clone_task
AS
CREATE OR REPLACE SCHEMA CLONE_MY_DB.SCHEMA2 CLONE MY_DB.SCHEMA2;

然后我们需要恢复所有子任务,可以使用Alter Task< name>一次完成一个子任务。简历,或者我们可以使用命令:

select system$task_dependents_enable('MY_DB.PUBLIC.root_clone_task');

您将获得错误消息:

“任务应具有时间表或恢复后。”

如果要避免这种情况,则可以在root任务中定义时间表,但是将启用所有子任务。

最后,只需执行根本任务:

EXECUTE TASK root_clone_task;

当然,您可以通过编写具有动态代码的存储过程来自动化此任务,该过程将生成所有必要的任务。

最后,在我看来,大多数时间都需要移动元数据和有关权限的信息,在许多情况下,我们以后删除这些权限,以使它们完全不同。
因此,我提出了一个新想法:在没有特权的情况下添加一个选项

请访问网站: https://community.snowflake.com/snowflake.com/s/ideas
并搜索这个想法:

添加一个没有特权的选项到克隆选项
并为此投票,我希望它能大大加快克隆过程。

You can clone several schemas in parallel at the same time.

Write a stored procedure that will create a root TASK and sub tasks for each schema you will clone in the database. Then you run root TASK.

Sample code.

At the beginning, the 'TASK root' which creates the database:

CREATE OR REPLACE TASK root_clone_task
WAREHOUSE = 'COMPUTE_WH'
AS
CREATE OR REPLACE DATABASE CLONE_MY_DB;

Then as many sub-tasks as you have schema to clone, or preferably 8 sub-tasks and balanced cloning between them (by default WH supports 8 threads in parallel):

CREATE OR REPLACE TASK sub_clone_task1
WAREHOUSE = 'COMPUTE_WH'
AFTER root_clone_task
AS
CREATE OR REPLACE SCHEMA CLONE_MY_DB.SCHEMA1 CLONE MY_DB.SCHEMA1;

CREATE OR REPLACE TASK sub_clone_task2
WAREHOUSE = 'COMPUTE_WH'
AFTER root_clone_task
AS
CREATE OR REPLACE SCHEMA CLONE_MY_DB.SCHEMA2 CLONE MY_DB.SCHEMA2;

Then we need to resume all the sub-tasks, this can be done one at a time using ALTER TASK <name> RESUME or we can use the command:

select system$task_dependents_enable('MY_DB.PUBLIC.root_clone_task');

You will get the error message:

"Task should have a SCHEDULE or AFTER to be resumed."

If you want to avoid this you can define SCHEDULE in root TASK, but all child tasks will be enabled.

Finally, just execute the root TASK:

EXECUTE TASK root_clone_task;

Of course, you can automate this by writing a stored procedure with dynamic code that will generate all the necessary tasks.

At the end, it seems to me that it takes most of the time to move metadata and information about permissions, and in many cases we remove these permissions later to give them a completely different one.
Therefore, I proposed a new idea: Add an option WITHOUT PRIVILEGES to the CLONE option.

Please go to the website: https://community.snowflake.com/s/ideas
and search for the idea:

Add an option WITHOUT PRIVILEGES to the CLONE option
and vote for it, I hope it will significantly speed up the cloning process.

七月上 2025-02-19 08:01:57

也许尝试使用类似选项创建对象?不幸的是,它在数据库上不起作用,但在图架和表上都可以。这是在没有数据的情况下创建表

副本

CREATE TABLE myTable2 LIKE myTable1

的 在返回没有行的选择中,

CREATE TABLE myTable2 AS SELECT * FROM myTable1 WHERE 1=0

不确定其中的任何一个是否会更快

Perhaps try creating objects using the LIKE option? It doesn't work on databases unfortunately but does on schemas and tables etc. This creates a copy of the table with no data

So get the list of schemas and tables of your db and loop through them

CREATE TABLE myTable2 LIKE myTable1

Another option would be to use the CREATE command with a SELECT that returns no rows

CREATE TABLE myTable2 AS SELECT * FROM myTable1 WHERE 1=0

Not sure if any of them will be faster though

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文