如何将<文件名>.csv.gz从雪花阶段加载到雪花表中?

发布于 2025-01-10 14:37:14 字数 700 浏览 0 评论 0原文

我已成功将 1000 个文件加载到 Snowflake stage=MT_STAGE 中。 每个文件都有完全相同的架构。 每个文件都有完全相同的命名约定(文件名).csv.gz 每个文件大约 50 兆(+/- 几兆)。 每个文件都有 115k-120k 条记录。 每个文件有 184 列。 我创建了一个雪花表=MT_TABLE。 我在尝试执行“COPY INTO”将文件从阶段移动到单个表中时不断出现错误。 我已经尝试了无数种命令的变体,包括 &没有不同的选择。我花了三天时间阅读文档并尝试观看视频。我失败了。有人可以帮忙吗?

copy into MT_TABLE from @MT_STAGE;

执行复制并处理 0 个文件

copy into MT_TABLE from @MT_STAGE (type=csv field_delimiter=”,” skip_header=1);

语法错误:意外的“(”。(第 1 行)

copy into MT_TABLE from @MT_STAGE type=csv field_delimiter=”,” skip_header=1;

语法错误:意外的“”,'。 (第 1 行)

I have successfully loaded 1000 files into a Snowflake stage=MT_STAGE.
Every file has exact same schema.
Every file has exact same naming convention (filename).csv.gz
Every file is about 50 megs (+/- a couple megs).
Every file has between 115k-120k records.
Every file has 184 columns.
I have created a Snowflake table=MT_TABLE.
I keep on getting errors trying to do a "COPY INTO" to move files from stage into a single table.
I've tried countless variations of the command, with & without different options. I've spent 3 days reading documentation and trying to watch videos. I have failed. Can anyone help?

copy into MT_TABLE from @MT_STAGE;

Copy executed with 0 files processed

copy into MT_TABLE from @MT_STAGE (type=csv field_delimiter=”,” skip_header=1);

Syntax error: unexpected '('. (line 1)

copy into MT_TABLE from @MT_STAGE type=csv field_delimiter=”,” skip_header=1;

Syntax error: unexpected '”,'. (line 1)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

罪#恶を代价 2025-01-17 14:37:14

因此,根据迈克的声明,如果您的数据中有逗号

col_acol_bcol c
无逗号一,逗号二,, 逗号,
col_a, col_b, col_b
no comma, one, comma, two,, commas

如何才能知道哪个是了解

col_acol_bcol c
无逗号一,逗号二 中内容的正确方法,, 逗号
无逗号,一,逗号二,,逗号
无逗号一,逗号,二,逗号
无逗号,一,逗号,二,逗号
无逗号一,逗号,二,逗号
没有逗号,一,逗号,二,逗号

这是正确的行。

因此,您可以将字段分隔符从 , 更改为管道 | 或者引用数据

no comma| one, comma| two,, commas

双引号

"no comma","one, comma"," two,, commas"

单引号

'no comma','one, comma',' two,, commas'

很酷的事情是,如果您更改列分隔符,则它必须不改变位于数据中或必须引用数据。

如果您更改为引用,则它必须不在归档中,否则必须进行转义。

或者您可以编码为某种安全数据类型,例如 base64,它会占用更多空间,但现在它是运输安全的:

bm8gY29tbWE,IG9uZSwgY29tbWE,IHR3bywsIGNvbW1hcw

So as per Mike's statement if there are comma's in your data

col_acol_bcol c
no commaone, commatwo,, commas
col_a, col_b, col_b
no comma, one, comma, two,, commas

how can anything tell which is the correct way to know what is in what

col_acol_bcol c
no commaone, commatwo,, commas
no comma, one, commatwo,, commas
no commaone, comma, two, commas
no comma, one, comma, two, commas
no commaone, comma, two,commas
no comma, one, comma, two,commas

which is the correct line.

So you ether change the field delimeter from , to pipe | or you quote the data

no comma| one, comma| two,, commas

double quotes

"no comma","one, comma"," two,, commas"

single quotes

'no comma','one, comma',' two,, commas'

The cool thing is, if you change your column delimiter it has to not be in the in the data OR the data has to be quoted.

And if you change to quoting it has to not be in the filed OR it has to be escaped.

OR you can encode as some safe data type like base64 and it takes more space, but now it's transportation transport safe:

bm8gY29tbWE,IG9uZSwgY29tbWE,IHR3bywsIGNvbW1hcw
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文