Teradata 中组内的数据样本

发布于 01-07 21:58 字数 379 浏览 1 评论 0原文

假设我有一张 week_nbr 和 cust_id 表。

假设我想要每周抽取 500 名客户的样本。

愚蠢的方法是每周都这样做:

select cust_id
from week_cust
sample randomized allocation 500
where week_nbr=1

聪明的方法是什么?即,有没有办法使以下概念发挥作用?

select week_nbr
      ,random sample of 500 cust_id in this week
from week_cust

这是在 Teradata 12 上的。

最好,提前致谢。

Suppose I have a table of week_nbr and cust_id.

Suppose I want a sample of 500 customers from each week.

The dumb way is to do this for each week:

select cust_id
from week_cust
sample randomized allocation 500
where week_nbr=1

What's the smart way? I.e., is there a way to make the following concept work?

select week_nbr
      ,random sample of 500 cust_id in this week
from week_cust

This is on Teradata 12.

Best, and thanks in advance.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

无边思念无边月2025-01-14 21:58:05

在我看来,使用 sample 子句似乎不可能做到这一点。不过,您应该能够通过类似的方法获得相同的结果:

select week_nbr, cust_id
from week_cust
qualify row_number() over (partition by week_nbr order by random(1, 10000)) <= 500

It doesn't look to me as though it's possible to do this using the sample clause. You should be able to achieve the same result with something like this though:

select week_nbr, cust_id
from week_cust
qualify row_number() over (partition by week_nbr order by random(1, 10000)) <= 500
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文