如何为关联表生成随机数据?
我正在使用三张表的电子购物商店工作:products
,atributes
和product_atributes
。我已经填充了表products
和atributes
:
insert into atribute(name)
select random_atribute()
from generate_series(1,10000) as seq(i);
insert into product(price)
select floor(1 + random() * 100)::int
from generate_series(1,20000) as seq(i);
我想生成随机数据 table product_atributes
:是1个产品的1个属性,但也只有1个产品的所有属性。结果应该看起来像:
product_atributes
id product_id atribute_id
1 5 12
2 5 76
3 5 10
4 5 7
5 1 45
6 1 109
...
Table product_atributes
是这样创建的:
create table product_atributes(
id serial primary key,
atribute_id integer references atribute(id),
product_id integer references product(id)
);
我尝试了STH。这样,但它行不通:
with data as (
select s.i,
random_atribute_id() as atribute_id,
s.id as product_id
from (generate_series(1, 1000) as seq(i)
cross join lateral (select seq.i, * from product order by random() limit 1) as s)
)
insert into product_atributes(atribute_id, product_id)
select atribute_id, product_id from data;
我该怎么做?
I am working on an e-shop with three tables: products
, atributes
and product_atributes
. I have already filled tables products
and atributes
:
insert into atribute(name)
select random_atribute()
from generate_series(1,10000) as seq(i);
insert into product(price)
select floor(1 + random() * 100)::int
from generate_series(1,20000) as seq(i);
I want to generate random data for the table product_atributes
: There can be 1 attribute for 1 product but also all attributes for just 1 product. The result should look like:
product_atributes
id product_id atribute_id
1 5 12
2 5 76
3 5 10
4 5 7
5 1 45
6 1 109
...
Table product_atributes
is created like this:
create table product_atributes(
id serial primary key,
atribute_id integer references atribute(id),
product_id integer references product(id)
);
I tried sth. like this, but it's not working:
with data as (
select s.i,
random_atribute_id() as atribute_id,
s.id as product_id
from (generate_series(1, 1000) as seq(i)
cross join lateral (select seq.i, * from product order by random() limit 1) as s)
)
insert into product_atributes(atribute_id, product_id)
select atribute_id, product_id from data;
How can I do this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
简单地
插入所有可能的组合,均匀的机会为1%:
更柔和&更快的速度
可提供对最小和最大属性数量的更多控制。
对于 对于您的特定设置使用 gap-less 属性ID(从1到10000):
这会生成0至10个完全随机的关联。
2 + trunc(Random() * 6):: int as Ct
将生成3至8个关联。 (请注意0基于0的generate_series()
!)在子查询中生成属性(
ct
)的数量,以避免对tandom(Random)进行单一评估()导致所有产品的数字相同。
我们可以相当多地优化无差距ID的性能。而不是为给定的ID范围生成随机数,而不是在所有产品中浏览所有10000个属性。 更快。
将
不同的
扔掉以消除(不太可能)重复。我们的10000中的10个几乎没有关系,但我们不能允许重复。 (因此,在极少数情况下,可以减少关联的可能性。)我们甚至可以处理一些差距。选择随机行有很多微妙之处。 参阅:
请
您还可以使用数据改装CTE 。 1:n关系的简单示例:
您的情况相同的原则,JUT两个ctes ...
相关:
“属性”是这样的。
Simple
To insert every possible combination with an even chance of 1 percent:
More sophisitcated & faster
This offers more control over minimum and maximum number of attributes.
And it's a lot faster for your particular setup with gap-less attribute IDs (from 1 to 10000):
This generates 0 to 10 completely random associations per product.
2 + trunc(random() * 6)::int AS ct
would generate 3 to 8 associations. (Note the 0-basedgenerate_series()
!)Generate the number of attributes (
ct
) in a subquery to avoid single evaluation ofrandom()
resulting in the same number for all products.We can optimize performance for gap-less IDs quite a bit. Instead of walking through all 10000 attribute for every product, just generate random numbers for the given range of IDs. Much faster.
Throw in
DISTINCT
to eliminate (unlikely) duplicates. Hardly matters for 10 our of 10000, but we can't allow duplicates. (So it's possible to get fewer associations in rare cases.)We could even work with a few gaps. There are quite a few subtleties to selecting random rows. See:
Asides
You could also populate all three tables with random data in a single query using data-modifying CTEs. Simple example for a 1:n relationship:
Same principle for your case, jut two CTEs ...
Related:
"Attributes" are spelled like this.