Postgres 数组的唯一约束

发布于 2024-12-20 12:17:05 字数 525 浏览 5 评论 0原文

如何对数组中所有值的唯一性创建约束,例如:

CREATE TABLE mytable
(
    interface integer[2],
    CONSTRAINT link_check UNIQUE (sort(interface))
)

我需要的排序函数

create or replace function sort(anyarray)
returns anyarray as $$
select array(select $1[i] from generate_series(array_lower($1,1),
array_upper($1,1)) g(i) order by 1)
$$ language sql strict immutable; 

将值 {10, 22} 和 {22, 10} 视为相同,并在唯一约束下进行检查

How to create a constraint on the uniqueness of all the values ​​in the array like:

CREATE TABLE mytable
(
    interface integer[2],
    CONSTRAINT link_check UNIQUE (sort(interface))
)

my sort function

create or replace function sort(anyarray)
returns anyarray as $
select array(select $1[i] from generate_series(array_lower($1,1),
array_upper($1,1)) g(i) order by 1)
$ language sql strict immutable; 

I need that would be the value {10, 22} and {22, 10} considered the same and check under the UNIQUE CONSTRAINT

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

抽个烟儿 2024-12-27 12:17:05

我认为您不能使用带有唯一约束<的函数/a> 但你可以使用唯一索引 。因此,给定一个类似这样的排序函数:

create function sort_array(anyarray) returns anyarray as $
    select array_agg(distinct n order by n) from unnest($1) as t(n);
$ language sql immutable;

那么你可以这样做:

create table mytable (
    interface integer[2] 
);
create unique index mytable_uniq on mytable (sort_array(interface));

然后会发生以下情况:

=> insert into mytable (interface) values (array[11,23]);
INSERT 0 1
=> insert into mytable (interface) values (array[11,23]);
ERROR:  duplicate key value violates unique constraint "mytable_uniq"
DETAIL:  Key (sort_array(interface))=({11,23}) already exists.
=> insert into mytable (interface) values (array[23,11]);
ERROR:  duplicate key value violates unique constraint "mytable_uniq"
DETAIL:  Key (sort_array(interface))=({11,23}) already exists.
=> insert into mytable (interface) values (array[42,11]);
INSERT 0 1

I don't think you can use a function with a unique constraint but you can with a unique index. So given a sorting function something like this:

create function sort_array(anyarray) returns anyarray as $
    select array_agg(distinct n order by n) from unnest($1) as t(n);
$ language sql immutable;

Then you could do this:

create table mytable (
    interface integer[2] 
);
create unique index mytable_uniq on mytable (sort_array(interface));

Then the following happens:

=> insert into mytable (interface) values (array[11,23]);
INSERT 0 1
=> insert into mytable (interface) values (array[11,23]);
ERROR:  duplicate key value violates unique constraint "mytable_uniq"
DETAIL:  Key (sort_array(interface))=({11,23}) already exists.
=> insert into mytable (interface) values (array[23,11]);
ERROR:  duplicate key value violates unique constraint "mytable_uniq"
DETAIL:  Key (sort_array(interface))=({11,23}) already exists.
=> insert into mytable (interface) values (array[42,11]);
INSERT 0 1
梦中楼上月下 2024-12-27 12:17:05

mu 已经演示了 表达式索引 可以解决您的问题。

我的注意力被使用的功能吸引了。对于两个整数的数组来说,这两种方法似乎都太过分了。这可能是真实情况的简化?不管怎样,我很感兴趣,并用几个变体进行了测试。

测试设置

包含 10000 个随机整数对的临时表:

CREATE TEMP TABLE arr (i int[]);

INSERT INTO arr 
SELECT ARRAY[(random() * 1000)::int, (random() * 1000)::int]
FROM   generate_series(1,10000);

用简短的注释来解释每个候选者的测试:

  1. mu 的查询
CREATE OR REPLACE FUNCTION sort_array1(integer[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
    SELECT array_agg(n) FROM (SELECT n FROM unnest($1) AS t(n) ORDER BY n) AS a;
$func$;
  1. 与聚合内的 ORDER BY 相同(第 9.0 页+)
CREATE OR REPLACE FUNCTION sort_array2(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT array_agg(n ORDER BY n) FROM unnest($1) AS t(n);
$func$;
  1. uralbash 的查询
CREATE OR REPLACE FUNCTION sort_array3(anyarray)
  RETURNS anyarray
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(array_lower($1,1), array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. 将参数更改为 int[]
CREATE OR REPLACE FUNCTION sort_array4(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(array_lower($1,1), array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. 简化 array_lower() - it始终为 1
CREATE OR REPLACE FUNCTION sort_array5(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(1, array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. 进一步简化为具有 2 个元素的情况
CREATE OR REPLACE FUNCTION sort_array6(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT i
    FROM  (VALUES ($1[1]),($1[2])) g(i)
    ORDER  BY 1)
$func$;
  1. 我的简单查询
CREATE OR REPLACE FUNCTION sort_array7(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;
  1. 没有 STRICT 修饰符 (!)
CREATE OR REPLACE FUNCTION sort_array8(int[])
  RETURNS int[]
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;

结果

我执行了大约 20 次,并从 EXPLAIN ANALYZE 中获得了最佳结果。

SELECT sort_array1(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array2(i) FROM arr  -- Total runtime: 175 ms

SELECT sort_array3(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array4(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array5(i) FROM arr  -- Total runtime: 177 ms
SELECT sort_array6(i) FROM arr  -- Total runtime: 144 ms

SELECT sort_array7(i) FROM arr  -- Total runtime: 103 ms
SELECT sort_array8(i) FROM arr  -- Total runtime:  43 ms (!!!)

这些是 Debian Squeeze 上 v9.0.5 服务器的结果。 v.8.4 上的结果类似。

我还测试了 PL/pgSQL 变体,它们比预期慢一些:对于一个微小的操作来说开销太大,没有查询计划可以缓存。

简单函数(nr. 7)比其他函数快得多。这是预料之中的,其他变体的开销对于小型数组来说太大了。

忽略 STRICT 声明可以使速度加倍以上。我没想到这一点,并发布了这个后续文章来揭示原因:

mu already demonstrated how an index on an expression can solve your problem.

My attention was caught by the used functions. Both seem like overkill for arrays of two integers. This may be a simplification of the real situation? Either way, I was intrigued and ran a test with a couple of variants.

Test setup

Temporary table with 10000 random pairs of integer:

CREATE TEMP TABLE arr (i int[]);

INSERT INTO arr 
SELECT ARRAY[(random() * 1000)::int, (random() * 1000)::int]
FROM   generate_series(1,10000);

Test candidates with a short comment to explain each one:

  1. mu's query
CREATE OR REPLACE FUNCTION sort_array1(integer[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
    SELECT array_agg(n) FROM (SELECT n FROM unnest($1) AS t(n) ORDER BY n) AS a;
$func$;
  1. The same with ORDER BY inside aggregate (pg 9.0+)
CREATE OR REPLACE FUNCTION sort_array2(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT array_agg(n ORDER BY n) FROM unnest($1) AS t(n);
$func$;
  1. uralbash's query
CREATE OR REPLACE FUNCTION sort_array3(anyarray)
  RETURNS anyarray
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(array_lower($1,1), array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. Change parameter to int[]
CREATE OR REPLACE FUNCTION sort_array4(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(array_lower($1,1), array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. Simplify array_lower() - it is always 1
CREATE OR REPLACE FUNCTION sort_array5(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT $1[i]
    FROM   generate_series(1, array_upper($1,1)) g(i)
    ORDER  BY 1)
$func$;
  1. Further simplify to case with 2 elements
CREATE OR REPLACE FUNCTION sort_array6(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT ARRAY(
    SELECT i
    FROM  (VALUES ($1[1]),($1[2])) g(i)
    ORDER  BY 1)
$func$;
  1. My simple query
CREATE OR REPLACE FUNCTION sort_array7(int[])
  RETURNS int[]
  LANGUAGE sql STRICT IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;
  1. Without STRICT modifier (!)
CREATE OR REPLACE FUNCTION sort_array8(int[])
  RETURNS int[]
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;

Results

I executed each around 20 times and took the best result from EXPLAIN ANALYZE.

SELECT sort_array1(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array2(i) FROM arr  -- Total runtime: 175 ms

SELECT sort_array3(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array4(i) FROM arr  -- Total runtime: 183 ms
SELECT sort_array5(i) FROM arr  -- Total runtime: 177 ms
SELECT sort_array6(i) FROM arr  -- Total runtime: 144 ms

SELECT sort_array7(i) FROM arr  -- Total runtime: 103 ms
SELECT sort_array8(i) FROM arr  -- Total runtime:  43 ms (!!!)

These are the results from a v9.0.5 server on Debian Squeeze. Similar results on v.8.4.

I also tested PL/pgSQL variants which were a bit slower as expected: too much overhead for a tiny operation, no query plan to cache.

The simple function (nr. 7) is substantially faster than the others. That was expected, the overhead of the other variants is just too much for a tiny array.

Leaving away the STRICT declaration more than doubles the speed. I did not expect that and posted this follow-up uncover why:

无所谓啦 2024-12-27 12:17:05

只需在两个值上创建唯一索引:

create unique index ix on 
  mytable(least(interface[1], interface[2]), greatest(interface[1], interface[2])); 

Just create a unique index on the two values:

create unique index ix on 
  mytable(least(interface[1], interface[2]), greatest(interface[1], interface[2])); 
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文