MySQL 递归查询

发布于 2024-09-18 08:12:32 字数 920 浏览 3 评论 0原文

我的数据库架构如下所示：

Table t1:
    id
    valA
    valB

Table t2:
    id
    valA
    valB

我想要做的是，对于其中一个表中的给定行集，查找两个表中具有相同 valA 或 valB 的行（将 valA 与valA 和 valB 与 valB，而不是 valA 与 valB）。然后，我想查找与上一个查询结果中的行具有相同 valA 或 valB 的行，依此类推。

Example data:

t1 (id, valA, valB):
    1, a, B
    2, b, J
    3, d, E
    4, d, B
    5, c, G
    6, h, J

t2 (id, valA, valB):
    1, b, E
    2, d, H
    3, g, B


Example 1:

Input: Row 1 in t1
Output: 
    t1/4, t2/3
    t1/3, t2/2
    t2/1
    ...


Example 2:

Input: Row 6 in t1
Output:
    t1/2
    t2/1

我想要在结果中找到该行的搜索级别（例如，在示例 1 中：t1/2 和 t2/1 的级别 1， t1/5 为 2 级，...）有限深度的递归可以。随着时间的推移，我可能想在查询中包含更多遵循相同模式的表。如果可以很容易地为此目的扩展查询，那就太好了。

但最重要的是性能。你能告诉我最快的方法来完成这个任务吗？

提前致谢！

原文

My database schema looks like this:

Table t1:
    id
    valA
    valB

Table t2:
    id
    valA
    valB

What I want to do, is, for a given set of rows in one of these tables, find rows in both tables that have the same valA or valB (comparing valA with valA and valB with valB, not valA with valB). Then, I want to look for rows with the same valA or valB as the rows in the result of the previous query, and so on.

Example data:

t1 (id, valA, valB):
    1, a, B
    2, b, J
    3, d, E
    4, d, B
    5, c, G
    6, h, J

t2 (id, valA, valB):
    1, b, E
    2, d, H
    3, g, B


Example 1:

Input: Row 1 in t1
Output: 
    t1/4, t2/3
    t1/3, t2/2
    t2/1
    ...


Example 2:

Input: Row 6 in t1
Output:
    t1/2
    t2/1

I would like to have the level of the search at that the row was found in the result (e.g. in Example 1: Level 1 for t1/2 and t2/1, level 2 for t1/5, ...) A limited depth of recursion is okay. Over time, I maybe want to include more tables following the same schema into the query. It would be nice if it was easy to extend the query for that purpose.

But what matters most, is the performance. Can you tell me the fastest possible way to accomplish this?

Thanks in advance!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

木緿 2024-09-25 08:12:32

尝试一下，虽然它还没有完全测试，但看起来它正在工作：P（http://pastie.org/1140339)

drop table if exists t1;
create table t1
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;

drop table if exists t2;
create table t2
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;

drop view if exists t12;
create view t12 as
select 1 as tid, id, valA, valB from t1
union
select 2 as tid, id, valA, valB from t2;

insert into t1 (valA, valB) values 
('a','B'),
('b','J'),
('d','E'),
('d','B'),
('c','G'),
('h','J');

insert into t2 (valA, valB) values 
('b','E'),
('d','H'),
('g','B');

drop procedure if exists find_children;

delimiter #

create procedure find_children
(
in p_tid tinyint unsigned,
in p_id int unsigned 
)
proc_main:begin

declare done tinyint unsigned default 0;
declare dpth smallint unsigned default 0;


create temporary table children(
 tid tinyint unsigned not null,
 id int unsigned not null,
 valA char(1) not null,
 valB char(1) not null,
 depth smallint unsigned default 0,
 primary key (tid, id, valA, valB)
)engine = memory;

insert into children select p_tid, t.id, t.valA, t.valB, dpth from t12 t where t.tid = p_tid and t.id = p_id; 

create temporary table tmp engine=memory select * from children;

/* http://dec.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */

while done <> 1 do

    if exists(
    select 1 from t12 t 
      inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth) then

        insert ignore into children
        select 
        t.tid, t.id, t.valA, t.valB, dpth+1 
      from t12 t
      inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth;

        set dpth = dpth + 1;            

        truncate table tmp;
        insert into tmp select * from children where depth = dpth;

    else
        set done = 1;
    end if;

end while;

select * from children order by depth;

drop temporary table if exists children;
drop temporary table if exists tmp;

end proc_main #


delimiter ;


call find_children(1,1);

call find_children(1,6);

try this although it's not fully tested but looked like it was working :P (http://pastie.org/1140339)

drop table if exists t1;
create table t1
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;

drop table if exists t2;
create table t2
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;

drop view if exists t12;
create view t12 as
select 1 as tid, id, valA, valB from t1
union
select 2 as tid, id, valA, valB from t2;

insert into t1 (valA, valB) values 
('a','B'),
('b','J'),
('d','E'),
('d','B'),
('c','G'),
('h','J');

insert into t2 (valA, valB) values 
('b','E'),
('d','H'),
('g','B');

drop procedure if exists find_children;

delimiter #

create procedure find_children
(
in p_tid tinyint unsigned,
in p_id int unsigned 
)
proc_main:begin

declare done tinyint unsigned default 0;
declare dpth smallint unsigned default 0;


create temporary table children(
 tid tinyint unsigned not null,
 id int unsigned not null,
 valA char(1) not null,
 valB char(1) not null,
 depth smallint unsigned default 0,
 primary key (tid, id, valA, valB)
)engine = memory;

insert into children select p_tid, t.id, t.valA, t.valB, dpth from t12 t where t.tid = p_tid and t.id = p_id; 

create temporary table tmp engine=memory select * from children;

/* http://dec.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */

while done <> 1 do

    if exists(
    select 1 from t12 t 
      inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth) then

        insert ignore into children
        select 
        t.tid, t.id, t.valA, t.valB, dpth+1 
      from t12 t
      inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth;

        set dpth = dpth + 1;            

        truncate table tmp;
        insert into tmp select * from children where depth = dpth;

    else
        set done = 1;
    end if;

end while;

select * from children order by depth;

drop temporary table if exists children;
drop temporary table if exists tmp;

end proc_main #


delimiter ;


call find_children(1,1);

call find_children(1,6);

回复收藏 0 原文

戏剧牡丹亭 2024-09-25 08:12:32

您可以使用存储过程来完成此操作（请参见清单 7 和 7a）：

http://www .artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html

您只需要找出递归步骤的查询 - 获取已经找到的行并找到更多行。

如果您有一个支持 SQL-99 递归公用表表达式的数据库（例如 PostgreSQL 或 Firebird，提示提示），您可以采用与上面链接中相同的方法，但使用 rCTE 作为框架，因此无需编写一个存储过程。

编辑：我尝试使用 PostgreSQL 8.4 中的 rCTE 来执行此操作，虽然我可以找到这些行，但我找不到一种方法来用找到它们的深度来标记它们。首先，我创建一个视图来统一表：

create view t12 (tbl, id, vala, valb) as (
  (select 't1', id, vala, valb from t1)
  union
  (select 't2', id, vala, valb from t2)
)

然后执行此查询：

with recursive descendants (tbl, id, vala, valb) as (
  (select *
  from t12
  where tbl = 't1' and id = 1) -- the query that identifies the seed rows, here just t1/1
  union
  (select c.*
  from descendants p, t12 c
  where (p.vala = c.vala or p.valb = c.valb)) -- the recursive term
)
select * from descendants;

您会想象捕获深度就像向 rCTE 添加深度列一样简单，在种子查询中设置为零，然后在递归步骤中以某种方式递增。但是，我找不到任何方法来做到这一点，因为您无法在递归步骤中针对 rCTE 编写子查询（因此没有像 select max(深度) + 1 from后代 这样的方法）列列表），并且您不能在列列表中使用聚合函数（因此列列表中没有 max(p.depth) + 1 与 group by c.* 相结合 上选择）。

您还需要向查询添加限制以排除已选择的行；由于联合的独特效果，您不需要在基本版本中执行此操作，但是如果添加计数列，那么一行可以以不同的计数多次包含在结果中，并且您将得到笛卡尔爆炸。但你不能轻易阻止它，因为你不能对 rCTE 进行子查询，这意味着你不能说任何像 并且不存在的东西（select * from后代 d where d.tbl = c.tbl and d.id = c.id）！

我知道所有这些关于递归查询的东西对你来说没有用，但我发现它很有趣，所以请原谅。

You can do it with stored procedures (see listings 7 and 7a):

http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html

You just need to figure out a query for the step of the recursion - taking the already-found rows and finding some more rows.

If you had a database which supported SQL-99 recursive common table expressions (like PostgreSQL or Firebird, hint hint), you could take the same approach as in the above link, but using a rCTE as the framework, so avoiding the need to write a stored procedure.

EDIT: I had a go at doing this with an rCTE in PostgreSQL 8.4, and although i can find the rows, i can't find a way to label them with the depth at which they were found. First, i create a a view to unify the tables:

create view t12 (tbl, id, vala, valb) as (
  (select 't1', id, vala, valb from t1)
  union
  (select 't2', id, vala, valb from t2)
)

Then do this query:

with recursive descendants (tbl, id, vala, valb) as (
  (select *
  from t12
  where tbl = 't1' and id = 1) -- the query that identifies the seed rows, here just t1/1
  union
  (select c.*
  from descendants p, t12 c
  where (p.vala = c.vala or p.valb = c.valb)) -- the recursive term
)
select * from descendants;

You would imagine that capturing depth would be as simple as adding a depth column to the rCTE, set to zero in the seed query, then somehow incremented in the recursive step. However, i couldn't find any way to do that, given that you can't write subqueries against the rCTE in the recursive step (so nothing like select max(depth) + 1 from descendants in the column list), and you can't use an aggregate function in the column list (so no max(p.depth) + 1 in the column list coupled with a group by c.* on the select).

You would also need to add a restriction to the query to exclude already-selected rows; you don't need to do that in the basic version, because of the distincting effect of the union, but if you add a count column, then a row can be included in the results more than once with different counts, and you'll get a Cartesian explosion. But you can't easily prevent it, because you can't have subqueries against the rCTE, which means you can't say anything like and not exists (select * from descendants d where d.tbl = c.tbl and d.id = c.id)!

I know all this stuff about recursive queries is of no use to you, but i find it riveting, so please do excuse me.

回复收藏 0 原文

~没有更多了~