MySQL 递归查询
我的数据库架构如下所示:
Table t1:
id
valA
valB
Table t2:
id
valA
valB
我想要做的是,对于其中一个表中的给定行集,查找两个表中具有相同 valA 或 valB 的行(将 valA 与valA 和 valB 与 valB,而不是 valA 与 valB)。 然后,我想查找与上一个查询结果中的行具有相同 valA 或 valB 的行,依此类推。
Example data:
t1 (id, valA, valB):
1, a, B
2, b, J
3, d, E
4, d, B
5, c, G
6, h, J
t2 (id, valA, valB):
1, b, E
2, d, H
3, g, B
Example 1:
Input: Row 1 in t1
Output:
t1/4, t2/3
t1/3, t2/2
t2/1
...
Example 2:
Input: Row 6 in t1
Output:
t1/2
t2/1
我想要在结果中找到该行的搜索级别(例如,在示例 1 中:t1/2 和 t2/1 的级别 1, t1/5 为 2 级,...)有限深度的递归可以。随着时间的推移,我可能想在查询中包含更多遵循相同模式的表。如果可以很容易地为此目的扩展查询,那就太好了。
但最重要的是性能。你能告诉我最快的方法来完成这个任务吗?
提前致谢!
My database schema looks like this:
Table t1:
id
valA
valB
Table t2:
id
valA
valB
What I want to do, is, for a given set of rows in one of these tables, find rows in both tables that have the same valA or valB (comparing valA with valA and valB with valB, not valA with valB). Then, I want to look for rows with the same valA or valB as the rows in the result of the previous query, and so on.
Example data:
t1 (id, valA, valB):
1, a, B
2, b, J
3, d, E
4, d, B
5, c, G
6, h, J
t2 (id, valA, valB):
1, b, E
2, d, H
3, g, B
Example 1:
Input: Row 1 in t1
Output:
t1/4, t2/3
t1/3, t2/2
t2/1
...
Example 2:
Input: Row 6 in t1
Output:
t1/2
t2/1
I would like to have the level of the search at that the row was found in the result (e.g. in Example 1: Level 1 for t1/2 and t2/1, level 2 for t1/5, ...) A limited depth of recursion is okay. Over time, I maybe want to include more tables following the same schema into the query. It would be nice if it was easy to extend the query for that purpose.
But what matters most, is the performance. Can you tell me the fastest possible way to accomplish this?
Thanks in advance!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
尝试一下,虽然它还没有完全测试,但看起来它正在工作:P(http://pastie.org/1140339)
try this although it's not fully tested but looked like it was working :P (http://pastie.org/1140339)
您可以使用存储过程来完成此操作(请参见清单 7 和 7a):
http://www .artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
您只需要找出递归步骤的查询 - 获取已经找到的行并找到更多行。
如果您有一个支持 SQL-99 递归公用表表达式的数据库(例如 PostgreSQL 或 Firebird,提示提示),您可以采用与上面链接中相同的方法,但使用 rCTE 作为框架,因此无需编写一个存储过程。
编辑:我尝试使用 PostgreSQL 8.4 中的 rCTE 来执行此操作,虽然我可以找到这些行,但我找不到一种方法来用找到它们的深度来标记它们。首先,我创建一个视图来统一表:
然后执行此查询:
您会想象捕获深度就像向 rCTE 添加深度列一样简单,在种子查询中设置为零,然后在递归步骤中以某种方式递增。但是,我找不到任何方法来做到这一点,因为您无法在递归步骤中针对 rCTE 编写子查询(因此没有像
select max(深度) + 1 from后代
这样的方法)列列表),并且您不能在列列表中使用聚合函数(因此列列表中没有max(p.depth) + 1
与group by c.* 相结合
上选择)。您还需要向查询添加限制以排除已选择的行;由于联合的独特效果,您不需要在基本版本中执行此操作,但是如果添加计数列,那么一行可以以不同的计数多次包含在结果中,并且您将得到笛卡尔爆炸。但你不能轻易阻止它,因为你不能对 rCTE 进行子查询,这意味着你不能说任何像
并且不存在的东西(select * from后代 d where d.tbl = c.tbl and d.id = c.id)!
我知道所有这些关于递归查询的东西对你来说没有用,但我发现它很有趣,所以请原谅。
You can do it with stored procedures (see listings 7 and 7a):
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
You just need to figure out a query for the step of the recursion - taking the already-found rows and finding some more rows.
If you had a database which supported SQL-99 recursive common table expressions (like PostgreSQL or Firebird, hint hint), you could take the same approach as in the above link, but using a rCTE as the framework, so avoiding the need to write a stored procedure.
EDIT: I had a go at doing this with an rCTE in PostgreSQL 8.4, and although i can find the rows, i can't find a way to label them with the depth at which they were found. First, i create a a view to unify the tables:
Then do this query:
You would imagine that capturing depth would be as simple as adding a depth column to the rCTE, set to zero in the seed query, then somehow incremented in the recursive step. However, i couldn't find any way to do that, given that you can't write subqueries against the rCTE in the recursive step (so nothing like
select max(depth) + 1 from descendants
in the column list), and you can't use an aggregate function in the column list (so nomax(p.depth) + 1
in the column list coupled with agroup by c.*
on the select).You would also need to add a restriction to the query to exclude already-selected rows; you don't need to do that in the basic version, because of the distincting effect of the union, but if you add a count column, then a row can be included in the results more than once with different counts, and you'll get a Cartesian explosion. But you can't easily prevent it, because you can't have subqueries against the rCTE, which means you can't say anything like
and not exists (select * from descendants d where d.tbl = c.tbl and d.id = c.id)
!I know all this stuff about recursive queries is of no use to you, but i find it riveting, so please do excuse me.