在 SQL 中查找属于超集子集的所有集合
我正在考虑一个应用程序的设计,其中主要功能围绕查找作为给定集合的子集的所有集合的集合的能力。
例如,给定输入集 A={1,2,3...50} 和集合 B={ B1={3,5,9,12}, B2={1,6,100,123,45} 。 .. B500={8,67,450} },返回作为 A 子集的所有 B。
我想这与搜索引擎类似,只是我没有真正拥有集合 A 很小而 B 很大的奢侈;就我而言,B通常小于A。
我发现了一个类似的问题这里,但想知道是否有更有效/标准的东西。
I'm pondering the design of an application where the main feature revolves around the ability to find the set of all sets which are subsets of a given set.
For example, given the input set A={1,2,3...50} and the set of sets B={ B1={3,5,9,12}, B2={1,6,100,123,45} ... B500={8,67,450} }, return all Bs which are a subset of A.
I guess it's similar to a search engine, except that I don't really have the luxury of set A being small and the Bs being large; in my case Bs are usually smaller than A.
I found a similar question here, but was wondering if there was anything more efficient / standard.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
哈珀的回答是正确而优雅的。当然是经验丰富的 SQL 编码人员中的“标准”。当然,要求是数据库必须标准化:父级不重复; Parent::Child 有两个关系; Child 表中有两个唯一索引 (ParentKey, ChildKey) 和 (ChildKey, ParentKey),“否则一切皆废”。不可能获得比这更好的性能(假设服务器的硬件配置正确,等等)。下一步是 6NF,它确实显着提高了性能,但除非迫不得已,否则不需要这样做。如果你的 B 比你的 A 小,那么速度会非常快。
另一种方法是使用子查询。根据您的 Db 供应商,子查询(特别是当您的 B 小于 As 时)可能会更快。例如。 Sybase 处理子查询的能力比 MS 好得多。
Harper's answer is correct and elegant. Certainly the "standard" among experienced SQL coders. The requirement is of course the db must be normalised: Parent is not duplicated; Parent::Child has two relations; there are two unique indices (ParentKey, ChildKey) and (ChildKey, ParentKey) in the Child table, "otherwise all bets are off". It is not possible to get better performance than that (assuming the server is configured properly for the hardware, etc). The next step is 6NF, which does provide a significant increase in performance, but you do not need to go there unless you have to. If your Bs are smaller than your As, it will be very fast.
The alternative is to use subqueries. Depending on your Db vendor, subqueries (particularly if your Bs are smaller than your As) can be faster. Eg. Sybase handles subqueries far better than MS.