配置器中的组合数量

发布于 2024-07-27 22:20:29 字数 1352 浏览 8 评论 0 原文

我被要求编写一个例程来决定产品配置器中可能的组合数量。

配置器非常简单。 尽管它具有比这更多的功能,但它可以建模为多个“单选组”(如 UI 控件),其中必须选择 n 个选项之一。

唯一可以使用的约束是规则,如果选择了一个选项,则不能选择另一个选项。

所以我想做的是在给定一组选项组和约束的情况下计算可以配置的不同产品的数量。

我使用包含-排除原则来解决这个问题。 然而据我所知,任何基于此方法的算法都应该在 O(2^n) 中运行,这是行不通的。 当然,有几种可能的优化可以提供不错的运行时间,但仍然容易构建最坏的情况。

我现在差不多就是这样。 有什么建议么?

更新

我意识到我还没有很好地解释规则如何适用。

有几个带有选项的组。 每组中必须选择一个且仅一个选项。 一组中可以有一个或多个选项。

只有一种类型的约束。 如果某个组中的选项A被选中,则其他组中的选项B就不能被选中。 可以有任意数量的约束,适用于选项组或选项本身的约束/规则的数量没有限制。

例如:

组 1:
x1 x2 x3 x4 x5

第 2 组:
y1 y2 y3

第 3 组:
z1 z2 z3 z4

约束:
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

* 如果在组 1 中选择了选项 x1,则不能选择组 2 中的选项 y2。

使用包含-排除,我将计算组合数量:

Combinations = Cnorules - Cr[1] - Cr [2] - Cr[3] + Cr[1,2] + Cr[1,3] + Cr[2,3] - Cr[1,2,3]

其中

C无规则 = 5 * 3 * 4

C r[a,b,c] = 违反规则 a、b 和 c 的组合数。

不幸的是,该方法需要 2^|规则| 计算。

I have been asked to program a routine to decide the number of possible combinations in a product configurator.

The configurator is really simple. Even though it has more features than this, it can be modeled as several "radio groups" (like the UI control) where one of n options has to be selected.

The only kind of constraints that can be used is rules that says if one option is selected another option can not be selected.

So what I want to do is to calculate the number of different products that can be configured, given a set of option groups and constraints.

I made a naive approach to solve this using the Inclusion-exclusion principle. However as far as I can see, any algorithm based on this method should run in O(2^n) which won't work. There are of course several possible optimizations that should give decent runtimes but there are still easily constructed worst case scenarios.

Thats pretty much where I am right now. Any suggestions?

Update

I realize I haven't explained how the rules applies well enough.

There are several groups with options. One and only one option must be selected in each group. There can be one or more of options in a group.

There is only one type of constraints. If option A in some group is selected, then option B in some other group can't be selected. There can be any number of constraints, there is no limit how many constraints/rules apply to a option group or an option itself.

So an example would be:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints:
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

* If option x1 is selected in group1, then option y2 in group 2 can not be selected.

Using inclusion-exclusion I would calculate the number of combinations as

Combinations = Cno rules - Cr[1] - Cr[2] - Cr[3] + Cr[1,2] + Cr[1,3] + Cr[2,3] - Cr[1,2,3]

Where

Cno rules = 5 * 3 * 4

Cr[a,b,c] = Number of combinations that violates rule a, b and c.

The method unfortunately requires 2^|rules| calculations.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(11

听风念你 2024-08-03 22:20:30

我认为扎克的思考方向是正确的。 查看组合数量的表达式,您会发现二阶项 Cr[i,j] 比 C[k] 项小得多。
想象一个立方体,其中每个轴都是一组选项。 立方体中的每个点代表一个特定的选项组合。 一阶 C[k] 校正排除立方体两个表面之间的一行选项。 仅当两条这样的线在立方体空间中的一个点(选项组合)相交时,才会发生二阶校正 C[i,j]。 因此,无论您拥有多少组,高阶修正总是越来越小。 如果您坚持

组合 = C(无规则) - Cr[1] - Cr[2] - Cr[3],

您最终会得到组合数量的下限。 现在您知道一阶校正的大小,并考虑上面立方体的观察结果,您甚至可以估计二阶校正的数量级。 这将取决于组的数量。 然后,您的算法可以决定是继续执行更高的订单还是停止。

I think Zac is thinking in the right direction. Looking at your expression for the number of combinations, you see that the second order terms Cr[i,j] are much smaller than the C[k] terms.
Imagine a cube where each axis is a group of options. Each point in the cube represents a particular combination of options. A first order C[k] correction excludes a line of options between two surfaces of the cube. A second order correction C[i,j] only happens when two such lines meet at one point (combination of options) in space in the cube. So regardless of the number of groups you have, the higher order corrections are always increasingly smaller. If you stick to

Combinations = C(no rules) - Cr[1] - Cr[2] - Cr[3]

you end up with a lower bound for the number of combinations. Now that you know the size of your first order correction, and thinking about the observation with the cube above, you can even estimate the order of magnitude of the second order correction. It will depend on the number of groups. Your algorithm can then decide whether it wants to continue with the higher orders or stop.

爱情眠于流年 2024-08-03 22:20:30

Daniel 的帖子的评论:

你的算法看起来不错,但我做不到说服自己它确实有效,所以我安装了 scala 并做了一些测试。 不幸的是我没有得到正确的结果。

例如,考虑这种情况:

Group 1:
a1 a2 a3 a4 a5

Group 2:
b1 b2 b3

Group 3:
c1 c2 c3 c4

Group 4:
d1 d2 d3 d4 d5

Rules:
a1 <-> b2
a1 <-> c2
b2 <-> c2
b2 <-> d1
a2 <-> d2

我使用此配置配置了基本筛算法,并得到了以下结果(227 组合):

Without rules => 300
Rules: [1] => 20
Rules: [2] => 15
Rules: [3] => 25
Rules: [4] => 20
Rules: [5] => 12
Order: 1 => 208 (diff: -92)
Rules: [1, 2] => 5
Rules: [1, 3] => 5
Rules: [2, 3] => 5
Rules: [1, 4] => 4
Rules: [2, 4] => 1
Rules: [3, 4] => 5
Rules: [1, 5] => 0
Rules: [2, 5] => 0
Rules: [3, 5] => 1
Rules: [4, 5] => 0
Order: 2 => 234 (diff: 26)
Rules: [1, 2, 3] => 5
Rules: [1, 2, 4] => 1
Rules: [1, 3, 4] => 1
Rules: [2, 3, 4] => 1
Rules: [1, 2, 5] => 0
Rules: [1, 3, 5] => 0
Rules: [2, 3, 5] => 0
Rules: [1, 4, 5] => 0
Rules: [2, 4, 5] => 0
Rules: [3, 4, 5] => 0
Order: 3 => 226 (diff: -8)
Rules: [1, 2, 3, 4] => 1
Rules: [1, 2, 3, 5] => 0
Rules: [1, 2, 4, 5] => 0
Rules: [1, 3, 4, 5] => 0
Rules: [2, 3, 4, 5] => 0
Order: 4 => 227 (diff: 1)
Rules: [1, 2, 3, 4, 5] => 0
Order: 5 => 227 (diff: 0)

***Combinations: 227***

但是在 scala 中使用此代码:

  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)),
                   Group(4, Set(1, 2, 3, 4, 5)))

  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(4, 1)),
                  Rule(GroupOption(1, 2), GroupOption(4, 2)))

我得到了答案 258 >。

我检查了筛法中的计算,它们似乎是正确的。 也许可以修复你的算法? 我真的无法指出哪里出了问题。

Comment to Daniel's post:

Your algorithm looks good but I couldn't convince myself it really worked, so I installed scala and made some tests. Unfortunaly I don't get correct results.

For example consider this case:

Group 1:
a1 a2 a3 a4 a5

Group 2:
b1 b2 b3

Group 3:
c1 c2 c3 c4

Group 4:
d1 d2 d3 d4 d5

Rules:
a1 <-> b2
a1 <-> c2
b2 <-> c2
b2 <-> d1
a2 <-> d2

I configured my basic sieve algorithm with this configuration and got the following results (227 combinations):

Without rules => 300
Rules: [1] => 20
Rules: [2] => 15
Rules: [3] => 25
Rules: [4] => 20
Rules: [5] => 12
Order: 1 => 208 (diff: -92)
Rules: [1, 2] => 5
Rules: [1, 3] => 5
Rules: [2, 3] => 5
Rules: [1, 4] => 4
Rules: [2, 4] => 1
Rules: [3, 4] => 5
Rules: [1, 5] => 0
Rules: [2, 5] => 0
Rules: [3, 5] => 1
Rules: [4, 5] => 0
Order: 2 => 234 (diff: 26)
Rules: [1, 2, 3] => 5
Rules: [1, 2, 4] => 1
Rules: [1, 3, 4] => 1
Rules: [2, 3, 4] => 1
Rules: [1, 2, 5] => 0
Rules: [1, 3, 5] => 0
Rules: [2, 3, 5] => 0
Rules: [1, 4, 5] => 0
Rules: [2, 4, 5] => 0
Rules: [3, 4, 5] => 0
Order: 3 => 226 (diff: -8)
Rules: [1, 2, 3, 4] => 1
Rules: [1, 2, 3, 5] => 0
Rules: [1, 2, 4, 5] => 0
Rules: [1, 3, 4, 5] => 0
Rules: [2, 3, 4, 5] => 0
Order: 4 => 227 (diff: 1)
Rules: [1, 2, 3, 4, 5] => 0
Order: 5 => 227 (diff: 0)

***Combinations: 227***

However using this code in scala:

  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)),
                   Group(4, Set(1, 2, 3, 4, 5)))

  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(4, 1)),
                  Rule(GroupOption(1, 2), GroupOption(4, 2)))

I got the answer 258.

I've checked the calculations in the sieve method and they seem to be right. Perhaps it is possible to fix your algorithm? I can't really put my finger on what is wrong.

柒七 2024-08-03 22:20:30

你的问题是相当不可行的。

  • 计算解决方案的数量是 #P 完全的,即使您将每组单选框限制为两个选项
  • 检查是否有任何与约束一致的选项选择是 NP 完全
  • 检查是否有任何与约束一致的选项选择可以是如果你将每组单选框限制为两个选项(2SAT),那么完成得相当快。

所以一般不要指望多项式算法; 这种算法的存在意味着 P=NP。

你可以做什么:

  • 使用近似算法。 根据我链接的维基百科文章,他们经常受到怀疑。
  • 使用 SAT 求解器 http://en.wikipedia.org/wiki/SAT_solver 或相关计数工具(不幸的是我不知道); 人们已经创建了许多启发式方法,并且程序通常会比您自制的解决方案快得多。 甚至还有SAT比赛,所以这个领域目前扩展得非常快。
  • 检查您是否需要这样的通用性。 也许你的问题有一个额外的假设,这会改变它的复杂性。

证明:

  1. 计算解决方案的数量很容易证明是#P,因此将 2SAT 简化为此就足够了。 采取一些 2SAT 实例,例如

(p1 或不是 p2)和(p2 或不是 p3)

允许用户选择 p1、p2、p3 的值。 您可以轻松地形成约束条件,迫使该问题得以解决。 因此,可能的配置数量 = p1、p2、p3 使布尔公式成立的可能分配数量。

  1. 你可以很容易地检查是否允许某些选项的选择,所以这是NP,所以将3SAT简化为这样就足够了。 举一些 3SAT 实例,例如

(p1 或 p2 或不是 p3)和(p2 或不是 p1 或 p4)

给出选项:

group p1 p1true p1false

group p2 p2false p2true

group p3 p3false p3true

group p4 p4false p4true

group Clause_1 c1a c1b c1c

group Clause_2 c2a c2b c2c

Clause_1 将控制第一个子句:(p1 或 p2 或不是 p3)。 准确地说,如果选择 p1true,则 c1a 将可检查;如果选择 p2true,则 c1b 将可检查;如果选择 p3false,则 c1c 将可检查。 所以约束是:

p1false <-> c1a

p2false <->; c1b

p3true <->; c1c

与clause_2相同,约束为

p2false <-> c2a

p1true <->; c2b

p4false <-> c2c

如果用户能够选择所有答案(因此配置数量 > 0),他将证明变量 p1,...,p4 的某些估值使 3SAT 实例为真。 相反,如果用户无法选择与假设一致的答案(可用配置的数量 = 0),则 3SAT 实例将无法令人满意。 所以知道答案是否是> 0 表示知道 3SAT 实例是否可解。

当然,这种减少是多项式时间的。 证明结束。

如果您对答案可能为 0 的事实不满意:如果您忽略此类配置器,它仍然是 NP 困难的。 (您可以向所有组添加一些“bogus”选项,如果未选择“bogus”,则允许指数级的多种选择。这解释起来比较复杂,所以我将根据要求执行此操作。)

Your problem is rather infeasible.

  • Counting the number of solutions is #P-complete, even if you restrict every group of radioboxes to two options
  • Checking if there is ANY choice of options consistent with constraints is NP-complete
  • Checking if there is ANY choice of options consistent with constraints can be done quite fast, if you restrict every group of radioboxes to two options (2SAT)

So in general don't count on a polynomial algorithm; existence of such algorithms would imply P=NP.

What you can do:

  • Use an approximation algorithm. According to Wikipedia article I linked, they are often suspectible to them.
  • Use a SAT solver http://en.wikipedia.org/wiki/SAT_solver or a related tool for counting (unfortunately I don't know any); people have created many heuristics and that programs will be in general much faster than your homemade solution. There are even SAT competitions, so this field is currently expanding very fast.
  • Check if you need such generality. Maybe your problem has an additional assumptions, and this will change its complexity.

Proofs:

  1. Counting the number of solutions is easily shown to be #P, so it's enough to reduce 2SAT to this. Take some 2SAT instance, like

(p1 or not p2) and (p2 or not p3)

Allow the user to choose values of p1, p2, p3. You can easily form constraints that will force this to be solvable. So the number of possible configurations = number of possible assignments of p1, p2, p3 making the Boolean formula true.

  1. You can easily check if some choice of options is allowed or not, so this is NP, so it's enough to reduce 3SAT to this. Take some 3SAT instance, like

(p1 or p2 or not p3) and (p2 or not p1 or p4)

Give options:

group p1 p1true p1false

group p2 p2false p2true

group p3 p3false p3true

group p4 p4false p4true

group clause_1 c1a c1b c1c

group clause_2 c2a c2b c2c

clause_1 will be controlling the first clause: (p1 or p2 or not p3). Precisely, c1a will be checkable if p1true was chosen, c1b will be checkable if p2true was chosen, c1c will be checkable if p3false was chosen. So the constraints are:

p1false <-> c1a

p2false <-> c1b

p3true <-> c1c

Same with clause_2, constaints are

p2false <-> c2a

p1true <-> c2b

p4false <-> c2c

If the user will be able to choose all answers (so the number of configurations is > 0), he'll prove that there is some valuation of variables p1, ..., p4 that make the 3SAT instance true. And conversely, if the user won't be able to choose answers consistent with assumptions (the number of available configurations = 0), the 3SAT instance will not be satisfable. So knowing if the answer is > 0 means knowing if 3SAT instance is solvable.

Of course this reduction is polynomial-time. End of proof.

If you aren't satisfied with the fact that answer might be 0: it is still NP-hard if you disregard such configurators. (You would add some "bogus" option to all groups and allow exponentially many choices if "bogus" was not chosen. This is more complex to explain, so I'll do it on request.)

终陌 2024-08-03 22:20:30

上面 sdcvvc 的优秀答案中简要提到了这一点,但是蒙特卡洛近似是否足够好? 生成 N 个随机配置(其中选择 N 以使您的实验功效足够高:我不知道足够多的信息来帮助您),然后测试其中有多少与您的约束兼容。 将该比例推断到配置空间的其余部分。

This is mentioned briefly in sdcvvc's excellent answer above, but might a Monte Carlo approximation be good enough? Generate N random configurations (where N is chosen such that the power of your experiment is high enough: I don't know enough to help you here), then test how many of them are compatible with your constraints. Extrapolate that proportion to the rest of your configuration space.

当爱已成负担 2024-08-03 22:20:29

如果您有 N 个选项组,每个选项组都有 Xi 选项 (0<=i),那么

X0*X1*...*X(N-1)

会给您您想要的答案。 换句话说,乘以每组选项的大小。

If you have N option groups, each with Xi options (0<=i<N) then

X0*X1*...*X(N-1)

Gives you the answer you want. In other words, multiply the size of each group of options.

农村范ル 2024-08-03 22:20:29

如果您有 n 个参数,每个参数具有 Ci 个可能值和 m 约束,则配置数量的上限如下(忽略的约束条件)。

N0 = C1 * C2 * ... * Cn

形式为 ci == x => 的单个约束 cj != y 不允许以下数量的配置。

        N
Dk = -------
     Ci * Cj

因此,配置的数量是通过从忽略约束的上限中减去不允许的配置来获得的。

N = prod(Ci, i = 1...n) * (1 - sum(1 / (Cxj * Cyj), j = 1...m))

这里,xjyj 是第 j 约束的两个参数索引。

示例

Parameters    n = 4

Parameter 1   C1 = 4   0 1 2 3 
Parameter 2   C2 = 3   4 5 6 
Parameter 3   C3 = 2   X Y
Parameter 4   C4 = 3   7 8 9

Constraints   m = 2

Constraint 1  c2 == 4 => c3 != X
Constraint 2  c3 == X => c4 != 9

N0 = 4 * 3 * 2 * 3 = 72

D1 = 72 / (3 * 2) = 12
D2 = 72 / (2 * 3) = 12

N = 72 - (12 + 12) = 48

更新

我认为这还不完全正确,因为它没有考虑约束的依赖性。

If you have n parameters with Ci possible values for each parameter and m constraints, a upper bound for the number of configuartions is the following (ignoring the constraints).

N0 = C1 * C2 * ... * Cn

A single constraint of the form ci == x => cj != y disallows the following number of configurations.

        N
Dk = -------
     Ci * Cj

Hence the number of configuration is obtained by subtracting the disallowed configuartions from the upper bound ignoring the constraints.

N = prod(Ci, i = 1...n) * (1 - sum(1 / (Cxj * Cyj), j = 1...m))

Here xj and yj are the both parameter indices from the j-th constraint.

Example

Parameters    n = 4

Parameter 1   C1 = 4   0 1 2 3 
Parameter 2   C2 = 3   4 5 6 
Parameter 3   C3 = 2   X Y
Parameter 4   C4 = 3   7 8 9

Constraints   m = 2

Constraint 1  c2 == 4 => c3 != X
Constraint 2  c3 == X => c4 != 9

N0 = 4 * 3 * 2 * 3 = 72

D1 = 72 / (3 * 2) = 12
D2 = 72 / (2 * 3) = 12

N = 72 - (12 + 12) = 48

UPDATE

I think this is not yet completly correct because it does not account for dependencies of the constraints.

温馨耳语 2024-08-03 22:20:29

对于一般情况没有捷径。它并不像你想象的那么糟糕。 请参阅下面的“重新思考”。

2^n 真的那么糟糕吗? 我们在这里谈论多少排除规则? 您实际上只需为每个配置器执行一次此操作,除非规则/选项集不断动态变化并需要动态重新计算。 如果真的有大量规则,那么我不会寻求精确的解决方案 - 只考虑 k 阶交集并说“组合数至少/最多......”。 可能还有其他筛选方法可以让您快速得出答案的范围。

另请记住:如果您只考虑实际需要的排除项,那么 2^n 只是一个上限,对于任何实际场景,您的实际计算次数可能会少得多。 也就是说,如果 C[1,2] 为零,则 C[1,2,...] 也为零。 考虑一下:对于每个约束,如果约束集共享任何共同选项,则将它们“聚集”在一起。 很明显,您的实际运行时间将由最大“团块”的大小定义(是的,可能与 n 一样大)。


重新思考:在大多数情况下,C[x,y] 将为零。 约束只能与涉及不同组的其他约束重叠。 换句话说,(x1<->y1)只能与(x1<->z1)或(y1<->z2)或其他东西重叠,而不能与(x1<->y2)重叠。 类似地,一组约束只能与一个新组重叠:(x1<->y1) 与 (y1<->z2) 的组合与 (x3<->) 没有交互;z2)(x 组已固定为 x1)。 您只需考虑包含/排除,其中添加到组合中的每个规则都会将以前未触及的组添加到组合中。 所以你实际上是 O(2G),其中 G 是组的数量(也可能是基于组大小的不同界限)。 更易于管理!

There are no shortcuts here for the general case. It's not as bad as you think. See "Rethinking" below.

Is 2^n really so bad? How many exclusion rules are we talking about here? You really only have to do this once for each configurator, unless the set of rules/options is constantly changing on the fly and requiring dynamic recalculation. If there's really a huge number of rules, then I would not seek an exact solution -- only consider the kth-order intersections and say "number of combinations is at least/most ...". There might be other sieve methods that will let you quickly derive bounds on the answer.

Also keep in mind: if you only consider the exclusions you actually need, then 2^n is merely an upper bound, and your actual number of calculations may be significantly less for any actual scenarios. That is, if C[1,2] is zero, then so is C[1,2,...]. Consider this: for each constraint, "clump" sets of constraints together if they share any options in common. It's clear that your real running time is going to be defined by the size of the largest "clump" (which, yes, may be as big as n).


Rethinking: C[x,y] is going to be zero in most cases. A constraint can only overlap with other constraints involving a different group. In other words, (x1<->y1) can only overlap with (x1<->z1) or (y1<->z2) or something, not (x1<->y2). Similarly, a set of constraints can only overlap with a new group: the combination of (x1<->y1) with (y1<->z2) has no interaction with (x3<->z2) (the x group is already fixed at x1). You only have to consider inclusion/exclusion where each rule you add to the combination adds a previously-untouched group to the mix. So you are actually O(2G), where G is the number of groups (and also perhaps a different bound based on the size of the groups). Much more manageable!

潦草背影 2024-08-03 22:20:29

好吧,我无法得到大约 2 ^ N,但我可以减少样本集。 为此,我们将计算“组合约束”。 组合约束是一种约束,其中如果选择了左侧的所有选项,则无法选择右侧的任何选项,但不能应用基于左侧选项的其他约束。

我们需要从一组约束计算一组所有可能的组合约束。 虽然没有必要,但如果右手组小于左手组,我们将通过交换左右手来“修复”现有约束。 这可能会减少一些组合约束,尽管可以使用更好的启发式进行交换。

我们还需要计算可以为每个组任意选择的选项的“最小集”。 该最小集是通过从可用选项列表中删除出现在组合约束左侧的任何选项来计算的。

接下来是一个算法,但我并没有证明它可以正确计算 CC。 我将证明,如果确实如此,那么它们可以用来计算可能组合的数量。

  1. 修复约束,使左手组小于或等于右手组。
  2. 编写约束:

    1. 按左手排序约束
    2. 依次,对于每个约束:

      1. 用同一左手折叠约束及其后面的所有约束,将 x1 <-> y1,x1 <-> y2 ... x1 <-> yN 进入 Set(x1) <-> 设置(y1 ... yN)
      2. 将折叠约束与每个已折叠约束组合起来,如果:
        • x1 不在已折叠约束的右侧
        • x1 与左侧任何元素都不属于同一组
      3. 将折叠约束及其所有组合添加到折叠约束集中

  3. 计算最小值通过获取所有选项并删除出现在固定约束左侧的选项来设置。

现在您可以使用下面的公式计算组合数。 我们将 CC 称为组合约束。 那么组合数为:

C(Mininum Set) + CCC1 + ... + CCCn

其中:

  • C(Minimum Set) 是最小集合可能的组合数。
  • CCCx 是通过取最小集、用该选项替换 CCx 左侧有选项的任何组,然后删除 CCx 右侧的任何选项而可能的组合数。

请注意,该表达式纯粹是相加的。 这意味着,要使该表达式产生预期结果,必须满足以下两个条件:

  1. 它的任何两项都不能包含相同的组合。
  2. 所有组合都必须由这些术语来解释。
  3. 任何项都不能产生无效的组合。

对于第一个证明,请注意不存在同一左手的两个不同的 CC。 如果两个 CC 具有相同的左手但不同的右手,则意味着存在必须应用于其中一个 CC 的附加约束,或者应用于另一个 CC 的无效约束。

由于不存在具有相同左手的两个CC,并且根据定义,最小集合不包含任何CC的左手,因此可以通过至少一个选项来区分任何两个CC,该选项为一个选择而另一个不选择。 因此,任何两个 CC 都不会产生相同的组合。

对于第二个证明,请注意,根据定义,CC 集包含左侧选项的所有有效组合。

假设有一个组合既没有出现在最小集合中,也没有出现在CC集合中。 如果该组合不包含任何左侧选项,则根据定义,它必须是最小集的组合。 因此,它必须包含左手的选项。

由于这组 CC 包含左手的所有有效组合,因此有一个 CC 具有相同的左手选项。 因此,该组合必须具有一个不包含在该 CC 的任何组合中的选项。 但该 CC 中唯一未包含的选项是出现在其他 CC 左侧的选项,以及必须通过约束从其中排除的选项。 既然情况都不是这样,那么这种组合就不可能存在。

对于第三个证明,我们首先考虑最小集。 最小集不包含任何组左侧的任何选项。 由于所有约束都在一个左手选项和一个右手选项之间,因此没有任何约束适用于最小集合。

现在让我们考虑一下 CC。 根据定义,CC 具有左手选项的有效组合。 任何与左手不兼容的选项都必须出现在右手中,并且右手中的任何选项都必须从最小集中删除。 由于最小集合上没有选项本身之间不兼容,因此 CC 上的任何组合都不会存在不满足的约束。

证明就结束了。

让我们通过评论中的示例来看看这是如何应用的:

G1: x1, x2, x3 
G2: y1, y2 
G3: z1, z2, z3 

R1: x1 <-> y2 
R2: x3 <-> y2 
R3: y1 <-> z1 
R4: y2 <-> z2 
R5: y2 <-> z3

CC1: {x1} <-> {y2}
CC2: {x3} <-> {y2}
CC3: {y1} <-> {z1}
CC4: {x1, y1} <-> {y2, z1}
CC5: {x3, y1} <-> {y2, z1}
CC6: {y2} <-> {z2, z3}

让我们简要思考一下不在列表中的组合组:

R1&R2: {x1, x3} <-> {y2} -- not in the list because x1 and x3 belongs to the same
                            group
R1&R5: {x1, y2} <-> {y2} -- not in the list because the left hand of R2, y2
                            appears in the right hand of R1

现在,让我们看看每组中可能有哪些选项:

Minimum Set: (x2), (), (z1, z2, z3)    
CC1: (x1), (), (z1, z2, z3) -- replace G1 with x1, remove y2 from G2
CC2: (x3), (), (z1, z2, z3) -- replace G1 with x3, remove y2 from G2
CC3: (x2), (y1), (z2, z3)   -- replace G2 with y1, remove z1 from G3
CC4: (x1), (y1), (z2, z3)   -- replace G1 with x1, G2 with y1, remove y2 and z1
CC5: (x3), (y1), (z2, z3)   -- replace G1 with x3, G2 with y1, remove y2 and z1
CC6: (x2), (y2), (z1)       -- replace G2 with y2, remove z2 and z3 from G3

现在,让我们添加一些内容:

C(Minimum Set) = 1 * 0 *3 = 0
CCC1 = 1 * 0 * 3 = 0
CCC2 = 1 * 0 * 3 = 0
CCC3 = 1 * 1 * 2 = 2
CCC4 = 1 * 1 * 2 = 2
CCC5 = 1 * 1 * 2 = 2
CCC6 = 1 * 1 * 1 = 1

C(Minimum Set) + CCC1 + CCC2  + CCC3 + CCC4 + CCC5 + CCC6
0              + 0    + 0     + 2    + 2    + 2    + 1    = 7

我将在这里添加一个进一步的想法。 尽管 5 条规则只有 6 个 CCC,远少于预期的 32 个项,但这些 CCC 是用 2^N 最差时间计算的,因为对于每条规则,您必须将其与迄今为止创建的所有 CCC 进行比较和组合。 您可以将它们视为二进制数,如果规则被组合,则设置一个位,如果不组合则不设置。

然而,不兼容的组合会立即被丢弃,因此对于组合的每个新规则,不会在已经被视为无效的组合上浪费时间。 此外,通过预先对规则进行排序,同一组中的连续规则可能会被丢弃,甚至无需测试与正确数据结构的不兼容性。

正如这个特定示例所示,平均时间可能比 2^N 好得多。

替代算法和考虑因素

有一些关于 2-SAT 和 3-SAT 的讨论。 在我看来,这是一个 2-SAT 问题,因为每个都约束一个 <-> 。 b 实际上是一个子句“!a || !b”。 因此,所有约束一起可以写为“(!x1 || !y2) && (!x1 || !z4) && (!y2 && !z3)”等。这意味着您可以“解决”它在某种意义上您可以找出每个选项是否有一个布尔值赋值可以将其变为真。 Aspall、Plass 和 Tarjan 对此有一个线性算法,并附有幻灯片 这里

但是知道约束是否可以解决并不是我们所要求的。 问题是在保持 2-SAT 问题成立的同时,可以设置所有选项的数量

现在,有一些有效的算法可以计算满足 2-SAT 问题的方法数量。 例如,本文提出了一种在 1.2561n 中运行的算法。 但即使这样也对我们没有帮助,因为我们需要知道什么解决方案才能计算出满足该解决方案的组合数量。

根据维基百科,本文有一个算法可以有效地枚举所有解决方案,这就是我们想。 但如果计数已经是指数级的,那么枚举也将是指数级的。 也许比 2n 更好,但仍然是指数级的。

如果我们枚举 2-SAT 问题的所有解决方案,则每组的组合数量由 1 乘以自由选项(未选择任何选项的每组中未出现在任何约束中的选项)的数量给出通过解决方案。

例如,采用之前的一组组和约束。 2-SAT 问题(包括互斥)是:

(!x1 || !y2) && (!x3 || !y2) && (!y1 || !z1) && (!y2 || !z2) && (!y2 || !z3) &&
(!x1 || !x3) && (!y1 || !y2) && (!z1 || !z2) && (!z1 || !z3) && (!z2 || !z3)

第一行是五个规则。 第二行是约束规则中出现的同一组中所有选项的互斥。

这个 2-SAT 问题的解是:

x1    x3    y1    y2    z1    z2    z3    Combinations
true  false true  false false true  false 1
true  false true  false false false true  1
true  false true  false false false false 0
true  false false false true  false false 0
true  false false false false true  false 0
true  false false false false false true  0
true  false false false false false false 0
false true  true  false false true  false 1
false true  true  false false false true  1
false true  true  false false false false 0
false true  false false true  false false 0
false true  false false false true  false 0
false true  false false false false true  0
false true  false false false false false 0
false false true  false false true  false 1
false false true  false false false true  1
false false true  false false false false 0
false false false true  true  false false 1
false false false true  false false false 0
false false false false true  false false 0
false false false false false true  false 0
false false false false false false true  0
false false false false false false false 0

在前两个解中,没有没有选择选项的组,因此组合数为 1。第三个解没有为组 G3 选择选项,所以我们将 1 乘以0. 以“false false”开头的行中,没有为组G1选择任何选项,并且有一个自由选项:x2。 因此,我们将它们乘以 1,如果没有为 G2 或 G3 选择选项,则乘以 0(在这种情况下,自由选项的数量为 0)。

现在的问题是,我如何强制在每组中选择一个选项,同时仍然声称自己是 2-SAT。 如前所述,该问题有两个隐含的约束:对于每一组,必须选择一个且只能选择一个选项。 这两个约束可以写为:

    x1 || x<子>2 || x3(对于 x 组,带有选项 x1 .. x3
   (!x1 || !x2) && (!x1 || !x3) && (!x2 || !x3)

对于任何重要情况,后者约束为 2-SAT,前者约束为 3-SAT。 碰巧,我不强制执行第一个约束,但计数会变为 0。计数算法应该如下所示:

  • 对于无约束组合,将每组中无约束选项的数量相互乘以。
  • 对于完全约束组合,添加以下计数的结果:
    • 对于每个解决方案,将每组中无约束选项的数量相乘,且没有选项被评估为“true”。

因此,对于其中至少有一个无约束选项的每一组,选择是隐式且匿名的。 对于所有选项都是某个约束的一部分的每个组,如果没有选择任何选项,则该组的计数将变为 0,因此该解决方案的组合数也将变为 0。

这感觉像是从有效的 > 中欺骗了问题。 2-SAT 约束。 毕竟,如果这是可能的,那么只需枚举 2-SAT 部分的解决方案,然后丢弃所有不满足 3-SAT 部分的解决方案即可解决 3-SAT 问题。 唉,我可以识别出一个根本区别:

  • 问题的 2-SAT 部分未解决的所有谓词都不受任何进一步的约束。

鉴于对子句的这种相当严格的约束,我们可以通过枚举 2-SAT 显式约束的解决方案来解决这个问题。

如果有人想进一步追求这一点,那就继续吧。 我对我建议的 2n 解决方案的改进感到满意。

Ok, I can't get around 2 ^ N, but I can reduce the sample set. To do this, we'll compute "Composed Constraints". A Composed Constraint is a constraint where, if the all options on the left side is selected, then none of the options on the right side can be selected, but no other constraint based on the left side options may apply.

We need to compute a set of all possible Composed Constraints from a set of constraints. While not necessary, we'll "fix" the existing constraints by swapping left and right hands if the group of the right hand is smaller than the group of the left. This may reduce some composed constraints, though better heuristics are possible for swapping.

We also need to compute a "minimum set" of options that can be arbitrarily chosen for each group. This minimum set is computed by removing from the list of available options any options appearing on the left hand of a composed constraint.

An algorithm follows, but I'm not proving it computes CCs correctly. I will prove that, if it does, then they can be used to compute the number of possible combinations.

  1. Fix the constraints so that the group of the left hand is less than, or equal to the group of the right hand.
  2. Compose the constraints:

    1. Sort constraints by left hand
    2. Sequentially, for each constraint:

      1. Fold the constrain with all constraints that follow it with the same left hand, turning x1 <-> y1, x1 <-> y2 ... x1 <-> yN into Set(x1) <-> Set(y1 ... yN)
      2. Compose the folded constraint with each already folded constraint, if:
        • x1 is not in the right hand of the already folded constraint
        • x1 is not in the same group of any element in the left hand
      3. Add the folded constraint and all it's compositions to the set of folded constraints
  3. Compute the Minimum Set, by taking all options and removing the ones appearing in the left hand of the fixed constraints.

Now you can compute the number of combinations with the formula below. Let's call CC a composed constraint. Then the number of combinations is:

C(Mininum Set) + CCC1 + ... + CCCn

Where:

  • C(Minimum Set) is the number of combinations possible with the minimum set.
  • CCCx is the number of combinations possible by taking the minimum set, replacing any groups for which there's an option on the left hand of CCx with that option, and then removing any options on the right hand of CCx.

Note that the expression is purely additive. This means that, for that expression to yield the expected result, the following two conditions must be true:

  1. No two terms of it may contain the same combination.
  2. All combinations must be accounted for by these terms.
  3. No invalid combinations may be yielded by any term.

For the first proof, note that there are no two distinct CCs with the same left hand. If two CCs had the same left hand but different right hands, that would mean there was an addition constraint that must apply to one of the CCs, or an invalid constraint being applied to the other.

Since there are no two CCs with the same left hand, and the minimum set does not contain the left hand of any CC by definition, then any two CC can be distinguished by at least one option which is selected for one but not for the other. Therefore, no two CCs may yield the same combination.

For the second proof, note that the set of CCs contains, by definition, all valid combinations for options on the left hand.

Assume that there is one combination that does not appear in either the minimum set nor the set of CCs. If this combination does not contain any left-hand option, then it must be a combination from the minimum set, by definition. Therefore, it must contain options from the left hand.

Since the set of CCs contains all valid combinations for the left hand, then there is one CC with the same left-hand options. This combination must have, therefore, an option that is not included in any combination for that CC. But the only options not included in that CC are the ones appearing in the left hand for other CCs, and the ones that must be excluded from it by constraints. Since neither may be the case, then this combination cannot exist.

For the third proof, let's first consider the minimum set. The minimum set does not contain any option in the left hand of any group. Since all constraints are between one left and one right hand option, no constraint applies to the minimum set.

Now let's consider the CCs. A CC has a valid combination of left hand options by definition. Any option that is incompatible with that left hand must appear in the right hand, and any option from that right hand must be removed from the minimum set. Since no options on the minimum set where incompatible between themselves to begin with, there can be no unsatisfied constraint in any combination on a CC.

And that ends the proof.

Let's see how this applies with an example from the comments:

G1: x1, x2, x3 
G2: y1, y2 
G3: z1, z2, z3 

R1: x1 <-> y2 
R2: x3 <-> y2 
R3: y1 <-> z1 
R4: y2 <-> z2 
R5: y2 <-> z3

CC1: {x1} <-> {y2}
CC2: {x3} <-> {y2}
CC3: {y1} <-> {z1}
CC4: {x1, y1} <-> {y2, z1}
CC5: {x3, y1} <-> {y2, z1}
CC6: {y2} <-> {z2, z3}

Let's ponder briefly on composed groups not in the list:

R1&R2: {x1, x3} <-> {y2} -- not in the list because x1 and x3 belongs to the same
                            group
R1&R5: {x1, y2} <-> {y2} -- not in the list because the left hand of R2, y2
                            appears in the right hand of R1

Now, let's see what options are possible in each set:

Minimum Set: (x2), (), (z1, z2, z3)    
CC1: (x1), (), (z1, z2, z3) -- replace G1 with x1, remove y2 from G2
CC2: (x3), (), (z1, z2, z3) -- replace G1 with x3, remove y2 from G2
CC3: (x2), (y1), (z2, z3)   -- replace G2 with y1, remove z1 from G3
CC4: (x1), (y1), (z2, z3)   -- replace G1 with x1, G2 with y1, remove y2 and z1
CC5: (x3), (y1), (z2, z3)   -- replace G1 with x3, G2 with y1, remove y2 and z1
CC6: (x2), (y2), (z1)       -- replace G2 with y2, remove z2 and z3 from G3

Now, let's add things up:

C(Minimum Set) = 1 * 0 *3 = 0
CCC1 = 1 * 0 * 3 = 0
CCC2 = 1 * 0 * 3 = 0
CCC3 = 1 * 1 * 2 = 2
CCC4 = 1 * 1 * 2 = 2
CCC5 = 1 * 1 * 2 = 2
CCC6 = 1 * 1 * 1 = 1

C(Minimum Set) + CCC1 + CCC2  + CCC3 + CCC4 + CCC5 + CCC6
0              + 0    + 0     + 2    + 2    + 2    + 1    = 7

I'll add one further thought here. Even though there are only 6 CCCs for 5 rules, much less than the 32 terms expected otherwise, these CCCs are computed with 2^N worst time, since, for each rule, you must compare and combine it to all CCCs created so far. You may think of them as binary numbers, where a bit is set if the rule is being combined, and not set if not.

However, incompatible combinations are discard right away, so that for each new rule being combined, no time is lost on combinations already deemed invalid. Also, by sorting the rules before-hand, successive rules in the same group may be discarded without even testing for incompatibility, with the right data structures.

As this particular example shows, the average time can be much better than 2^N.

Alternative Algorithms and Considerations

There's some talk of 2-SAT and 3-SAT going around. It seems, to me, that this is a 2-SAT problem, in the sense that every constrain a <-> b is actually a clause "!a || !b". So all constrains together can just be written as "(!x1 || !y2) && (!x1 || !z4) && (!y2 && !z3)", etc. That means you can "solve" it in the sense that you can find out if there is a boolean assignment to each option that will turn this true. There is a linear algorithm for this by Aspall, Plass and Tarjan, with a slide presentation here.

But knowing whether the constrains can be solved or not is not what was asked. What was asked is the number of ways all options can be set while keeping the 2-SAT problem true.

Now, there are efficient algorithms for counting the number of of ways to satisfy a 2-SAT problem. For instance, this paper presents an algorithm that runs in 1.2561n. But even this won't help us, as we need to know what the solutions are to be able to compute the number of combinations that satisfy that solution.

According to Wikipedia, this paper has an algorithm which efficiently enumerate all solutions, which is what we want. But if counting is already exponential, so will be enumerating. Better than 2n, perhaps, but still exponential.

If we do enumerate all solutions to the 2-SAT problem, the number of combinations for each group is given by 1 multiplied by the number of free options, options that do not appear in any constraint, of each group for which no option was selected by the solution.

For example, taking the previous set of groups and constraints. The 2-SAT problem, including mutual exclusion, is:

(!x1 || !y2) && (!x3 || !y2) && (!y1 || !z1) && (!y2 || !z2) && (!y2 || !z3) &&
(!x1 || !x3) && (!y1 || !y2) && (!z1 || !z2) && (!z1 || !z3) && (!z2 || !z3)

The first line are the five rules. The second line is the mutual exclusion of all options in the same group that appear in the constraint rules.

The solutions to this 2-SAT problems are:

x1    x3    y1    y2    z1    z2    z3    Combinations
true  false true  false false true  false 1
true  false true  false false false true  1
true  false true  false false false false 0
true  false false false true  false false 0
true  false false false false true  false 0
true  false false false false false true  0
true  false false false false false false 0
false true  true  false false true  false 1
false true  true  false false false true  1
false true  true  false false false false 0
false true  false false true  false false 0
false true  false false false true  false 0
false true  false false false false true  0
false true  false false false false false 0
false false true  false false true  false 1
false false true  false false false true  1
false false true  false false false false 0
false false false true  true  false false 1
false false false true  false false false 0
false false false false true  false false 0
false false false false false true  false 0
false false false false false false true  0
false false false false false false false 0

In the first two solutions, there are no groups without a selected option, so the number of combinations is 1. The third solution has no option selected for the group G3, so we multiply 1 by 0. In the lines that start with "false false", there are no option selected for group G1, and one free option: x2. So we multiply 1 by 1 for them, and by 0 if there are no option selected for G2 or G3 (in which case the number of free options is 0).

Now, there is the question of how do I enforce one option being chosen in each group and still claim to be 2-SAT. The problem, as stated, has two implicit constraints: for each group, there must be one, and only one, option selected. These two constrains can be written as:

    x1 || x2 || x3 (for group x with options x1 .. x3)
    (!x1 || !x2) && (!x1 || !x3) && (!x2 || !x3)

The later constrain being 2-SAT, the former being 3-SAT for any non-trivial case. As it happens, I do not enforce the first constraint, but the count then becomes 0. The counting algorithm should go like this:

  • For the constraint-less combinations, multiply the number of constraint-less options in each group by each other.
  • For the constraint-full combinations, add the result of the following counts:
    • For each solution, multiply the number of constraint-less options in each group without an option evaluated to "true" by each other.

So, for every group in which there is at least one constraint-less option, the selection is implicit and anonymous. For every group in which all options are part of some constraint, if no option was selected then the count for that group becomes 0, and, therefore, the number of combinations for that solution becomes 0 as well.

This feels like cheating the problem out of a valid > 2-SAT constraint. After all, if this was possible, then the 3-SAT problem could be solved just by enumerating the solutions to the 2-SAT part of it, and then discarding every one which does not satisfy the 3-SAT part of it. Alas, there is one fundamental difference I can identify:

  • All predicates not solved by the 2-SAT part of the problem are free from any further constraints.

Given this rather restrictive constraint on the clauses, we can solve this problem by enumerating solutions to the 2-SAT explicit constraints.

If anyone wants to pursue this further, go ahead. I'm satisfied with the improvement on the 2n solution I suggested.

梦幻的味道 2024-08-03 22:20:29

编辑

该算法不正确。 我在另一篇文章中提出了一个替代答案,在最坏的情况下仍然是 2^N,但否则可能会给出更好的结果。

这在所选示例中有效,因为 y2 是 x1 的排除集的一部分,并且前两个约束基于 x1。 尽管如此,我现在看到了需要做什么。 它仍然接近 2^N,但有一些优化可以带来显着的收益。

为了修复这个算法,组合规则必须采用 set(ox) <-> 的形式。 设置(哦)。 为了组合它们,对于您组合的左侧 oX 的每个约束,您还可以使用您已经组合的每个规则对其进行其他组合,如果 oX 不是组合规则右侧的一部分,也不是组与组的左侧相同

对于完全独立的约束,这是 2^N。 否则,您可以通过以下方式减少 N:

  • 将约束与常见的左手
  • 不计算互斥的规则组合统一起来,这分为:
    • 不合并同一组中选项的规则
    • 不组合规则,其中一个规则的左侧出现在另一个规则的右侧

我认为修复此算法不值得。 它占用的内存相当大,而且它的顺序与我的替代答案非常相同,而我的替代答案要轻得多。

结束编辑

让我们扭转局面。 对于算法来说怎么样:

  1. 根据需要修复规则以确保对于规则 o1 <-> o2,组(o1)< group(o2)
  2. 通过折叠所有规则来计算“组合”规则 oX <-> o? 转换为单个规则 oX <-> Set(o?)
  3. 通过从其中删除每个规则的左侧选项来计算
  4. 一组“干净”的组 通过将左侧选项的组替换为左侧选项本身,并从其他组中减去规则右侧的选项。
  5. 对于每组组,通过乘以组中每组中的选项数量来计算组合数量。
  6. 将第 5 步中的所有结果相加。

让我们看看它的工作原理:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints (already fixed):
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

Composed rules:
x1 <-> (y2, z4)
y2 <-> (z2)

Clean set of groups:
x2x3x4x5, y1y3, z1z2z3z4

Alternate sets:
x1, y1y3, z1z2z3 (first composed rule)
x2x3x4x5, y2, z1z3z4 (second composed rule)

Totals:
4 * 2 * 4 = 32
1 * 2 * 3 = 6
4 * 1 * 3 = 12

Total: 50

现在,也许这个算法是不正确的。 现在,我无法清晰地思考来证明它是正确的还是其他的——我已经太接近这个问题太久了。 但让我们检查一下这个例子:

c(no rules) = 60
c1 => 4
c2 => 3
c3 => 5
c12 => 1
c13 => 1
c23 => 0
c123 => 0

c(no rules) - c1 - c2 - c3 + c12 + c13 + c23 - c123 = 50

如果我的算法是正确的,它似乎是多项式的。 再说一遍,现在我的思考还不够清晰,我需要考虑集合中操作的大O。

这是它的 Scala 实现:

case class GroupOption(id: Int, option: Int)
case class Group(id: Int, options: Set[Int])
case class Rule(op1: GroupOption, op2: GroupOption)
case class ComposedRule(op: GroupOption, set: Set[GroupOption])

object ComputeCombinations {
  def fixRules(rules: Set[Rule]) = {
    rules map (rule => if (rule.op1.id > rule.op2.id) Rule(rule.op2, rule.op1) else rule)
  }

  def ruledOptions(id: Int, rules: Set[Rule]): Set[Int] = (
    rules 
    filter (rule => rule.op1.id == id)
    map (rule => rule.op1.option)
  )

  def cleanseSet(groups: Set[Group], rules: Set[Rule]) = {
    groups map (group => 
      Group(group.id, group.options -- ruledOptions(group.id, rules)))
  }

  def composeRules(rules: Set[Rule]): Set[ComposedRule] = Set(
    (
      rules.toList
      sort (_.op1.id < _.op1.id)
      foldLeft (List[ComposedRule]())
      ) { (list, rule) => list match {
        case ComposedRule(option, set) :: tail if option == rule.op1 =>
          ComposedRule(option, set + rule.op2) :: tail
        case _ => ComposedRule(rule.op1, Set(rule.op2)) :: list
      }} : _*
  )

  def subset(groups: Set[Group], composedRule: ComposedRule) = (
    groups
    filter (_.id != composedRule.op.id)
    map (group => Group(group.id, group.options -- 
                                  (composedRule.set 
                                   filter (_.id == group.id)
                                   map (_.option)
                                  )))
  )

  def subsets(groups: Set[Group], composedRules: Set[ComposedRule]) = (
    composedRules map (composedRule => subset(groups, composedRule))
  )

  def combinations(groups: Set[Group]) = (
    groups.toList map (_.options.size) reduceLeft (_*_)
  )

  def allCombinations(groups: Set[Group], rules: Set[Rule]) = {
    val fixedRules = fixRules(rules)
    val composedRules = composeRules(fixedRules)
    val cleanSet = cleanseSet(groups, fixedRules)
    val otherSets = subsets(cleanSet, composedRules)
    val allSets = otherSets + cleanSet
    val totalCombinations = allSets.toList map (set => combinations(set)) reduceLeft (_+_)
    totalCombinations
  }
}

object TestCombinations {
  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)))
  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 4)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)))
  def test = ComputeCombinations.allCombinations(groups, rules)
}

EDIT

This algorithm is incorrect. I have presented an alternate answer in another post which is still 2^N in the worse case, but might give better results otherwise.

This one works in the example choosen because y2 is part of the exclusion set of x1, and the two first constrains are based on x1. Still, I now see what needs to be done. It's still close to 2^N, but there are optimizations which can lead to significant gains.

To fix this algorithm, the composed rules must be of the form set(ox) <-> set(oy). To compose them, for each constrain with left-hand oX you compose, you also make other compositions of it with each rule you already composed, if oX is not part of the right-hand side of the composed rule, nor it's group is the same as the left-hand side of the group.

For completely independent constrains, this is 2^N. Otherwise, you are reducing N by doing:

  • unifying constrains with a common left-hand
  • not computing rule combinations which are mutually exclusive, this is divided in:
    • not combining rules for options in the same group
    • not combining rules where the left side of of one appears on the right side of the other

I don't think fixing this algorithm is worth it. It is rather memory-heavy, and it would have the very same order as my alternate answer, which is much lighter.

End Edit

Let's turn this around. How about this for an algorithm:

  1. Fix rules as necessary to ensure that for rule o1 <-> o2, group(o1) < group(o2)
  2. Compute "composed" rules by folding all rules oX <-> o? into a single rule oX <-> Set(o?)
  3. Compute a "clean" set of groups by removing from them the left option of every rule
  4. Compute alternate sets from the clean set, one for each composed rule, by replacing the group of the left option with the left option itself, and subtracting from the other groups the options of the right hand of the rule.
  5. For each set of groups, compute the number of combinations by multiplying the number of options in each group in the set.
  6. Add all the results from step 5.

Let's see this at work:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints (already fixed):
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

Composed rules:
x1 <-> (y2, z4)
y2 <-> (z2)

Clean set of groups:
x2x3x4x5, y1y3, z1z2z3z4

Alternate sets:
x1, y1y3, z1z2z3 (first composed rule)
x2x3x4x5, y2, z1z3z4 (second composed rule)

Totals:
4 * 2 * 4 = 32
1 * 2 * 3 = 6
4 * 1 * 3 = 12

Total: 50

Now, maybe this algorithm is incorrect. Right now, I can't think clearly enough to prove it correct or otherwise -- I have been too close to the problem for too long. But let's check against the example:

c(no rules) = 60
c1 => 4
c2 => 3
c3 => 5
c12 => 1
c13 => 1
c23 => 0
c123 => 0

c(no rules) - c1 - c2 - c3 + c12 + c13 + c23 - c123 = 50

If my algorithm is correct, it seems to be polynomial. Again, right now I can't think clearly enough, and I'd need to consider the big-O of the manipulation in the sets.

Here is an Scala implementation of it:

case class GroupOption(id: Int, option: Int)
case class Group(id: Int, options: Set[Int])
case class Rule(op1: GroupOption, op2: GroupOption)
case class ComposedRule(op: GroupOption, set: Set[GroupOption])

object ComputeCombinations {
  def fixRules(rules: Set[Rule]) = {
    rules map (rule => if (rule.op1.id > rule.op2.id) Rule(rule.op2, rule.op1) else rule)
  }

  def ruledOptions(id: Int, rules: Set[Rule]): Set[Int] = (
    rules 
    filter (rule => rule.op1.id == id)
    map (rule => rule.op1.option)
  )

  def cleanseSet(groups: Set[Group], rules: Set[Rule]) = {
    groups map (group => 
      Group(group.id, group.options -- ruledOptions(group.id, rules)))
  }

  def composeRules(rules: Set[Rule]): Set[ComposedRule] = Set(
    (
      rules.toList
      sort (_.op1.id < _.op1.id)
      foldLeft (List[ComposedRule]())
      ) { (list, rule) => list match {
        case ComposedRule(option, set) :: tail if option == rule.op1 =>
          ComposedRule(option, set + rule.op2) :: tail
        case _ => ComposedRule(rule.op1, Set(rule.op2)) :: list
      }} : _*
  )

  def subset(groups: Set[Group], composedRule: ComposedRule) = (
    groups
    filter (_.id != composedRule.op.id)
    map (group => Group(group.id, group.options -- 
                                  (composedRule.set 
                                   filter (_.id == group.id)
                                   map (_.option)
                                  )))
  )

  def subsets(groups: Set[Group], composedRules: Set[ComposedRule]) = (
    composedRules map (composedRule => subset(groups, composedRule))
  )

  def combinations(groups: Set[Group]) = (
    groups.toList map (_.options.size) reduceLeft (_*_)
  )

  def allCombinations(groups: Set[Group], rules: Set[Rule]) = {
    val fixedRules = fixRules(rules)
    val composedRules = composeRules(fixedRules)
    val cleanSet = cleanseSet(groups, fixedRules)
    val otherSets = subsets(cleanSet, composedRules)
    val allSets = otherSets + cleanSet
    val totalCombinations = allSets.toList map (set => combinations(set)) reduceLeft (_+_)
    totalCombinations
  }
}

object TestCombinations {
  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)))
  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 4)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)))
  def test = ComputeCombinations.allCombinations(groups, rules)
}
酒废 2024-08-03 22:20:29

这可能不是一个直接有用的答案,所以请随意忽略它......但是; 我自己目前没有使用类似的系统; 坦率地说,除了简单的例子,我不确定尝试计算有效组合的数量是否有用。 举个例子,我目前正在研究的模型有(例如)18000 个候选项目,分布在 80 多个选择中,有些是单选,有些是多选。 除了最小的模型之外,知道这个数字没有任何好处,因为您永远不会将其写为完整的真值表; 您几乎被迫按需运行规则(即删除不再有效的任何内容,根据需要自动选择任何内容,并检查是否没有违反规则)。 这不一定是问题; 我当前的代码在约 450 毫秒内处理此模型(作为 Web 服务),其中大部分实际上是处理输入/输出的 xml 所花费的时间。 如果输入/输出不是 xml,我认为大约是 150 毫秒,这很酷。

我很想说服我的雇主开源该引擎; 不过,那是另一天的战斗。

This might not be a directly useful answer, so feel free to ignore it... however; I'm currently working no a similar system myself; and frankly other than for trivial examples I'm not sure it is useful to try to calculate the number of valid combinations. As an example, the models I'm working on at the moment have (for example) 18000 candidate items spread over 80+ selections, some single-select / some multi-select. Beyond the smallest models, there is no benefit in knowing the number, as you would simply never write it as a full truth table; you are pretty-much forced to run the rules (i.e. remove anything that is no longer valid, auto-select anything as appropriate, and check that no rules are broken) on-demand. Which isn't necessarily a problem; my current code processes this model (as a web-service) in ~450ms, and most of that is actually the time spent processing xml for input/output. If the input/output wasn't xml, I think it would be ~150ms, which is cool.

I'd love to convince my employer to open-source the engine; that is a battle for another day, though.

不甘平庸 2024-08-03 22:20:29

难道不是 x^n,其中 n 是选项数,x 是每个选项的选项数吗?

Won't it just be x^n, where n is the number of options and x is the number of choices per option?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文