Finding the subsets of an array in PHP

Posted on 2024-11-08 17:34:28

I have a relational schema R with attributes (A B C D), along with a set of functional dependencies.

Now I need to determine the closure of every possible subset of R's attributes. That's where I am stuck. I need to learn how to find the (non-repeating) subsets in PHP.

My array is stored like this:

$ATTRIBUTES = array('A', 'B', 'C', 'D');

so my subsets should be

$SUBSET = array('A', 'B', 'C', 'D', 'AB', 'AC', 'AD', 'BC', 'BD', 'CD', 'ABC', 'ABD', 'ACD', 'BCD', 'ABCD');

The code shouldn't be anything big, but for some reason I can't get my head around it.

Comments (5)

纵性 2024-11-15 17:34:28

As of PHP 7.4, we can write a nice, short powerSet function using the spread operator:

// IN:  ["A", "B", "C"]
// OUT: [[],["A"],["B"],["A","B"],["C"],["A","C"],["B","C"],["A","B","C"]]
function powerSet(array $array) : array {
    // add the empty set
    $results = [[]];

    foreach ($array as $element) {
        foreach ($results as $combination) {
            $results[] = [...$combination, $element];
        }
    }

    return $results;
}
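
To get from nested arrays to the joined strings shown in the question ('AB', 'ACD', ...), the same doubling loop can be followed by a join and a sort. The following is only a sketch: subsetStrings() is a hypothetical helper name, and sorting by length and then alphabetically is an assumption about the order the asker wants.

```php
<?php
// Hypothetical helper (not part of the answer above): returns every
// non-empty subset of $attributes as a joined string.
function subsetStrings(array $attributes): array {
    // Same doubling technique as powerSet(): start with the empty set.
    $results = [[]];
    foreach ($attributes as $element) {
        foreach ($results as $combination) {
            $results[] = [...$combination, $element];
        }
    }
    array_shift($results); // drop the empty set

    // Join each subset into a string such as "AB".
    $strings = array_map(fn(array $s): string => implode('', $s), $results);

    // Assumed ordering: shorter subsets first, then alphabetical.
    usort($strings, fn(string $a, string $b): int =>
        strlen($a) <=> strlen($b) ?: strcmp($a, $b));

    return $strings;
}

echo implode(', ', subsetStrings(['A', 'B', 'C', 'D'])), "\n";
```

For four attributes this yields 15 strings, one per non-empty subset.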
羞稚 2024-11-15 17:34:28

Do you want the power set of $attributes? That is what your question implies.

An example can be found here (quoted for completeness):

<?php 
/** 
* Returns the power set of a one-dimensional array as a 2-D array.
* The empty set is left out unless $minLength is 0.
* [a,b,c] -> [ [c], [b], [b, c], [a], [a, c], [a, b], [a, b, c] ]
*/ 
function powerSet($in,$minLength = 1) { 
   $count = count($in); 
   $members = pow(2,$count); 
   $return = array(); 
   for ($i = 0; $i < $members; $i++) { 
      $b = sprintf("%0".$count."b",$i); 
      $out = array(); 
      for ($j = 0; $j < $count; $j++) { 
         if ($b[$j] == '1') $out[] = $in[$j]; 
      } 
      if (count($out) >= $minLength) { 
         $return[] = $out; 
      } 
   } 
   return $return; 
} 
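
The loop above works because each integer from 0 to 2^count - 1, written as a zero-padded binary string, marks which elements belong to one subset. A minimal sketch of that single step follows; pickByMask() is a hypothetical name, and it assumes a 0-indexed list and a mask smaller than 2^count($in).

```php
<?php
// Hypothetical helper illustrating one iteration of the loop above.
// Assumes $in is a 0-indexed list and 0 <= $mask < 2 ** count($in).
function pickByMask(array $in, int $mask): array {
    $count = count($in);
    // Zero-padded binary string, e.g. mask 5 with 3 elements gives "101".
    $bits = sprintf('%0' . $count . 'b', $mask);
    $out = [];
    for ($j = 0; $j < $count; $j++) {
        // The character at position $j decides whether $in[$j] is included.
        if ($bits[$j] === '1') {
            $out[] = $in[$j];
        }
    }
    return $out;
}

echo implode(',', pickByMask(['a', 'b', 'c'], 5)), "\n"; // "101" selects a and c
```

Because the string is read from the left, the first element corresponds to the most significant bit.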
巴黎夜雨 2024-11-15 17:34:28

Here is a backtracking solution.

Given a function that returns all the L-length subsets of the input set, find all the L-length subsets for L = 2 up to the length of the input set.

<?php

function subsets($S,$L) {
    $result = [];
    // Backtracking: extend $subset with elements at index >= $start,
    // and record it once it holds exactly $L elements.
    $extend = function ($start, $subset) use (&$extend, &$result, $S, $L) {
        if (count($subset) == $L) {
            $result[] = json_encode($subset);
            return;
        }
        for ($i = $start; $i < count($S); $i++) {
            $extend($i + 1, array_merge($subset, [$S[$i]]));
        }
    };
    $extend(0, []);
    return $result;
}



$S = [ 'A', 'B', 'C', 'D'];
$L = 2;


// L = 1 -> no need to do anything
print_r($S);

for ($i = 2; $i <= count($S); $i++)
    print_r(subsets($S,$i));
听不够的曲调 2024-11-15 17:34:28

Based on @Yada's answer, this generates the power set of an array but preserves the original array's keys in each subset (the return value itself is still numerically and sequentially indexed). This is very useful if you need subsets of an associative array.

The subsets also retain the element order of the original array. I added a stable sort to $results because I needed it, but you can omit it.

function power_set($array) {
    $results = [[]];
    foreach ($array as $key => $value) {
        foreach ($results as $combination) {
            $results[] = $combination + [$key => $value];
        }
    }

    # array_shift($results); # uncomment if you don't want the empty set in your results
    $order = array_map('count', $results);
    uksort($results, function($key_a, $key_b) use ($order) {
        $comp = $order[$key_a] - $order[$key_b]; # change only this to $order[$key_b] - $order[$key_a] for descending size
        if ($comp == 0) {
            $comp = $key_a - $key_b;
        }
        return $comp;
    });
    return array_values($results);
}

Given the OP's input, var_dump(power_set(['A', 'B', 'C', 'D'])); outputs:

array(16) {
  [0] =>
  array(0) {
  }
  [1] =>
  array(1) {
    [0] =>
    string(1) "A"
  }
  [2] =>
  array(1) {
    [1] =>
    string(1) "B"
  }
  [3] =>
  array(1) {
    [2] =>
    string(1) "C"
  }
  [4] =>
  array(1) {
    [3] =>
    string(1) "D"
  }
  [5] =>
  array(2) {
    [0] =>
    string(1) "A"
    [1] =>
    string(1) "B"
  }
  [6] =>
  array(2) {
    [0] =>
    string(1) "A"
    [2] =>
    string(1) "C"
  }
  [7] =>
  array(2) {
    [1] =>
    string(1) "B"
    [2] =>
    string(1) "C"
  }
  [8] =>
  array(2) {
    [0] =>
    string(1) "A"
    [3] =>
    string(1) "D"
  }
  [9] =>
  array(2) {
    [1] =>
    string(1) "B"
    [3] =>
    string(1) "D"
  }
  [10] =>
  array(2) {
    [2] =>
    string(1) "C"
    [3] =>
    string(1) "D"
  }
  [11] =>
  array(3) {
    [0] =>
    string(1) "A"
    [1] =>
    string(1) "B"
    [2] =>
    string(1) "C"
  }
  [12] =>
  array(3) {
    [0] =>
    string(1) "A"
    [1] =>
    string(1) "B"
    [3] =>
    string(1) "D"
  }
  [13] =>
  array(3) {
    [0] =>
    string(1) "A"
    [2] =>
    string(1) "C"
    [3] =>
    string(1) "D"
  }
  [14] =>
  array(3) {
    [1] =>
    string(1) "B"
    [2] =>
    string(1) "C"
    [3] =>
    string(1) "D"
  }
  [15] =>
  array(4) {
    [0] =>
    string(1) "A"
    [1] =>
    string(1) "B"
    [2] =>
    string(1) "C"
    [3] =>
    string(1) "D"
  }
}
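
For the associative-array case mentioned above, here is a small sketch of the key-preserving step on its own, without the sorting pass. keyedPowerSet() is a hypothetical name and the sample row is made up for illustration.

```php
<?php
// Hypothetical helper: the key-preserving doubling step from the answer
// above, without the sort.
function keyedPowerSet(array $array): array {
    $results = [[]];
    foreach ($array as $key => $value) {
        foreach ($results as $combination) {
            // Array union keeps the existing keys and adds the new pair.
            $results[] = $combination + [$key => $value];
        }
    }
    return $results;
}

$row = ['id' => 7, 'name' => 'widget']; // made-up sample data
print_r(keyedPowerSet($row));
```

Each subset keeps its string keys, so a subset such as ['name' => 'widget'] can still be read by key.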
尤怨 2024-11-15 17:34:28

Following @fbstj's answer, I updated the function:

function powerSet(array $in, int $minLength = 0): array
{
    $return = [];
    
    if ($minLength === 0) {
        $return[] = [];
    }

    for ($i = 1 << count($in); --$i;) {
        $out = [];

        foreach ($in as $j => $u) {
            if ($i >> $j & 1) {
                $out[] = $u;
            }
        }

        if (count($out) >= $minLength) {
            $return[] = $out;
        }
    }
    
    return $return;
}

Since building the power set can use a lot of memory (there are 2^count($in) subsets), consider using a Generator:

function powerSet(array $in, int $minLength = 0): \Generator
{
    if ($minLength === 0) {
        yield [];
    }

    for ($i = 1 << count($in); --$i;) {
        $out = [];

        foreach ($in as $j => $u) {
            if ($i >> $j & 1) {
                $out[] = $u;
            }
        }

        if (count($out) >= $minLength) {
            yield $out;
        }
    }
}

Usage:

foreach (powerSet(range(1, 10)) as $value) {
    echo implode(', ', $value) . "\n";
}
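
To tie this back to the original question, a generator like the one above can feed an attribute-closure routine one subset at a time. This is a rough sketch under stated assumptions: attributeClosure() and nonEmptySubsets() are hypothetical names, the dependencies A -> B and B -> C are invented for illustration (the question does not list its own), and each dependency is represented as a [left side, right side] pair of arrays.

```php
<?php
// Hypothetical helper: closure of $attrs under $fds, where each dependency
// is a pair [leftSide, rightSide] of attribute arrays.
function attributeClosure(array $attrs, array $fds): array {
    $closure = $attrs;
    do {
        $changed = false;
        foreach ($fds as [$lhs, $rhs]) {
            $missing = array_diff($rhs, $closure);
            // Apply the dependency when its whole left side is already in
            // the closure and it would add at least one new attribute.
            if (!array_diff($lhs, $closure) && $missing) {
                $closure = array_merge($closure, $missing);
                $changed = true;
            }
        }
    } while ($changed);
    sort($closure);
    return $closure;
}

// Hypothetical generator: yields every non-empty subset of a 0-indexed list.
function nonEmptySubsets(array $in): \Generator {
    $total = 1 << count($in);
    for ($mask = 1; $mask < $total; $mask++) {
        $out = [];
        foreach ($in as $j => $u) {
            if ($mask >> $j & 1) {
                $out[] = $u;
            }
        }
        yield $out;
    }
}

$fds = [[['A'], ['B']], [['B'], ['C']]]; // assumed example: A -> B, B -> C
foreach (nonEmptySubsets(['A', 'B', 'C', 'D']) as $subset) {
    echo implode('', $subset), '+ = ',
        implode('', attributeClosure($subset, $fds)), "\n";
}
```

Only one subset is held in memory at a time, which is the point of the generator version.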