Numeric range optimization

Posted on 2024-10-30 05:40:51

I have a set of numeric ranges that I would like to optimize.

Here's a simple example of initial values:

Start    End
9        12
1        2
60       88
10       11
79       80

What I'd expect as output after optimization:

Start    End
1        2
9        12
60       88

These are the left and right values from Modified Preorder Tree Traversal (Nested Set) data stored in a MySQL database. I use them to exclude inactive branches from the result, and am not currently optimizing the ranges at all. I thought I might get a performance gain from optimizing the ranges before use.


MORE INFO

The values are passed into a query for exclusion of the inactive branches in the tree using a NOT BETWEEN clause. I thought that I could optimize the performance of that query by using a minimal set of ranges.
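
For illustration, here is a minimal sketch of how a minimal set of ranges might be turned into the NOT BETWEEN conditions described above. The column name lft is an assumption, since the question does not show the actual schema or query.

```php
<?php
// Hypothetical column name; the question does not say what the column is called.
$column = 'lft';

// Minimal set of inactive ranges (the expected output from the question).
$ranges = array(array(1, 2), array(9, 12), array(60, 88));

$clauses = array();
foreach ($ranges as $range) {
    // %d forces integers, so only numeric values end up in the SQL text.
    $clauses[] = sprintf('%s NOT BETWEEN %d AND %d', $column, $range[0], $range[1]);
}

// One condition per range, all of which must hold for a row to be kept.
$where = implode(' AND ', $clauses);
echo $where, "\n";
```

With fewer ranges the WHERE clause carries fewer conditions. Whether that yields a measurable speedup would have to be checked against the real data.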


Comments (4)

ぃ弥猫深巷。 2024-11-06 05:40:51

Here is an SQL query that will return what you want:

mysql> CREATE TABLE sample (Start INT, End INT);

mysql> INSERT sample VALUES (9,12),(1,2),(60,88),(10,11),(79,80);

mysql> SELECT * 
    -> FROM sample s 
    -> WHERE NOT EXISTS (SELECT 1 
    ->                   FROM sample 
    ->                   WHERE s.Start > Start AND s.Start < End);
+-------+------+
| Start | End  |
+-------+------+
|     9 |   12 |
|     1 |    2 |
|    60 |   88 |
+-------+------+

You can, of course, create a VIEW, move the data to another table, or delete rows using the above SQL.

NOTE: I am not really sure why you are doing this 'optimization'.

EDIT:
The query can be rewritten as

SELECT s.* 
FROM sample s LEFT JOIN 
     sample s2 ON s.Start > s2.Start AND s.Start < s2.End 
WHERE s2.start IS NULL;

This will produce a different execution plan (two SIMPLE selects versus a PRIMARY/DEPENDENT SUBQUERY pair for the EXISTS version), so performance might differ. Both queries will use an index on (Start, End) if one exists.

乞讨 2024-11-06 05:40:51

Put all the range endpoints in a sorted list. Mark which elements in the list represent range starts and which represent range ends. Sort the list by value first; where values are equal, make sure that range starts come before range ends. (This will probably involve a structure of some sort that can be sorted on a given key. I don't know the details in PHP.)

Now, traverse the list from start to end. Keep a counter, c. When you pass a range start, increment c. When you pass a range end, decrement c.

When c goes from 0 to 1, that's the start of a range in the final set. When c goes from 1 to 0, that's the end of a range.

EDIT: If you already have the ranges in a database table somewhere, you can probably structure an SQL query to do the first step above (again, making sure that range start-points are returned before range end-points).
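
The counter-based traversal described in this answer can be sketched in PHP roughly as follows. The function name mergeRangesBySweep and the array(value, type) event format are assumptions made for this sketch, not part of the original answer.

```php
<?php
/**
 * Merge overlapping ranges using a sorted list of endpoints and a counter.
 * mergeRangesBySweep is a hypothetical name used only for this sketch.
 */
function mergeRangesBySweep(array $ranges)
{
    // One event per endpoint: array(value, type), where type 0 = start, 1 = end.
    $events = array();
    foreach ($ranges as $range) {
        $events[] = array($range[0], 0);
        $events[] = array($range[1], 1);
    }

    // Sort by value; on ties a start (0) sorts before an end (1).
    usort($events, function ($a, $b) {
        if ($a[0] !== $b[0]) {
            return $a[0] < $b[0] ? -1 : 1;
        }
        return $a[1] - $b[1];
    });

    $merged = array();
    $c = 0;         // number of ranges currently open
    $start = null;  // start of the merged range being built
    foreach ($events as $event) {
        if ($event[1] === 0) {
            if ($c === 0) {
                $start = $event[0];  // c goes from 0 to 1: a merged range begins
            }
            $c++;
        } else {
            $c--;
            if ($c === 0) {
                $merged[] = array($start, $event[0]);  // c goes from 1 to 0: it ends
            }
        }
    }
    return $merged;
}

print_r(mergeRangesBySweep(array(
    array(9, 12), array(1, 2), array(60, 88), array(10, 11), array(79, 80),
)));
```

For the sample data in the question this yields the ranges 1-2, 9-12 and 60-88. Ranges that merely touch at consecutive integers (such as 1-2 and 3-4) are left separate in this sketch, because only overlapping ranges keep the counter above zero.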

黑寡妇 2024-11-06 05:40:51

Here's a simple implementation:

// I picked this format because it's convenient for the solution
// and because it's very natural for a human to read/write
$ranges = array(
  9    =>    12,
  1    =>    2,
  60   =>    81,
  10   =>    11,
  79   =>    88);

ksort($ranges);
$count = count($ranges);
$prev = null; // holds the previous start-end pair

foreach($ranges as $start => $end) {
    // If this range overlaps or is adjacent to the previous one
    if ($prev !== null && $start <= $prev[1] + 1) {
        // Update the previous one (both in $prev and in $ranges)
        // to the union of its previous value and the current range
        $ranges[$prev[0]] = $prev[1] = max($end, $prev[1]);

        // Mark the current range as "deleted"
        $ranges[$start] = null;
        continue;
    }

    $prev = array($start, $end);
}

// Filter all "deleted" ranges out
$ranges = array_filter($ranges);

Limitations/notes:

  1. The range boundaries have to be small enough to fit into an int.
  2. This example will incorrectly remove any range whose ending boundary is 0 from the final result, because array_filter without a callback drops every falsy value. If your data can legitimately contain such a range, provide an appropriate callback to array_filter: function($item) { return $item !== null; }.
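
A filter that drops only the entries marked as deleted, while keeping a range that ends at 0, might look like the sketch below. The sample array is made up for illustration. Note that array_filter keeps the items for which the callback returns true, so the comparison has to be !== null.

```php
<?php
// Ranges keyed by start, after the merge loop has marked one entry as deleted.
// Illustrative data only: the 0 => 0 entry is a range whose ending boundary is 0.
$ranges = array(0 => 0, 5 => 9, 7 => null);

// Keep every entry that was not explicitly set to null.
$ranges = array_filter($ranges, function ($item) {
    return $item !== null;
});

print_r($ranges);
```

Here the 0 => 0 entry survives and only the null entry is removed. array_filter preserves the original keys.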


风向决定发型 2024-11-06 05:40:51
$ranges = array(
  array(9, 12),
  array(1, 2),
  array(60, 81),
  array(10, 11),
  array(79, 88),
  );

function optimizeRangeArray($r) {
  $flagarr = array();
  foreach ($r as $range => $bounds) {
    $flagarr = array_pad($flagarr, $bounds[1], false);
    for ($i = $bounds[0]-1; $i < $bounds[1]; $i++) $flagarr[$i] = true;
    }
  $res = array(); $min = 0; $max = 0; $laststate = false;
  $ctr = 0;
  foreach ($flagarr as $state) {
    if ($state != $laststate) {
      if ($state) $min = $ctr + 1;
      else {
        $max = $ctr;
        $res[] = array($min, $max);
        }
      $laststate = $state;
      }
    $ctr++;
    }
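  // Close the final range only if one is still open (avoids a bogus 0-0 range for empty input)
  if ($laststate) {
    $max = $ctr;
    $res[] = array($min, $max);
    }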
  return($res);
  }

print_r(optimizeRangeArray($ranges));

Output:

Array
(
    [0] => Array
        (
            [0] => 1
            [1] => 2
        )

    [1] => Array
        (
            [0] => 9
            [1] => 12
        )

    [2] => Array
        (
            [0] => 60
            [1] => 88
        )

)

Note: This doesn't work for negative integers!

Or use it like this:

$rres = optimizeRangeArray($ranges);

$out = "<pre>Start    End<br />";
foreach($rres as $range=>$bounds) {
  $out .= str_pad($bounds[0], 9, ' ') . str_pad($bounds[1], 9, ' ') . "<br />";
  }
$out .= "</pre>";
echo $out;

to get this in your browser:

Start    End
1        2
9        12
60       88