Sorting an array using a probability distribution

Posted on 2024-11-03 21:20:00

An array should be sorted from high to low by its values.

<?php
$items = array(
  1 => f(1),
  2 => f(2),
  3 => f(3),
  4 => f(4),
  5 => f(5),
);
?>

After sorting, I check which of the items 1, 2, 3, 4, 5 comes first. I repeat this again and again.
Afterwards

  • 5 should be the first item five times as often as 1
  • 4 should be the first item four times as often as 1
  • 3 should be the first item three times as often as 1
  • 4 should be the first item twice as often as 2
  • ...

One idea is

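<?php
function f($key) {
  // random() stands for a uniform random number. Here it is drawn with
  // mt_rand() starting at 1, which avoids a division by zero and ranks
  // the keys the same way as a draw scaled to (0, 1].
  return $key / mt_rand(1, mt_getrandmax());
}
?>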

which, for 1'000'000 tries, resulted in

key | times on top | ratio to key 1 | expected ratio
----+--------------+----------------+---------------
 5  |      374'365 | 6.75           | 5
 4  |      267'863 | 4.83           | 4
 3  |      185'707 | 3.35           | 3
 2  |      116'618 | 2.10           | 2
 1  |       55'447 | 1              | 1

Looks weird to me, but maybe

  • there is a simple problem with f?
  • there is a better f?


My implementation:

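<?php

abstract class Test {

  // Maps each key to the number of iterations in which it came out on top.
  private $result;

  // Returns the random sort value for key $x.
  protected abstract function f($x);

  // Runs one trial: compute a value per key, sort, and count the top key.
  protected function iteration() {
    $values = array(
      1 => $this->f(1),
      2 => $this->f(2),
      3 => $this->f(3),
      4 => $this->f(4),
      5 => $this->f(5),
    );

    // Sort high to low, preserving keys.
    arsort($values);

    // After arsort() the internal pointer is at the first (largest) element.
    $top = key($values);

    if (!isset($this->result[$top])) {
      $this->result[$top] = 1;
    } else {
      $this->result[$top]++;
    }
  }

  // Runs $iterations trials and returns the counts, highest first.
  public function run($iterations) {
    $this->result = array();
    for($i = 0; $i < $iterations; $i++) {
      $this->iteration();
    }
    arsort($this->result);
    return $this->result;
  }
}

class MyTest extends Test {
  protected function f($x) {
    // Start the range at 1 so that a draw of 0 cannot cause a division by zero.
    return $x / rand(1, getrandmax());
  }
}

$test = new MyTest();
$result = $test->run(1000 * 1000);
print_r($result);
printf("Ratio of key 5 to 1, which should be 5: %f\n", $result[5] / $result[1]);

?>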

I have tried a billion rounds, but the ratio is again 6.75. The whole point is: why isn't it 5?


The results for

<?php
class BetterRandomGeneratorTest extends Test {
  protected function f($x) {
    // Start the range at 1 so that a draw of 0 cannot cause a division by zero.
    return $x / mt_rand(1, mt_getrandmax());
  }
}
?>

are

Array
(
  [5] => 3742816
  [4] => 2674352
  [3] => 1861444
  [2] => 1168333
  [1] => 553055
)
Ratio of key 5 to 1: 6.767529


Comments (1)

情绪少女 2024-11-10 21:20:00

Here is a simple f which will do it.

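function f($key) {
  // Return the largest of $key uniform random draws.
  $x = 0;
  for ($i = 0; $i < $key; $i++) {
    // random() in the original pseudocode: a uniform draw in [0, 1].
    $y = mt_rand() / mt_getrandmax();
    if ($x < $y) {
      $x = $y;
    }
  }
  return $x;
}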

This is guaranteed to work because the maximum is equally likely to be any of the 15 random numbers drawn (1 + 2 + 3 + 4 + 5 across the five calls). Five of those draws belong to f(5), so 1/3 of the time the maximum will be in f(5), versus 1/15 of the time for f(1).
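One way to check this claim empirically is a small simulation. The sketch below is not part of the original answer: countTops, uniform, maxOfDraws and powerOfDraw are names made up for illustration, and random() is assumed to mean a uniform draw in [0, 1]. It also tries pow(random(), 1 / key), which should follow the same distribution as the maximum of key uniform draws (both have the cumulative distribution x^key) and would extend to non-integer weights.

```php
<?php
// Count how often each key 1..5 receives the largest value.
// $f is any callable mapping a key to a random sort value.
function countTops(callable $f, int $rounds): array {
    $tops = array_fill(1, 5, 0);
    for ($i = 0; $i < $rounds; $i++) {
        $bestKey = 1;
        $bestValue = $f(1);
        for ($key = 2; $key <= 5; $key++) {
            $value = $f($key);
            if ($value > $bestValue) {
                $bestKey = $key;
                $bestValue = $value;
            }
        }
        $tops[$bestKey]++;
    }
    return $tops;
}

// Uniform float in [0, 1]; stands in for the pseudocode random() (assumption).
function uniform(): float {
    return mt_rand() / mt_getrandmax();
}

// The f from the answer: the largest of $key uniform draws.
function maxOfDraws(int $key): float {
    $x = 0.0;
    for ($i = 0; $i < $key; $i++) {
        $x = max($x, uniform());
    }
    return $x;
}

// Assumed equivalent form: a single draw raised to the power 1 / $key.
function powerOfDraw(int $key): float {
    return pow(uniform(), 1 / $key);
}

mt_srand(12345); // fixed seed so that runs are repeatable
$rounds = 200000;
$byMax = countTops('maxOfDraws', $rounds);
$byPow = countTops('powerOfDraw', $rounds);
printf("max of draws:  5-to-1 ratio %.2f\n", $byMax[5] / $byMax[1]);
printf("power of draw: 5-to-1 ratio %.2f\n", $byPow[5] / $byPow[1]);
```

Both ratios should land close to 5, up to sampling noise; more rounds tighten the estimate.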

As for what was wrong with your f, it is quite simple. With your solution, f(1) < f(5) holds exactly 90% of the time: f(1) > f(5) requires the random() drawn for f(5) to be more than five times the one drawn for f(1), which happens with probability 1/10. However, f(1) tends to be bigger than f(5) when f(1) is larger than average and f(5) is smaller than average. Ditto for f(2), f(3) and f(4). However, it is unusual for all of f(2), ..., f(5) to be small at once. This causes correlations that make f(1) the largest less often than you would naively think. Conversely, the correlations tend to come out in favor of f(5) more often than you would naively think.

If you want to compute the exact probability of each number coming out on top, it shouldn't be too hard to compute exact answers with integration. The idea is that you integrate, for x from 0 to 1, the probability that f(i) is the maximum given that x was the value of random() for f(i). For instance, f(5) = 5/x beats f(j) = j/random() exactly when that random() is larger than jx/5, so for 5 you would integrate (1-x/5)(1-2x/5)(1-3x/5)(1-4x/5), while for 1 you would integrate a function that is 0 if x is bigger than 0.2, and otherwise is (1-2x)(1-3x)(1-4x)(1-5x). The expressions will be complicated, and the ratios won't come out to nice answers.
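The integration described above can be carried out mechanically, since each integrand is a polynomial. The following is a minimal sketch, not the answerer's code: timesLinear, integrate and topProbability are hypothetical helper names, and random() is assumed to be uniform on (0, 1). For key k, the integrand is the product of (1 - jx/k) over all other keys j, integrated from 0 to k/5, beyond which the factor for j = 5 would be negative and the probability is 0.

```php
<?php
// Multiply polynomial $p (coefficients, lowest degree first) by (1 - $c * x).
function timesLinear(array $p, float $c): array {
    $out = array_fill(0, count($p) + 1, 0.0);
    foreach ($p as $degree => $coef) {
        $out[$degree] += $coef;
        $out[$degree + 1] -= $c * $coef;
    }
    return $out;
}

// Integrate the polynomial $p from 0 to $upper, term by term.
function integrate(array $p, float $upper): float {
    $sum = 0.0;
    foreach ($p as $degree => $coef) {
        $sum += $coef * pow($upper, $degree + 1) / ($degree + 1);
    }
    return $sum;
}

// Probability that $key / random() is the largest among keys 1..$n.
function topProbability(int $key, int $n = 5): float {
    $poly = array(1.0);
    for ($j = 1; $j <= $n; $j++) {
        if ($j !== $key) {
            // Key $j loses when its own draw exceeds $j * x / $key.
            $poly = timesLinear($poly, $j / $key);
        }
    }
    // Past x = $key / $n the chance of being on top is 0 (or x leaves [0, 1]).
    return integrate($poly, $key / $n);
}

$p = array();
for ($key = 1; $key <= 5; $key++) {
    $p[$key] = topProbability($key);
    printf("key %d: %.4f\n", $key, $p[$key]);
}
printf("ratio 5 to 1: %.3f\n", $p[5] / $p[1]);
```

Under these assumptions, key 5 comes out on top with probability of about 0.374 and key 1 with about 0.055, a ratio of roughly 6.76, which agrees with the counts measured in the question.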
