What is the best way to implement weighted random selection based on two types of variables (in PHP)?

Posted 2024-07-24 14:24:17

Basically my dilemma is this. I have a list of x servers that host files. There is another server that hosts the site's MySQL db and application. When a file is uploaded (to the frontend server), the application checks to see which server has the most free space on it and moves the file there. This works fine if you started with 2+ empty servers with identical amounts of free space. If you introduce another server into the mix at a later point, it will have more free space than the current servers, and this method isn't so effective: all the new files will be uploaded exclusively to the new server, which would overload it, since it will be handling most of the new traffic till it catches up with the rest of the boxes in terms of free space.

So I thought to introduce a weighting system as well, which would help normalize the distribution of files. So if 3 servers are set at 33% each and one of them has significantly more free space, it would still get more uploads than the others (even though it has the same weight), but the load would be spread out over all the servers.

Can anyone suggest a good php-only implementation of this?
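For illustration, here is a rough sketch of the sort of thing I have in mind, where each server's chance of being picked is its configured weight multiplied by its current free space (the function name and data layout are placeholders, not working code from my app):

    // Rough sketch only: pick a server with probability proportional to
    // (configured weight) * (current free space), so a new, emptier box
    // gets more uploads without receiving all of them.
    function pickUploadServer(array $servers)
    {
        $scores = array();
        foreach ($servers as $name => $s) {
            $scores[$name] = $s['weight'] * $s['free'];
        }
        // Draw a point in [0, sum of scores] and walk the cumulative ranges.
        $rand = mt_rand() / mt_getrandmax() * array_sum($scores);
        foreach ($scores as $name => $score) {
            if (($rand -= $score) <= 0) {
                return $name;
            }
        }
        return $name; // floating-point safety net
    }

    $servers = array(
        'a' => array('weight' => 33, 'free' => 2048),
        'b' => array('weight' => 33, 'free' => 51400),
        'c' => array('weight' => 33, 'free' => 140555),
    );
    echo pickUploadServer($servers); // usually 'c', but not always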

Comments (4)

jJeQQOZ5 2024-07-31 14:24:17

One approach would be to sum all available space on all of the servers that have the space to hold the file (so a server with available space but not enough to hold the file would obviously be excluded). Then determine the percentage of that space that each server accounts for (so a new server would account for a proportionally larger percentage). Use a random number and align it with the percentages to determine which server to pick.

For instance, consider having four servers with the following free space levels:

Server 1:   2048MB
Server 2:  51400MB
Server 3:   1134MB
Server 4: 140555MB

You need to store a 1500MB file. That knocks Server 3 out of the running, leaving us with 194003MB total free space.

Server 1:  1.0%
Server 2: 26.5%
Server 4: 72.5%

You then choose a random number between 0 and 100: 40

Numbers between 0 and 1 (inclusive) would go to Server 1
Numbers > 1 and <= 26.5 would go to Server 2
Numbers > 26.5 and <= 100 would go to Server 4

So in this case, 40 means it gets stored on Server 4.
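A minimal PHP sketch of this approach could look like the following; the function name and input format are illustrative rather than part of the answer:

    // Pick a server with probability proportional to its free space.
    // $freeSpace maps server id => free MB; $fileSize is the upload size in MB.
    function pickByFreeSpace(array $freeSpace, $fileSize)
    {
        // Exclude servers that cannot hold the file (Server 3 above).
        $candidates = array_filter($freeSpace, function ($free) use ($fileSize) {
            return $free >= $fileSize;
        });
        if (empty($candidates)) {
            return null; // nowhere to put the file
        }
        // Equivalent to picking a number between 0 and 100 and walking
        // the cumulative percentage ranges.
        $rand = mt_rand() / mt_getrandmax() * array_sum($candidates);
        foreach ($candidates as $server => $free) {
            if (($rand -= $free) <= 0) {
                return $server;
            }
        }
        return $server; // floating-point safety net
    }

    $free = array(1 => 2048, 2 => 51400, 3 => 1134, 4 => 140555);
    echo pickByFreeSpace($free, 1500); // prints 4 roughly 72.5% of the time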

九命猫 2024-07-31 14:24:17

Traffic balancing is usually crucial. You can add some sort of weighting system to balance it (although, as you say, the new server will still end up more loaded than the others), or use some other alternating method where one server never gets hit twice in a row, just as an example.

But I think I would probably artificially balance out the servers' data by moving content from one to the other until they're almost equal, and then let the original or weighted/alternating algorithm do its job normally.

That's not a php-only implementation, but just some ideas to consider.
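As a toy sketch of the "never twice in a row" idea (all names hypothetical):

    // Toy sketch of the alternating idea: remember the last server used
    // and exclude it from the next pick.
    function pickAvoidingRepeat(array $servers, $last)
    {
        if (count($servers) > 1) {
            $servers = array_diff($servers, array($last));
        }
        return $servers[array_rand($servers)];
    }

    echo pickAvoidingRepeat(array('s1', 's2', 's3'), 's3'); // never 's3'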

π浅易 2024-07-31 14:24:17

A way to implement it is the following:

  1. Create an array of all the empty space, as a fraction, in your case { 0.5, 0.5, 1.0 }
  2. Create a second array of weights - the amount of space in the server divided by the total amount of space, as it is represented in the first array - { 0.25, 0.25, 0.5 }
  3. Get a random number, normalised to [0.0, 1.0], by calling 1.0 * mt_rand() / mt_getrandmax()
  4. Run the following loop:

    $total_weight = 0.0;
    for ($i = 0; $i < count($weights); $i++) {
        $total_weight += $weights[$i];
        if ($rand <= $total_weight) {
            return $i; // index of the chosen server
        }
    }

The returned value is the index of the server.
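Putting the four steps together, a self-contained helper might look like this (the function name is illustrative):

    // Illustrative combination of steps 1-4: given each server's free
    // space, return the index of a server chosen with probability equal
    // to its share of the total free space.
    function weightedPick(array $freeSpace)
    {
        $total = array_sum($freeSpace);                // steps 1-2
        $rand = mt_rand() / mt_getrandmax();           // step 3
        $total_weight = 0.0;
        foreach (array_values($freeSpace) as $i => $free) {
            $total_weight += $free / $total;           // step 4
            if ($rand <= $total_weight) {
                return $i;
            }
        }
        return count($freeSpace) - 1; // guard against rounding drift
    }

    echo weightedPick(array(100, 100, 200)); // prints 2 about half the time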

╰◇生如夏花灿烂 2024-07-31 14:24:17

You've entered the world of distributed filesystems -- a problem space larger than you likely anticipated.

A lot of work/research has been done in this field. You should consider using an available solution like MogileFS, or, at the very least, doing some research on how they solved the problems you've encountered (as well as the problems you haven't encountered yet).

For an example of what I mean by "problems you haven't encountered yet": shouldn't you actually be storing at least 2 copies of every file, so that if you lose one server, you haven't lost all the files on it? Of course, once you start doing that, shouldn't you be able to read parts of a single file from multiple servers simultaneously, for a performance gain? And of course, now you have to figure out how files are distributed, how they get redistributed when a server fails, when a new server comes online, etc.

Doing this right is complicated. Don't reinvent the wheel if you can avoid it. And if you have to reinvent the wheel, at least spend some time looking at how other people built theirs.
