有效地从链式哈希表中选取随机元素?

发布于 2024-12-22 19:48:23 字数 367 浏览 0 评论 0原文

只是为了练习(而不是作为家庭作业)我一直在尝试解决这个问题(CLRS,第 3 版,练习 11.2-6):

假设我们在大小为 m 的哈希表中存储了 n 个键,其中 通过链接解决冲突,并且我们知道每个的长度 链,包括最长链的长度L。描述一个 从密钥中均匀随机选择一个密钥的过程 放入哈希表中,并在预计时间 O(L * (1 + m/n)) 内返回。

到目前为止我的想法是每个键被返回的概率是 1/n。如果我们尝试获取 1 到 n 之间的随机值 x,并尝试首先按存储桶排序,然后沿着存储桶中的链查找序列中的第 x 个键,则需要 O(m) 才能找到正确的存储桶:一个接一个地遍历桶并在 O(L) 时间内获得链中正确的密钥。

Just for practice (and not as a homework assignment) I have been trying to solve this problem (CLRS, 3rd edition, exercise 11.2-6):

Suppose we have stored n keys in a hash table of size m, with
collisions resolved by chaining, and that we know the length of each
chain, including the length L of the longest chain. Describe a
procedure that selects a key uniformly at random from among the keys
in the hash table and returns it in expected time O(L * (1 + m/n)).

What I thought so far is that the probability of each key being returned is 1/n. If we try to get a random value x between 1 to n, and try to find the xth key in sequence first sorted by bucket then along the chain in the bucket, then it will take O(m) to find the right bucket by going through buckets one by one and O(L) time to get the right key in chain.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

不奢求什么 2024-12-29 19:48:23

重复以下步骤,直到返回一个元素:

  • 随机选择一个桶。令k 为桶中元素的数量。
  • 1 ... L中均匀随机选择p。如果p <= k,则返回存储桶中的第p元素。

应该清楚的是,该过程均匀地随机返回一个元素。我们正在对二维数组中的元素应用拒绝采样。

每个存储桶的预期元素数量为n / m。采样尝试成功的概率为(n / m) / L。因此,找到存储桶所需的预期尝试次数为 L * m / n。加上从存储桶中检索元素的 O(L) 成本,预期运行时间为 O(L * (1 + m / n))

Repeat the following steps until an element is returned:

  • Randomly select a bucket. Let k be the number of elements in the bucket.
  • Select p uniformly at random from 1 ... L. If p <= k then return the pth element in the bucket.

It should be clear that the procedure returns an element uniformly at random. We are sort of applying rejection sampling to the elements placed in a 2D array.

The expected number of elements per bucket is n / m. The probability that the sampling attempt succeeds is (n / m) / L. The expected number of attempts needed to find a bucket is therefore L * m / n. Together with the O(L) cost of retrieving the element from the bucket, the expected running time is O(L * (1 + m / n)).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文