具体的数据结构
嗯,这个问题有点具体,但我认为其中有一些一般性的想法,我无法理解。
假设我有 K 个服务器(这是一个我知道其大小的常数)。我有一个获取请求的程序,每个请求都有一个 id 和服务器 id 来处理它。我有 n 个请求 - 大小未知,可以是任意数量。
我需要一个数据结构来支持给定复杂性内的下一个操作:
GetServer - 该函数获取请求 ID 并返回在当前情况下应该处理此请求的服务器 ID,而不一定是原始服务器(见下文)。
复杂度:平均O(log K)。
KillServer - 该函数获取应删除的服务器 ID 和已删除服务器的所有请求应传递到的另一个服务器 ID 作为输入。
复杂度:最坏情况下O(1)。
--
所有结构的位置复杂度为 O(K+n)
--
KillServer 函数让我想到使用并查找,因为我可以按照要求在 O(1) 中进行并集,但我的问题是第一次操作。为什么是LogK?实际上,无论我如何“保存”请求,如果我想访问任何请求(假设它是一个 AVL 树),那么在最坏的情况下复杂度将是 O(log n) 并且说我不能假设 K> ;n(可能还有 K
尝试思考了几个小时,但我找不到任何解决方案。 可以使用的已知结构有:B+树、AVL树、跳跃列表、哈希表、并查找、排名树,当然还有所有基础知识,如数组等。
Well, this question is a bit specific but I think there is some general idea in it that I can't get it.
Lets say I got K servers (which is a constant that I know its size). I have a program that get requests and every request has an id and server id that will handle it. I have n requests - unknown size and can be any number.
I need a data structure to support the next operations within the given complexity:
GetServer - the function gets the request ID and returns the server id that is supposed to handle this request at the current situation and not necessarily the original server (see below).
Complexity: O(log K) at average.
KillServer - the function gets as input a server id that should be removed and another server id that all the requests of the removed server should be passed to.
Complexity: O(1) at the worst case.
--
Place complexity for all the structure is O(K+n)
--
The KillServer function made me think using a Union-Find as I can do the union in O(1) as requested but my problem is the first operation. Why it's LogK? Actually, no matter how I "save" the requests if I want to access to any request (lets say it's an AVL tree) so the complexity will be O(log n) at the worst case and said that I can't assume K>n (and probably K
Tried thinking about it a couple of hours but I can't find any solution.
Known structures that can be used are: B+ tree, AVL tree, skip list, hash table, Union-Find, rank tree and of course all the basics like arrays and such.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
考虑一个由两部分组成的数据结构:
最初,作业ID 等于服务器ID。当服务器被移除时,服务器将处理多个作业。
换句话说,您只需添加一个小的间接
client ->;工作->服务器
,以便您的操作仅触及第一个或仅第二个箭头。Consider a data structure consisting of two parts:
Initially, the job IDs are equal to the server IDs. As servers are removed, servers will handle multiple jobs.
In other words, you just add a small indirection
client -> job -> server
so that your operations touch only the first or only the second arrow.从服务器分区的请求开始的联合查找应该可以工作。您从指向其服务器的每个请求开始,然后将服务器(而不是请求)联合在一起。因此,GetServer 的摊余成本将为 O(alpha(K)) 而不是 O(alpha(n)),其中 O(alpha(K)) << O(log K)。
Union-find starting with requests partitioned by server should work. You start with each request pointing to it's server, then you union together the servers (not the requests). Thus, the amortized cost of GetServer would be O(alpha(K)) rather than O(alpha(n)), where O(alpha(K)) << O(log K).
您可以尝试使用哈希表(其中哈希表的每个条目都是对列表的引用):
令哈希表的大小为大于 K 的最小素数 (
p
)那么,requestID 的哈希函数将是
requestID % p
。这将使GetServer
的复杂度变为 O(1)。然而,这样做的缺点是没有 requestID 的平衡。You could try using a hash table (where each entry of the hash table is a reference to a list):
Let the size of the hash table be the smallest prime number (
p
) which is greater than K.The hash function for a requestID, then, would be
requestID % p
. This would make the complexity forGetServer
O(1). However, the downside of this is that there is no balancing of requestIDs.