Do I need to implement a B-tree search for this?
I have an array of integers, which could run into the hundreds of thousands (or more), sorted numerically ascending since that's how they were originally stacked.
I need to be able to query the array to get the index of its first occurrence of a number >= some input, as efficiently as possible. The only way I would know how to do this without even thinking about it would be to iterate through the array testing the condition until it returns true, at which point I'd stop iterating. However, this is the most expensive solution to this problem and I'm looking for the best algorithm to solve it.
I'm coding in Objective-C, but I'll give an example in JavaScript to broaden the audience of people who are able to respond.
// Sample set
var numbers = [1, 7, 23, 23, 23, 89, 1002, 1003];
var indexAfter100 = getIndexOfValueGreaterThan(100);
var indexAfter7 = getIndexOfValueGreaterThan(7);
// (indexAfter100 == 6) == true
// (indexAfter7 == 2) == true
Putting this data into a DB in order to perform this search will only be a last-resort solution since I'm keen to see some sort of algorithm to tackle this quickly in memory.
I do have the ability to change the data structure, or to store an additional data structure as I'm building the array, since my program has already pushed each number one by one onto this stack, so I'd just modify the code that's adding them to the stack. Searching for the index as they're being added to the stack isn't possible since the search operation will be repeated frequently with different values after the fact.
Right now I'm thinking "B-Tree" but to be honest, I would have no idea how to implement one and before I go off and start figuring that out, I wonder if there's a nice algorithm that fits this single use-case better?
Comments (5)
You should use binary search. Objective-C may even have a built-in method for that (many languages I know do). A B-tree probably won't help much unless you want to store the data on disk.
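To make the suggestion concrete, here is a minimal sketch of a binary search that returns the index of the first element >= the input, written in JavaScript to match the question's sample. The name `firstIndexAtLeast` is made up for illustration and is not a built-in; the sketch assumes the array is already sorted ascending.

```javascript
// Hypothetical helper (not a built-in): index of the first element >= target
// in an ascending array. Returns sorted.length when no such element exists.
function firstIndexAtLeast(sorted, target) {
  let lo = 0;              // every element before lo is < target
  let hi = sorted.length;  // every element from hi onward is >= target
  while (lo < hi) {
    const mid = lo + ((hi - lo) >> 1);
    if (sorted[mid] < target) {
      lo = mid + 1;        // mid is too small, discard the left part
    } else {
      hi = mid;            // mid qualifies, but an earlier one might too
    }
  }
  return lo;
}

const numbers = [1, 7, 23, 23, 23, 89, 1002, 1003];
console.log(firstIndexAtLeast(numbers, 100)); // 6
console.log(firstIndexAtLeast(numbers, 23));  // 2 (first of the three 23s)
console.log(firstIndexAtLeast(numbers, 7));   // 1
```

Note that the question's sample expects index 2 for an input of 7, which corresponds to "strictly greater than" rather than ">=". Changing the comparison from `<` to `<=` would give that behavior.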
I don't know about Objective-C, but C (plain ol' C) comes with a function called bsearch (besides, AFAIK, Obj-C can call C functions just fine): http://www.cplusplus.com/reference/clibrary/cstdlib/bsearch/
That basically does a binary search which sounds like it's what you need.
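One caveat worth illustrating: C's bsearch looks for an element equal to the key, returns NULL when there is none, and does not specify which element you get when several are equal. The sketch below mimics that exact-match behavior in JavaScript (the function name `exactBinarySearch` and the -1 return value are assumptions for illustration, not the C API), to show why a "first element >= input" query needs a slightly different loop.

```javascript
// Hypothetical exact-match binary search, loosely modeled on what bsearch does.
// Returns the index of some element equal to key, or -1 if there is none.
function exactBinarySearch(sorted, key) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = lo + ((hi - lo) >> 1);
    if (sorted[mid] === key) {
      return mid;          // any match is returned, not necessarily the first
    }
    if (sorted[mid] < key) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return -1;
}

const numbers = [1, 7, 23, 23, 23, 89, 1002, 1003];
console.log(exactBinarySearch(numbers, 100)); // -1, because 100 is not in the array
console.log(exactBinarySearch(numbers, 23));  // 3, a match but not the first 23
```

So an exact-match search alone does not answer the question for an input like 100; the loop has to keep narrowing toward the boundary instead of stopping at the first hit.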
A fast search algorithm should be able to handle an array of ints of that size without taking too long, I should think (and the array is sorted, so a binary search would probably be the way to go).
I think a btree is probably overkill...
Since they are sorted in ascending order and you only need the bigger ones, I would serialize that array, explode it by the INT and keep the part of the serialized string that holds the bigger INTs, then unserialize it and voilà.
Linear search, also referred to as sequential search, looks at each element in sequence from the start to see if the desired element is present in the data structure. When the amount of data is small, this search is fast. It's easy, but the work needed is in proportion to the amount of data to be searched. Doubling the number of elements will double the time to search if the desired element is not present.
Binary search is efficient for larger arrays. In this we check the middle element. If the value is bigger than what we are looking for, then look in the first half; otherwise, look in the second half. Repeat this until the desired item is found. The table must be sorted for binary search. It eliminates half the data at each iteration, so its running time is logarithmic in the number of elements.
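The difference in work can be seen by counting loop iterations. This is only a rough sketch: the function names, the one-million-element test array, and the chosen target are all made up for illustration, and both functions look for the first element >= the target as in the question.

```javascript
// Linear scan: checks elements one by one from the start.
function linearSteps(sorted, target) {
  let steps = 0;
  for (let i = 0; i < sorted.length; i++) {
    steps++;
    if (sorted[i] >= target) {
      return { index: i, steps };
    }
  }
  return { index: sorted.length, steps };
}

// Binary search: halves the remaining range on every iteration.
function binarySteps(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  let steps = 0;
  while (lo < hi) {
    steps++;
    const mid = lo + ((hi - lo) >> 1);
    if (sorted[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return { index: lo, steps };
}

// Illustrative data: one million even numbers 0, 2, 4, ...
const data = Array.from({ length: 1000000 }, (_, i) => i * 2);
const target = 1500001;

const linear = linearSteps(data, target);
const binary = binarySteps(data, target);
console.log(linear.index === binary.index); // true, both find the same position
console.log(linear.steps);                  // hundreds of thousands of checks
console.log(binary.steps);                  // at most 20 for a million elements
```

Both approaches return the same index; only the number of elements inspected differs.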