Exhaustive search vs. sorting followed by binary search

Posted on 2024-09-28 10:20:31

This is a direct quote from the textbook Invitation to Computer Science by G. Michael Schneider and Judith L. Gersting.

At the end of Section 3.4.2, we talked about the tradeoff between using sequential search on an unsorted list as opposed to sorting the list and then using binary search. If the list size is n = 100,000, about how many worst-case searches must be done before the second alternative is better in terms of number of comparisons?

I don't really get what the question is asking for.

Sequential search is of order n and binary search is of order lg n, and lg n will always be less than n. In this case n is already given, so what am I supposed to find?

This is one of my homework assignments, but I don't really know what to do. Could anyone explain the question in plain English for me?

Comments (6)

红焚 2024-10-05 10:20:31

"and binary search is of order lg n, and lg n will always be less than n"
This is where you're wrong. In the assignment, you're asked to consider the cost of sorting the array too.

Obviously, if you need only one search, the first approach is better than sorting the array and doing a binary search: n < n*logn + logn. And you're asked how many searches you need for the second approach to become more effective.
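
The single-search comparison can be checked numerically. This is only a sketch under assumed costs: the sort is taken to need exactly n*lg(n) comparisons and each binary search lg(n), with base-2 logarithms and all constant factors set to 1. The function names are made up for illustration.

```python
import math

def sequential_cost(n, searches):
    """Worst-case comparisons for `searches` sequential searches on n items."""
    return searches * n

def sort_then_binary_cost(n, searches):
    """Assumed model: n*lg(n) comparisons to sort once, lg(n) per binary search."""
    lg_n = math.log2(n)
    return n * lg_n + searches * lg_n

n = 100_000
one_search_sequential = sequential_cost(n, 1)
one_search_sorted = sort_then_binary_cost(n, 1)
print(one_search_sequential, round(one_search_sorted))
# True: under this model, sorting does not pay off for a single search
print(one_search_sequential < one_search_sorted)
```

Raising the `searches` argument step by step shows where the two costs cross under these assumptions.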

End of hint.

太阳哥哥 2024-10-05 10:20:31

The question is how to decide which approach to choose - to just use linear search, or to sort and then use binary search.

If you only search a couple of times, linear search is better - it is O(n), while sorting alone is already O(n*logn). If you search very often on the same collection, sorting is better - searching multiple times can become O(n*n), whereas sorting and then searching with binary search is O(n*logn) + NumberOfSearches*O(logn), which can be less or more than using linear search depending on how NumberOfSearches and n relate.

The task is to determine the value of NumberOfSearches (not an exact number, but a function of n) at which one of the options becomes preferable:

 NumberOfSearches * O(n) <> O(n*logn) + NumberOfSearches * O(logn)

Don't forget that each O() can hide a different constant factor.
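
As a rough sketch of how the constants enter, the comparison can be solved for NumberOfSearches once each O() is replaced by a concrete cost. The constants `c_seq`, `c_sort` and `c_bin` below are hypothetical placeholders (all 1 by default), not values from the textbook, and logarithms are taken base 2.

```python
import math

def break_even_searches(n, c_seq=1.0, c_sort=1.0, c_bin=1.0):
    """Solve p*c_seq*n = c_sort*n*lg(n) + p*c_bin*lg(n) for p.

    Above the returned value, sort + binary search needs fewer
    comparisons than repeated sequential search (under this model).
    """
    lg_n = math.log2(n)
    saved_per_search = c_seq * n - c_bin * lg_n
    if saved_per_search <= 0:
        raise ValueError("binary search is never cheaper per search here")
    return c_sort * n * lg_n / saved_per_search

print(break_even_searches(100_000))              # a little under 17
print(break_even_searches(100_000, c_sort=2.0))  # doubles with the sort constant
```

With an O(n*logn) sort and unit constants the break-even point is small; a slower sort or larger constants push it up accordingly.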

偷得浮生 2024-10-05 10:20:31

The order of the methods is not important here. It only tells you how well the algorithms scale as the problem gets bigger and bigger. You can't do any exact calculations if all you know is O(n), i.e. that the complexity grows linearly with the size of the problem. It won't give you any numbers.

This can well mean that an algorithm with O(n) complexity is faster than an O(logn) algorithm for some n. Because O(logn) scales better as n gets larger, we know for sure that there is an n (a problem size) beyond which the algorithm with O(logn) complexity is faster. We just don't know when (for what n).

In plain English:

If you want to know "how many searches", you need exact equations to solve, so you need exact numbers. How many comparisons does a sequential search take? (Remember n is given, so you can give a number.) How many comparisons (in the worst case!) does a binary search take? Before you can do a binary search, you have to sort. So add the number of comparisons needed to sort to the cost of the binary searches. Now compare the two numbers: which one is less?

The binary search is fast, but the sorting is slow. The sequential search is slower than binary search, but faster than sorting. However, the sorting needs to be done only once, no matter how many times you search. So, when does one heavy sort outweigh having to do a slow (sequential) search every time?
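
One way to get the exact per-search numbers is simply to count them. The sketch below is an illustration, not the textbook's code: it counts one comparison per element examined in sequential search and one three-way comparison per probe in binary search, and it uses a target that is not in the list as the worst case.

```python
def sequential_comparisons(items, target):
    """Count elements examined by a sequential search for target."""
    count = 0
    for x in items:
        count += 1
        if x == target:
            break
    return count

def binary_comparisons(sorted_items, target):
    """Count probes (one three-way comparison each) made by binary search."""
    lo, hi = 0, len(sorted_items) - 1
    count = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return count

data = list(range(100_000))   # already sorted, n = 100,000
missing = 100_000             # larger than every element: a worst case

print(sequential_comparisons(data, missing))  # 100000
print(binary_comparisons(data, missing))      # 17
```

The sorting cost still has to be added once on the binary-search side; how large it is depends on which sorting algorithm the course uses.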

Good luck!

樱娆 2024-10-05 10:20:31

For sequential search, the worst case is n = 100000 comparisons, so p searches require p × 100000 comparisons.

Using a Θ(n^2) sorting algorithm would require about 100000 × 100000 comparisons.

Binary search would require ⌊lg n⌋ + 1 = ⌊lg 100000⌋ + 1 = 17 comparisons for each search,

so together there would be 100000 × 100000 + 17p comparisons.

The second alternative is better when the first expression is larger than the second, meaning
100000p > 100000^2 + 17p

that is, for p > 100017.
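
The threshold can be double-checked with integer arithmetic. This is a small sketch that takes the figures above as given (n^2 comparisons for the sort, 17 per binary search); the function name is made up for illustration.

```python
def first_winning_search_count(n, sort_comparisons, per_binary_search):
    """Smallest integer p with n*p > sort_comparisons + per_binary_search*p."""
    saved_per_search = n - per_binary_search
    # Strict inequality: one more than the floor of the quotient.
    return sort_comparisons // saved_per_search + 1

n = 100_000
p = first_winning_search_count(n, n * n, 17)
print(p)  # 100018
```

So with these numbers the first whole number of searches at which sorting wins is 100018, which agrees with p > 100017.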

写给空气的情书 2024-10-05 10:20:31

The question is about estimating the number NUM_SEARCHES needed to compensate for the cost of sorting. So we'll have:

 time( NUM_SEARCHES * O(n) ) > time( NUM_SEARCHES * O(log(n)) + O(n* log(n)) )

离旧人 2024-10-05 10:20:31

Thank you guys. I think I get the point now. Could you take a look at my answer and see whether I'm on the right track?

For worst-case searches:
Number of comparisons for sequential search is n = 100,000.
Number of comparisons for binary search is lg(n) ≈ 17.
Number of comparisons for sorting is (n-1)/2 * n = (99999)(50000).
(I'm following my textbook and used the selection sort algorithm covered in my class.)

So let p be the number of worst-case searches; then 100,000p > (99999)(50000) + 17p
OR p > 50008

In conclusion, I need about 50,008 worst-case searches to make sorting and using binary search better than sequential search for a list of n = 100,000.
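
As a sanity check on these figures, the sketch below counts the comparisons of a plain selection sort on a small list and then evaluates the inequality for n = 100,000. The selection sort here is an assumed textbook-style variant (it selects the smallest element each pass); the course's version may differ in detail, but any variant that scans the whole unsorted part makes n(n-1)/2 comparisons.

```python
def selection_sort_comparisons(items):
    """Sort a copy with selection sort; return the number of element comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return comparisons

small_n = 200
counted = selection_sort_comparisons(range(small_n, 0, -1))
print(counted == small_n * (small_n - 1) // 2)  # True

n = 100_000
sort_cost = n * (n - 1) // 2
p = sort_cost // (n - 17) + 1  # smallest integer p with n*p > sort_cost + 17*p
print(p)  # 50009
```

At exactly 50,008 searches sequential search still needs 136 fewer comparisons, so the second alternative strictly wins from the 50,009th search on; "about 50,008" matches the wording of the question.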
