Binary search in an array with duplicates

Posted on 2025-02-07 19:58:10

First time posting here, so apologies in advance if I am not following best practices. My algorithm is supposed to do the following in a sorted array with possible duplicates.

Return -1 if the element does not exist in the array
Return the smallest index where the element is present.

I have written a binary search algorithm for an array without duplicates. It returns the position of the element or -1. Based on black box testing, I know that the non-duplicate version of the binary search works. I then call that function repeatedly from another function to search from 0 to position-1 to find the first occurrence of the element, if any.

I am currently failing a black box test. I am getting a wrong answer error and not a time out error. I have tried most of the corner cases that I could think of and also ran a brute force test with the naive search algorithm and could not find an issue.
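A brute-force comparison of this kind can be sketched as follows. This is a minimal sketch, not the exact harness used for the post: `naive_first_index`, `candidate_first_index`, and `stress_test` are hypothetical names, and the candidate shown is a stand-in built on `bisect.bisect_left` rather than the class in question. The idea is to generate many small sorted arrays with heavy duplication and report the first input where the candidate disagrees with a linear scan.

```python
import bisect
import random


def naive_first_index(keys, query):
    """Reference answer: linear scan for the first occurrence, or -1."""
    for i, value in enumerate(keys):
        if value == query:
            return i
    return -1


def candidate_first_index(keys, query):
    """Stand-in for the implementation under test (an assumption)."""
    i = bisect.bisect_left(keys, query)
    return i if i < len(keys) and keys[i] == query else -1


def stress_test(candidate, rounds=2000, max_len=8, max_value=5, seed=0):
    """Return the first (keys, query) pair where candidate disagrees, else None."""
    rng = random.Random(seed)
    for _ in range(rounds):
        size = rng.randint(1, max_len)
        # A small value range forces many duplicates in short arrays.
        keys = sorted(rng.randint(0, max_value) for _ in range(size))
        query = rng.randint(-1, max_value + 1)
        if candidate(keys, query) != naive_first_index(keys, query):
            return keys, query
    return None


if __name__ == "__main__":
    print(stress_test(candidate_first_index))
```

Swapping the stand-in for a wrapper around the real search would print a small counterexample if one exists within the generated sizes; small arrays with few distinct values tend to surface duplicate-handling mistakes quickly.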

I am looking for some guidance on what might be wrong in the implementation rather than an alternate solution.

The format is as follows:
Input:

5 #array size

3 4 7 7 8 #array elements need to be sorted

5 #search query array size

3 7 2 8 4 #query elements

Output

0 2 -1 4 1

My code is shown below:

import numpy as np


class BinarySearch:

    def __init__(self, input_list, query):
        self.array = input_list
        self.length = len(input_list)
        self.query = query

    def binary_search(self, low, high):
        '''
        Implementing the binary search algorithm with distinct numbers on a
        sorted input. Searches the half-open range [low, high).
        '''
        # trivial case
        if (self.query < self.array[low]) or (self.query > self.array[high-1]):
            return -1
        elif (low >= high-1) and self.array[low] != self.query:
            return -1
        else:
            m = low + int(np.floor((high-low)/2))
            if self.array[low] == self.query:
                return low
            elif self.array[m-1] >= self.query:
                return self.binary_search(low, m)
            elif self.array[high-1] == self.query:
                return high-1
            else:
                return self.binary_search(m, high)


class DuplicateBinarySearch(BinarySearch):

    def __init__(self, input_list, query):
        BinarySearch.__init__(self, input_list, query)

    def handle_duplicate(self, position):
        '''
        Function handles the duplicate number problem.
        Input: position where query is identified.
        Output: updated earlier position if it exists else return
        original position.
        '''
        if position == -1:
            return -1
        elif position == 0:
            return 0
        elif self.array[position-1] != self.query:
            return position
        else:
            new_position = self.binary_search(0, position)
            if new_position == -1 or new_position >= position:
                return position
            else:
                return self.handle_duplicate(new_position)

    def naive_duplicate(self, position):
        old_position = position
        if position == -1:
            return -1
        else:
            # walk left one element at a time while the value still matches
            while position >= 0 and self.array[position] == self.query:
                position -= 1
            if position == -1:
                return old_position
            else:
                return position+1


if __name__ == '__main__':
    num_keys = int(input())
    input_keys = list(map(int, input().split()))
    assert len(input_keys) == num_keys

    num_queries = int(input())
    input_queries = list(map(int, input().split()))
    assert len(input_queries) == num_queries

    for q in input_queries:
        item = DuplicateBinarySearch(input_keys, q)
        #res=item.handle_duplicate(item.binary_search(0,item.length))
        #res=item.naive_duplicate(item.binary_search(0,item.length))
        #assert res_check==res
        print(item.handle_duplicate(item.binary_search(0, item.length)), end=' ')
        #print(item.naive_duplicate(item.binary_search(0,item.length)), end=' ')

When I run a naive duplicate algorithm, I get a time out error:

Failed case #56/57: time limit exceeded (Time used: 10.00/5.00, memory used: 42201088/536870912.)

When I run the binary search with duplicate algorithm, I get a wrong answer error on a different test case:

Failed case #24/57: Wrong answer

(Time used: 0.11/5.00, memory used: 42106880/536870912.)
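One plausible reason for the timeout with the naive version (an assumption, since the failing test input is not visible) is that the leftward walk costs one step per duplicate, so a long run of equal values makes a single query linear in the array size. A small sketch that counts those steps, using a hypothetical helper `backward_walk_steps` that mirrors the loop in `naive_duplicate`:

```python
def backward_walk_steps(keys, position, query):
    """Count iterations of a leftward linear scan that starts at position."""
    steps = 0
    while position >= 0 and keys[position] == query:
        position -= 1
        steps += 1
    return steps


if __name__ == "__main__":
    # One smaller value followed by a long run of duplicates.
    keys = [1] + [7] * 100_000
    print(backward_walk_steps(keys, len(keys) - 1, 7))  # 100000
```

With many queries against such an input, the total work grows as the number of queries times the length of the run, which is consistent with a time limit failure rather than a wrong answer.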

The problem statement is as follows:

Problem Statement

Update:

I could make the code work by making the following change but I have not been able to create a test case to see why the code would fail in the first case.

The original binary search function works with no duplicates but fails an unknown edge case when the handle_duplicate function calls it repeatedly. I changed the binary search function to the following:

    def binary_search(self, low, high):
        '''
        Implementing the binary search algorithm with distinct numbers on a sorted input.
        '''
        # trivial case
        if (low >= high-1) and self.array[low] != self.query:
            return -1
        elif (self.query < self.array[low]) or (self.query > self.array[high-1]):
            return -1
        else:
            m = low + (high-low)//2
            if self.array[low] == self.query:
                return low
            elif self.array[m-1] >= self.query:
                return self.binary_search(low, m)
            elif self.array[m] <= self.query:
                return self.binary_search(m, high)
            elif self.array[high-1] == self.query:
                return high-1
            else:
                return -1


彩扇题诗 2025-02-14 19:58:10

Since you are implementing binary search recursively, I would suggest adding a variable 'result' that acts as the return value and holds the most recent index whose element equals the target.

Here is an example:

def binarySearchRecursive(nums, left, right, target, result):

    """
    This is your exit point. 
    If the target is not found, result will be -1 since it won't change from initial value.
    If the target is found, result will be the index of the first occurrence of the target.
    """
    if left > right:
        return result 

    # Overflow prevention
    mid = left + (right - left) // 2

    if nums[mid] == target:
        # We are not sure if this is the first occurrence of the target.
        # So we will store the index to the result now, and keep checking.
        result = mid 
        # Since we are looking for "first occurrence", we discard right half.
        return binarySearchRecursive(nums, left, mid - 1, target, result) 
    elif target < nums[mid]:
        return binarySearchRecursive(nums, left, mid - 1, target, result)
    else:
        return binarySearchRecursive(nums, mid + 1, right, target, result)

if __name__ == '__main__':

    nums = [2,4,4,4,7,7,9]
    target = 4 

    (left, right) = (0, len(nums)-1)
    result = -1 # Initial value
    index = binarySearchRecursive(nums, left, right, target, result)

    if index != -1:
        print(index)
    else:
        print('Not found')

From your updated version, I still feel the exit point of your function (your "trivial case" section) is a little unintuitive.

The only condition under which your search should stop is that you have searched every possible section of the list. That is when the search range is empty and no element is left to check. In the implementation, that is when left > right (or high < low) is true.

The 'result' variable is initialized to -1 when the function is first called from main, and it does not change if no match is found. After each successful match, since we cannot be sure it is the first occurrence, we just store the index in result. If there are more matches to the left, the value is updated; if not, the value is eventually returned. If the target is not in the list, the return value is -1, its initial value.
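The same bookkeeping can also be written as a loop. The following is a minimal sketch, not part of the original answer: `first_occurrence` and `first_occurrence_bisect` are hypothetical names, and the second function uses the standard library's `bisect.bisect_left` only as a cross-check.

```python
import bisect


def first_occurrence(nums, target):
    """Iterative form of the same idea: remember a match, keep searching left."""
    left, right = 0, len(nums) - 1
    result = -1  # stays -1 if target never matches
    while left <= right:  # stop once the search range is empty
        mid = left + (right - left) // 2
        if nums[mid] == target:
            result = mid      # candidate answer; an earlier one may exist
            right = mid - 1   # discard the right half
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return result


def first_occurrence_bisect(nums, target):
    """Cross-check using bisect_left, which returns the leftmost insertion point."""
    i = bisect.bisect_left(nums, target)
    return i if i < len(nums) and nums[i] == target else -1


if __name__ == "__main__":
    nums = [2, 4, 4, 4, 7, 7, 9]
    for target in (4, 7, 5):
        print(target, first_occurrence(nums, target),
              first_occurrence_bisect(nums, target))
```

For the sample list this prints index 1 for target 4, index 4 for target 7, and -1 for target 5 from both functions. The loop keeps the single exit condition described above and avoids threading 'result' through every recursive call.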
