Given an array of integers, find the first integer that is unique.

My solution: use std::map.
1. Put each integer into it one by one, with the number as the key and its index as the value (O(n^2 log n)).
2. If there is a duplicate, remove the entry from the map (O(log n)).
3. After putting all numbers into the map, iterate over the map and find the key with the smallest index (O(n)).

The total is O(n^2 log n), because the map needs to do sorting. It is not efficient. Are there better solutions?
I believe that the following would be the optimal solution, at least based on time / space complexity:
Step 1:
Store the integers in a hash map, which holds the integer as a key and the count of the number of times it appears as the value. This is generally an O(n) operation and the insertion / updating of elements in the hash table should be constant time, on the average. If an integer is found to appear more than twice, you really don't have to increment the usage count further (if you don't want to).
Step 2:
Perform a second pass over the integers. Look each up in the hash map and the first one with an appearance count of one is the one you were looking for (i.e., the first single appearing integer). This is also O(n), making the entire process O(n).
Some possible optimizations for special cases:
Optimization A: It may be possible to use a simple array instead of a hash table. This guarantees O(1) even in the worst case for counting the number of occurrences of a particular integer as well as the lookup of its appearance count. Also, this enhances real time performance, since the hash algorithm does not need to be executed. There may be a hit due to potentially poorer locality of reference (i.e., a larger sparse table vs. the hash table implementation with a reasonable load factor). However, this would be for very special cases of integer orderings and may be mitigated by the hash table's hash function producing pseudorandom bucket placements based on the incoming integers (i.e., poor locality of reference to begin with).
Each byte in the array would represent the count (up to 255) for the integer represented by the index of that byte. This would only be possible if the difference between the lowest integer and the highest (i.e., the cardinality of the domain of valid integers) was small enough such that this array would fit into memory. The index in the array of a particular integer would be its value minus the smallest integer present in the data set.
For example on modern hardware with a 64-bit OS, it is quite conceivable that a 4GB array can be allocated which can handle the entire domain of 32-bit integers. Even larger arrays are conceivable with sufficient memory.
The smallest and largest integers would have to be known before processing, or another linear pass through the data using the minmax algorithm to find out this information would be required.
Optimization B: You could optimize Optimization A further, by using at most 2 bits per integer (One bit indicates presence and the other indicates multiplicity). This would allow for the representation of four integers per byte, extending the array implementation to handle a larger domain of integers for a given amount of available memory. More bit games could be played here to compress the representation further, but they would only support special cases of data coming in and therefore cannot be recommended for the still mostly general case.
All this for no reason. Just using two for-loops and a variable would give you a simple O(n^2) algorithm. If you are taking all the trouble of using a hash map, then it might as well be what @Micheal Goldshteyn suggests.
UPDATE: I know this question is 1 year old, but I was looking through the questions I answered and came across this one. I think there is a better solution than using a hashtable.

When we say unique, we will have a pattern. E.g.: [5, 5, 66, 66, 7, 1, 1, 77]. In this, let's have a moving window of 3. First consider (5, 5, 66). We can easily establish that there is a duplicate here. So move the window by 1 element, giving (5, 66, 66). Same here. Move to the next, (66, 66, 7). Again duplicates here. Next, (66, 7, 1). No duplicates here! Take the middle element, as it has to be the first unique element in the set: the element on its left belongs to a duplicate pair, and so does the 1 on its right. Hence 7 is the first unique element.

space: O(1)
time: O(n) * O(m^2) = O(n) * 9 ≈ O(n)
Inserting into a map is O(log n), not O(n log n), so inserting n keys will be O(n log n). Also, it's better to use a set.
Although it's O(n^2), the following has small coefficients, isn't too bad on the cache, and uses memmem(), which is fast.

@user3612419