在数组中查找不重复三倍的元素?

发布于 2024-12-03 23:13:42 字数 635 浏览 0 评论 0原文

阅读后这个有趣的问题让我想起了我曾经遇到过的一个棘手的面试问题,但我从未给出令人满意的答案:

给定一个由 n 个 32 位无符号整数组成的数组,其中每个元素(除了一个)都重复三倍。在 O(n) 时间内,使用尽可能少的辅助空间,找到数组中未出现 3 次倍数的元素。

举个例子,给定这个数组:

1 1 2 2 2 3 3 3 3 3 3

给定数组,我们将输出 1

3 2 1 3 2 1 2 3 1 4 4 4 4

我们将输出 4。

通过使用哈希表计算每个元素的频率,可以在 O(n) 时间和 O(n) 空间中轻松解决这个问题,尽管我强烈怀疑,因为问题陈述特别提到数组包含 32 位无符号整数,所以有更好的解决方案(我猜测 O(1) 空间)。

有谁对如何解决这个问题有任何想法?

After reading this interesting question I was reminded of a tricky interview question I had once that I never satisfactorily answered:

You are given an array of n 32-bit unsigned integers where each element (except one) is repeated a multiple of three times. In O(n) time and using as little auxiliary space as possible, find the element of the array that does not appear a multiple of three times.

As an example, given this array:

1 1 2 2 2 3 3 3 3 3 3

We would output 1, while given the array

3 2 1 3 2 1 2 3 1 4 4 4 4

We would output 4.

This can easily be solved in O(n) time and O(n) space by using a hash table to count the frequencies of each element, though I strongly suspect that because the problem statement specifically mentioned that the array contains 32-bit unsigned integers that there is a much better solution (I'm guessing O(1) space).

Does anyone have any ideas on how to solve this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

长发绾君心 2024-12-10 23:13:42

它可以在 O(n) 时间和 O(1) 空间内完成。

以下是如何在 C# 中使用常量空间来实现这一点。我使用“异或除了三态位”的想法。对于每个设置位,“异或”运算会递增相应的三态值。

最终输出将是其二进制表示形式在最终值中 1 或 2 的位置有 1 的数字。

void Main() {
    Console.WriteLine (FindNonTriple(new uint[] 
                        {1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3} ));
    // 1

    Console.WriteLine (FindNonTriple(new uint[] 
                        {3, 2, 1, 3, 2, 1, 3, 2, 1, 4, 4, 4, 4} ));
    // 4
}

uint FindNonTriple(uint[] args) {
    byte[] occurred = new byte[32];

    foreach (uint val in args) {
        for (int i = 0; i < 32; i++) {
            occurred[i] = (byte)((occurred[i] + (val >> i & 1)) % 3);
        }
    }

    uint result = 0;
    for (int i = 0; i < 32; i++) {
        if (occurred[i] != 0) result |= 1u << i;
    }
    return result;
}

It can be done in O(n) time and O(1) space.

Here is how you can do it with constant space in C#. I'm using the idea of "xor except with 3-state bits". For every set bit, the "xor" operation increments the corresponding 3-state value.

The final output will be the number whose binary representation has 1s in places that are either 1 or 2 in the final value.

void Main() {
    Console.WriteLine (FindNonTriple(new uint[] 
                        {1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3} ));
    // 1

    Console.WriteLine (FindNonTriple(new uint[] 
                        {3, 2, 1, 3, 2, 1, 3, 2, 1, 4, 4, 4, 4} ));
    // 4
}

uint FindNonTriple(uint[] args) {
    byte[] occurred = new byte[32];

    foreach (uint val in args) {
        for (int i = 0; i < 32; i++) {
            occurred[i] = (byte)((occurred[i] + (val >> i & 1)) % 3);
        }
    }

    uint result = 0;
    for (int i = 0; i < 32; i++) {
        if (occurred[i] != 0) result |= 1u << i;
    }
    return result;
}
莫言歌 2024-12-10 23:13:42

在恒定空间中执行此操作的明显解决方案是使用就地算法对其进行排序,然后在数组上扫描一次。

遗憾的是,这通常需要 O(n log n) 时间和 O(1) 空间。

但由于条目的密钥长度有限(32 位),您可以使用基数排序作为排序算法(存在就地基数排序,它们不稳定,但这并不重要)。在那里你有 O(n) 时间和 O(1) 空间。

编辑:顺便说一句,您可以使用这种方法来查找所有出现次数不是 3 倍数的数字,以防万一您允许多个数字可以具有此属性。

The obvious solution to do it in constant space is to sort it using an in place algorithm, and then scan once over the array.

Sadly this requires usually O(n log n) time and O(1) space.

But as the entries have a limited key length (32 bit) you can use as sort algorithm radix sort (there exist in place radix sort, they are not stable, but that doesnt matter here). There you have O(n) time and O(1) space.

EDIT: Btw you could use this approach to find also ALL numbers that appear not a multiple of 3 times, in case you would allow that more than one number could have this property.

无所的.畏惧 2024-12-10 23:13:42

您正在寻找重复次数非零 (mod 3) 的项目。我想我会递归地做到这一点:

  1. 将数组分成两半,
  2. 找到每一半中代表计数非零(mod 3)的项目,
  3. 合并两半,保留不相等键的计数,并添加相等键的计数,
  4. 删除任何count = 0 (mod 3)
  5. 将其返回为输入当前部分的值集。

甚至不需要尝试优化基本算法之外的东西(例如,不用担心每次计数只存储两位),这似乎做得很好。我已经添加了代码来生成相当大的测试用例(例如,1500 多个项目)并打印出它正在创建的地图的大小。在任何给定时间,它创建的地图中似乎最多有大约 50 个项目(即,它一次只使用两个地图,我见过的最大的项目约为 25 个)。从技术上讲,就目前情况而言,我相信目前的时间复杂度类似于 O(N log N),但如果您切换到基于哈希的容器,我相信您会期望 O(N)。

#include <map>
#include <iterator>
#include <iostream>
#include <algorithm>
#include <utility>
#include <vector>
#include <time.h>

class zero_mod { 
    unsigned base;
public:
    zero_mod(unsigned b) : base(b) {}

    bool operator()(std::pair<int, int> const &v) { 
        return v.second % base == 0; 
    }
};

// merge two maps together -- all keys from both maps, and the sums 
// of equal values.
// Then remove any items with a value congruent to 0 (mod 3)
//
std::map<int, int> 
merge(std::map<int, int> const &left, std::map<int, int> const &right) { 
    std::map<int, int>::const_iterator p, pos;
    std::map<int, int> temp, ret;

    std::copy(left.begin(), left.end(), std::inserter(temp, temp.end()));
    for (p=right.begin(); p!=right.end(); ++p) 
        temp[p->first] += p->second;
    std::remove_copy_if(temp.begin(), temp.end(), 
                        std::inserter(ret, ret.end()), 
                        zero_mod(3));
    return ret;
}   

// Recursively find items with counts != 0 (mod 3):    
std::map<int, int> 
do_count(std::vector<int> const &input, size_t left, size_t right) { 
    std::map<int, int> left_counts, right_counts, temp, ret;

    if (right - left <= 2) {
        for (size_t i=left; i!=right; ++i) 
            ++ret[input[i]];
        return ret;
    }
    size_t middle = left + (right-left)/2;
    left_counts = do_count(input, left, middle);
    right_counts = do_count(input, middle, right);
    ret = merge(left_counts, right_counts);

    // show the size of the map to get an idea of how much storage we're using.
    std::cerr << "Size: " << ret.size() << "\t";
    return ret;
}

std::map<int, int> count(std::vector<int> const &input) { 
    return do_count(input, 0, input.size());
}

namespace std { 
    ostream &operator<<(ostream &os, pair<int, int> const &p) { 
        return os << p.first;
    }
}

int main() {
    srand(time(NULL));
    std::vector<int> test;

    // generate a bunch of data by inserting packets of 3 items    
    for (int i=0; i<100; i++) {
        int packets = std::rand() % 10;
        int value = rand() % 50;
        for (int i=0; i<packets * 3; i++) 
            test.push_back(value);
    }

    // then remove one item, so that value will not occur a multiple of 3 times
    size_t pos = rand() % test.size();
    std::cerr << "Removing: " << test[pos] << " at position: " << pos << "\n";

    test.erase(test.begin()+pos);

    std::cerr << "Overall size: " << test.size() << "\n";

    std::random_shuffle(test.begin(), test.end());

    // Note that we use a map of results, so this will work if multiple values
    // occur the wrong number of times.    
    std::map<int, int> results = count(test);

    // show the results. Should match the value shown as removed above:
    std::copy(results.begin(), results.end(), 
        std::ostream_iterator<std::pair<int, int> >(std::cout, "\n"));
    return 0;
}

You're looking for an item with a rep-count that's non-zero (mod 3). I think I'd do it recursively:

  1. split the array in half
  2. find items with rep count that's non-zero (mod 3) in each half
  3. merge the halves, keeping counts for unequal keys, and adding the counts for equal keys
  4. strike out any with count = 0 (mod 3)
  5. return that as the set of values for the current part of the input.

Without even trying to optimize things beyond the basic algorithm (e.g., not worrying about storing only two bits per count), this seems to do pretty well. I've included code to generate a reasonably large test case (e.g., 1500+ items) and print out the sizes of the maps it's creating. At any given time, it seems to have a maximum of around 50 items in the maps it creates (i.e., it only uses two maps at a time, and the largest I've seen is around 25 items). Technically, as it stands I believe this is currently something like O(N log N), but if you switched to a hash-based container, I believe you'd expect O(N).

#include <map>
#include <iterator>
#include <iostream>
#include <algorithm>
#include <utility>
#include <vector>
#include <time.h>

class zero_mod { 
    unsigned base;
public:
    zero_mod(unsigned b) : base(b) {}

    bool operator()(std::pair<int, int> const &v) { 
        return v.second % base == 0; 
    }
};

// merge two maps together -- all keys from both maps, and the sums 
// of equal values.
// Then remove any items with a value congruent to 0 (mod 3)
//
std::map<int, int> 
merge(std::map<int, int> const &left, std::map<int, int> const &right) { 
    std::map<int, int>::const_iterator p, pos;
    std::map<int, int> temp, ret;

    std::copy(left.begin(), left.end(), std::inserter(temp, temp.end()));
    for (p=right.begin(); p!=right.end(); ++p) 
        temp[p->first] += p->second;
    std::remove_copy_if(temp.begin(), temp.end(), 
                        std::inserter(ret, ret.end()), 
                        zero_mod(3));
    return ret;
}   

// Recursively find items with counts != 0 (mod 3):    
std::map<int, int> 
do_count(std::vector<int> const &input, size_t left, size_t right) { 
    std::map<int, int> left_counts, right_counts, temp, ret;

    if (right - left <= 2) {
        for (size_t i=left; i!=right; ++i) 
            ++ret[input[i]];
        return ret;
    }
    size_t middle = left + (right-left)/2;
    left_counts = do_count(input, left, middle);
    right_counts = do_count(input, middle, right);
    ret = merge(left_counts, right_counts);

    // show the size of the map to get an idea of how much storage we're using.
    std::cerr << "Size: " << ret.size() << "\t";
    return ret;
}

std::map<int, int> count(std::vector<int> const &input) { 
    return do_count(input, 0, input.size());
}

namespace std { 
    ostream &operator<<(ostream &os, pair<int, int> const &p) { 
        return os << p.first;
    }
}

int main() {
    srand(time(NULL));
    std::vector<int> test;

    // generate a bunch of data by inserting packets of 3 items    
    for (int i=0; i<100; i++) {
        int packets = std::rand() % 10;
        int value = rand() % 50;
        for (int i=0; i<packets * 3; i++) 
            test.push_back(value);
    }

    // then remove one item, so that value will not occur a multiple of 3 times
    size_t pos = rand() % test.size();
    std::cerr << "Removing: " << test[pos] << " at position: " << pos << "\n";

    test.erase(test.begin()+pos);

    std::cerr << "Overall size: " << test.size() << "\n";

    std::random_shuffle(test.begin(), test.end());

    // Note that we use a map of results, so this will work if multiple values
    // occur the wrong number of times.    
    std::map<int, int> results = count(test);

    // show the results. Should match the value shown as removed above:
    std::copy(results.begin(), results.end(), 
        std::ostream_iterator<std::pair<int, int> >(std::cout, "\n"));
    return 0;
}
内心激荡 2024-12-10 23:13:42

计算每个位的总数(0-31,假设 int 是 32 位)并将每个位的计数存储在数组中。完成后,迭代生成的位计数数组,并检查哪个位位置出现的次数不是模 3 次,并根据这些位构造最终的数字。完整的代码如下(也适用于负数):

void get_num_bits(int n, vector<int>& vbits)
{
    int k = 0;
    vbits.resize(32);
    std::fill(vbits.begin(), vbits.end(), 0);

    uint32_t val = (uint32_t) n;
    printf("%u\n", val);

    if (n < 0)
    {
        while(val)
        {
            if ((1 & val) == 1)
                vbits[k] = 1;
            ++k;
            val = val >> 1;
        }
        return;
    }

    while(n)
    {
        if ((1 & n) == 1)
            vbits[k] = 1;
        ++k;
        n = n >> 1;
    }
}

void sum_bit_counts(vector<int>& all_bits, vector<int>& num_bits)
{
    for (int i = 0; i < num_bits.size(); ++i)
    {
        all_bits[i] += num_bits[i];
    }
}

int single_number(vector<int>& nums) 
{
    vector<int> all_bits(32);

    for (int i = 0; i < nums.size(); ++i)
    {
        vector<int> bits(32);   
        get_num_bits(nums[i], bits);
        sum_bit_counts(all_bits, bits);
    }

    int rv = 0;
    for (int i = 0; i < all_bits.size(); ++i)
    {
        if (all_bits[i] && (all_bits[i] % 3 != 0))
        {
            rv = rv | (1 << i);
        }
    }
    
    return rv;
}

Count the total number of each bit (0–31, assuming int is 32-bits) and store the count of each bit in an array. When finished, iterate over the resulting array of bits count and check which bit position occurs not a modulo 3 times and construct your final number based upon those bits. Complete code is below (works properly for the negative numbers too):

void get_num_bits(int n, vector<int>& vbits)
{
    int k = 0;
    vbits.resize(32);
    std::fill(vbits.begin(), vbits.end(), 0);

    uint32_t val = (uint32_t) n;
    printf("%u\n", val);

    if (n < 0)
    {
        while(val)
        {
            if ((1 & val) == 1)
                vbits[k] = 1;
            ++k;
            val = val >> 1;
        }
        return;
    }

    while(n)
    {
        if ((1 & n) == 1)
            vbits[k] = 1;
        ++k;
        n = n >> 1;
    }
}

void sum_bit_counts(vector<int>& all_bits, vector<int>& num_bits)
{
    for (int i = 0; i < num_bits.size(); ++i)
    {
        all_bits[i] += num_bits[i];
    }
}

int single_number(vector<int>& nums) 
{
    vector<int> all_bits(32);

    for (int i = 0; i < nums.size(); ++i)
    {
        vector<int> bits(32);   
        get_num_bits(nums[i], bits);
        sum_bit_counts(all_bits, bits);
    }

    int rv = 0;
    for (int i = 0; i < all_bits.size(); ++i)
    {
        if (all_bits[i] && (all_bits[i] % 3 != 0))
        {
            rv = rv | (1 << i);
        }
    }
    
    return rv;
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文