Bit reading performance

Posted on 2024-11-03 09:01:23

I'm writing a helper class which I intend to use for reading bits in reverse from a data block.

I tried doing an optimization where I used "rol" instructions for masking the data. However, to my surprise this is actually slower than creating a new bitmask during each access.

class reverse_bit_reader
{
public:
    static const size_t bits_per_block = sizeof(unsigned long)*8;
    static const size_t high_bit = size_t(1) << (bits_per_block-1);

    reverse_bit_reader(const void* data, size_t size) 
        : data_(reinterpret_cast<const unsigned long*>(data))
        , index_(size-1)
    {
        // Bits are stored in left to right order, potentially ignore the last bits
        size_t last_bit_index = index_ % bits_per_block;
        bit_mask_ = high_bit >> (last_bit_index+1);
        if(bit_mask_ == 0)
            bit_mask_ = high_bit;
    }

    bool next_bit1()
    {
        return get_bit(index_--);   
    }

    bool next_bit2() // Why is next_bit1 faster?
    {
        __asm // Rotate bit_mask.
        {
            mov eax, [ecx+0];  
            rol eax, 1;
            mov [ecx+0], eax;
        }
        return data_[index_-- / bits_per_block] & bit_mask_;    
    }

    bool eof() const{return index_ < 0;}
private:

    bool get_bit(size_t index) const
    {       
        const size_t block_index = index / bits_per_block;
        const size_t bit_index = index % bits_per_block;
        const size_t bit_mask = high_bit >> bit_index;
        return data_[block_index] & bit_mask;
    }

    unsigned long bit_mask_;
    int index_;
    const unsigned long* data_;
};

Can anyone explain why next_bit1 is faster than next_bit2?
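
For comparison, the mask rotation can also be written in plain C++ rather than in an `__asm` block, which gives a third variant to benchmark against next_bit1 and next_bit2. This is only a sketch and not an explanation confirmed in this thread. The names `rotl1` and `rotating_mask_reader` are hypothetical, the block type is fixed at 32 bits, and the mask is rotated after each read instead of before it, so the constructor starts directly on the last bit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rotate a 32-bit value left by one bit.
// Many compilers recognise this shift/or pattern as a rotate.
inline std::uint32_t rotl1(std::uint32_t x)
{
    return (x << 1) | (x >> 31);
}

// Hypothetical variant of the reader: bit 0 is the most significant bit of
// data[0], and bits are returned from index size-1 down to index 0.
class rotating_mask_reader
{
public:
    static const std::size_t bits_per_block = 32;

    rotating_mask_reader(const std::uint32_t* data, std::size_t size)
        : data_(data)
        , index_(static_cast<std::ptrdiff_t>(size) - 1)
        // Mask of the last bit; unused when size is 0.
        , bit_mask_(size ? 0x80000000u >> ((size - 1) % bits_per_block) : 0u)
    {
    }

    bool eof() const { return index_ < 0; }

    // Precondition: !eof().
    bool next_bit()
    {
        const std::size_t block = static_cast<std::size_t>(index_) / bits_per_block;
        const bool bit = (data_[block] & bit_mask_) != 0;
        bit_mask_ = rotl1(bit_mask_); // move the mask to the previous bit
        --index_;
        return bit;
    }

private:
    const std::uint32_t* data_;
    std::ptrdiff_t index_;
    std::uint32_t bit_mask_;
};
```

Because the rotation is ordinary C++, the compiler is free to inline it and keep the mask in a register. Whether that makes a measurable difference would have to be checked by timing all three variants on the target compiler.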

Comments (1)

燕归巢 2024-11-10 09:01:23

If you're going to be reading bits sequentially out of longs, starting from the most significant bit, and you want it to be as fast as possible, could you do something along these lines?

#define GETBIT ((theBit = (theLong < 0)), (theLong <<= 1), theBit)
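
The macro relies on `theLong` being a signed type, and left-shifting a negative signed value is not well defined in C++ standards before C++20. A minimal sketch of the same idea with an unsigned word follows. The class name `msb_first_reader` and the fixed 32-bit width are assumptions made for illustration and are not part of the answer above.

```cpp
#include <cstdint>

// Hypothetical helper: returns the bits of one 32-bit word,
// starting from the most significant bit.
class msb_first_reader
{
public:
    explicit msb_first_reader(std::uint32_t word)
        : word_(word)
        , remaining_(32)
    {
    }

    bool eof() const { return remaining_ == 0; }

    // Precondition: !eof().
    // Tests the top bit, then shifts the next bit into the top position.
    bool next_bit()
    {
        const bool bit = (word_ & 0x80000000u) != 0;
        word_ <<= 1;
        --remaining_;
        return bit;
    }

private:
    std::uint32_t word_;
    unsigned remaining_;
};
```

A caller would construct one reader per word and call `next_bit()` until `eof()` returns true. Reading in the reverse direction would instead test the lowest bit and shift right.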