Speeding up boost::iostreams::filtering_streambuf

Posted 2025-02-12 04:54:10


I am new to the C++ concept of streams and want to ask for some general advice to speed up my code in image processing. I use a stream buffer boost::iostreams::filtering_streambuf to load and decompress the image from a file, as suggested in this post and another post. The performance is not satisfactory.

The relevant code is the following:


template <int _NCH>
class MultiChImg {
    public: 
        ...
        ...

    private:

        std::atomic<bool> __in_operation;
        std::atomic<bool> __content_loaded;
        char  **_IMG[_NCH];
        int _W, _H;

        void dcmprss ( const std::string & file_name, bool is_decomp = true) {

            ...
            ...

            // decompress
            int counter = 0, iw = -1, ih = -1, _r = 0;
            auto _fill_ = [&](const char c){
                            _r = counter % _NCH ; // was 3 for RGB mindset
                            if ( _r == 0 ) {
                                iw++; // fast index
                                if ( iw%_W==0 ) { iw=0; ih++; } // slow index
                            }
                            _IMG[_r][_H-1-ih][iw] = c; 
                            counter ++ ;
            } ;
            auto EoS = std::istreambuf_iterator<char>() ;
            // char buf[4096]; // UPDATE: improved code according to @sehe
            if ( is_decomp ) {
                // decompress 
                bio::filtering_streambuf<bio::input> input;     
                input.push( bio::gzip_decompressor() ); //  
                input.push( fstrm );
                std::basic_istream<char> inflated( &input );

                auto T3 = timing(T2, "Timing : dcmprss() prepare decomp ") ;

                // assign values to _IMG (0=>R, 1=>G, 2=>B)
                // TODO // bottleneck
                std::for_each( 
                    std::istreambuf_iterator<char>(inflated), EoS, _fill_ );
                // UPDATE: improved code according to @sehe , replace the previous two lines
                // while (inflated.read(buf, sizeof(buf))) 
                //     std::for_each(buf, buf + inflated.gcount(), _fill_);
                auto T4 = timing(T3, "Timing : dcmprss() decomp+assign ") ;

            } else {
                // assign values to _IMG (0=>R, 1=>G, 2=>B)
                // TODO // bottleneck
                std::for_each( 
                    std::istreambuf_iterator<char>(fstrm), EoS, _fill_ ); // different !
                // UPDATE: improved code according to @sehe , replace the previous two lines
                // while (fstrm.read(buf, sizeof(buf))) 
                //     std::for_each(buf, buf + fstrm.gcount(), _fill_);
                auto T3 = timing(T2, "Timing : dcmprss() assign ") ;
            }
            assert(counter == _NCH*_H*_W);

            ...
            ...
        };
...
...
};

The bottleneck appears to be the for_each() part, where I iterate over the stream, either inflated via std::istreambuf_iterator<char>(inflated) or fstrm via std::istreambuf_iterator<char>(fstrm), and apply the lambda function _fill_ to each byte. This lambda transfers the bytes in the stream to their designated places in the multi-dimensional array class member _IMG.

UPDATE: the timing was incorrect due to a memory leak. I've corrected that.

The timing results of the above function dcmprss() are 450 ms for a 30 MB .gz file and 400 ms for the uncompressed file. I think that is too long, so I am asking the community for advice on how to improve it.

Thanks for your time on my post!


恰似旧人归 2025-02-19 04:54:10

You can use blockwise IO:

char buf[4096];
inflated.read(buf, sizeof(buf));
std::for_each(buf, buf + inflated.gcount(), _fill_);

However, I also think considerable time might be wasted in _fill_, where some dimensions are reshaped. That feels arbitrary.
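To make the block-wise idea concrete, here is a minimal sketch of a complete read loop. It is not the poster's code: `read_planar` and its parameters are hypothetical names, and the planar `std::vector` storage is an assumption standing in for `_IMG`. Two points it tries to show: `read()` sets failbit when the last block is shorter than the buffer, so a loop that tests only the stream state would skip those trailing bytes and `gcount()` has to be checked too; and the target position can be computed directly from the byte counter, with the same vertical flip that `_fill_` applies.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): reads `in` block by block and
// de-interleaves the bytes into `nch` planar buffers, flipping rows vertically
// like the question's _fill_ lambda. Returns the number of bytes stored.
std::size_t read_planar(std::istream& in, int nch, int w, int h,
                        std::vector<std::vector<char>>& planes) {
    planes.assign(nch, std::vector<char>(static_cast<std::size_t>(w) * h));
    const std::size_t total = static_cast<std::size_t>(nch) * w * h;
    std::size_t counter = 0;
    char buf[4096];
    // A short final block makes read() fail, so gcount() is tested as well.
    while (counter < total && (in.read(buf, sizeof(buf)) || in.gcount() > 0)) {
        const std::size_t n =
            std::min<std::size_t>(in.gcount(), total - counter);
        for (std::size_t i = 0; i < n; ++i, ++counter) {
            const std::size_t ch = counter % nch;      // channel of this byte
            const std::size_t pixel = counter / nch;   // pixel index in file order
            const std::size_t col = pixel % w;         // fast index
            const std::size_t row = h - 1 - pixel / w; // slow index, flipped
            planes[ch][row * w + col] = buf[i];
        }
    }
    return counter;
}
```

Any `std::istream` can be passed in, so the same function would serve both the decompressing stream and the plain file stream. The index arithmetic is written in closed form for readability; running counters as in the question work just as well.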

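Note that several libraries can transparently re-index multi-dimensional data, so you may save time by just copying the source data linearly and accessing that: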

Boost MultiArray (lets you specify storage order, direction, and offsets)
Boost GIL lets you use image data directly from interleaved/planar buffers: https://www.boost.org/doc/libs/1_79_0/libs/gil/doc/html/design/dynamic_image.html
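As a rough illustration of what such re-indexing amounts to, the sketch below keeps the bytes exactly as they arrive (interleaved, in file order) and translates indices on access. `InterleavedView` is a hand-rolled, hypothetical stand-in written for this example. It is not the Boost.MultiArray or Boost.GIL API, and it assumes the same channel/row/column convention and vertical flip as `_IMG[ch][row][col]` in the question.

```cpp
#include <cstddef>

// Hypothetical view type (an assumption for illustration, not a Boost API):
// wraps a buffer of nch * w * h bytes stored pixel by pixel and maps
// (channel, row, col) to the matching source byte without copying anything.
struct InterleavedView {
    const char* data;  // interleaved bytes, in the order they were read
    int nch, w, h;

    // Equivalent to reading _IMG[ch][row][col] after the question's _fill_,
    // including the vertical flip (row 0 is the last row in the file).
    char at(int ch, int row, int col) const {
        const std::size_t src_row = static_cast<std::size_t>(h - 1 - row);
        return data[(src_row * w + col) * nch + ch];
    }
};
```

With a view like this, loading reduces to one bulk copy into a contiguous buffer, and the reshaping cost is paid only for the elements that are actually accessed. Whether that is a net win depends on how the image is traversed afterwards.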
