Speeding up boost::iostreams::filtering_streambuf

Posted 2025-02-12 04:54:10

I am new to the C++ concept of streams and would like some general advice on speeding up my image-processing code. I use a stream buffer, boost::iostreams::filtering_streambuf, to load and decompress an image from a file, as suggested in this post and another post. The performance is not satisfactory.

The relevant code is the following:


template <int _NCH>
class MultiChImg {
    public: 
        ...
        ...

    private:

        std::atomic<bool> __in_operation;
        std::atomic<bool> __content_loaded;
        char  **_IMG[_NCH];
        int _W, _H;

        void dcmprss ( const std::string & file_name, bool is_decomp = true) {

            ...
            ...

            // decompress
            int counter = 0, iw = -1, ih = -1, _r = 0;
            auto _fill_ = [&](const char c){
                            _r = counter % _NCH ; // was 3 for RGB mindset
                            if ( _r == 0 ) {
                                iw++; // fast index
                                if ( iw%_W==0 ) { iw=0; ih++; } // slow index
                            }
                            _IMG[_r][_H-1-ih][iw] = c; 
                            counter ++ ;
            } ;
            auto EoS = std::istreambuf_iterator<char>() ;
            // char buf[4096]; // UPDATE: improved code according to @sehe
            if ( is_decomp ) {
                // decompress 
                bio::filtering_streambuf<bio::input> input;     
                input.push( bio::gzip_decompressor() ); //  
                input.push( fstrm );
                std::basic_istream<char> inflated( &input );

                auto T3 = timing(T2, "Timing : dcmprss() prepare decomp ") ;

                // assign values to _IMG (0=>R, 1=>G, 2=>B)
                // TODO // bottleneck
                std::for_each( 
                    std::istreambuf_iterator<char>(inflated), EoS, _fill_ );
                // UPDATE: improved code according to @sehe , replace the previous two lines
                // (the gcount() test keeps the final, partially filled block)
                // while (inflated.read(buf, sizeof(buf)) || inflated.gcount() > 0) 
                //     std::for_each(buf, buf + inflated.gcount(), _fill_);
                auto T4 = timing(T3, "Timing : dcmprss() decomp+assign ") ;

            } else {
                // assign values to _IMG (0=>R, 1=>G, 2=>B)
                // TODO // bottleneck
                std::for_each( 
                    std::istreambuf_iterator<char>(fstrm), EoS, _fill_ ); // different !
                // UPDATE: improved code according to @sehe , replace the previous two lines
                // (the gcount() test keeps the final, partially filled block)
                // while (fstrm.read(buf, sizeof(buf)) || fstrm.gcount() > 0) 
                //     std::for_each(buf, buf + fstrm.gcount(), _fill_);
                auto T3 = timing(T2, "Timing : dcmprss() assign ") ;
            }
            assert(counter == _NCH*_H*_W);

            ...
            ...
        };
...
...
}

The bottleneck appears to be the for_each() part, where I iterate over the stream, either inflated via std::istreambuf_iterator<char>(inflated) or fstrm via std::istreambuf_iterator<char>(fstrm), and apply the lambda function _fill_ to every byte. This lambda copies each byte from the stream to its designated place in the multi-dimensional array class member _IMG.

UPDATE: the timing was incorrect due to a memory leak. I've corrected that.

With dcmprss() as shown above, a 30 MB .gz file takes 450 ms and an uncompressed file takes 400 ms. I think that is too long, so I am asking the community for advice on how to improve it.
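To make the index arithmetic inside _fill_ easier to reason about, the mapping can be written as a pure function of the byte position. This is only a sketch: Target and target_of are hypothetical names that do not appear in the original class, and it assumes the stream is channel-interleaved, row by row, as the lambda implies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the original class: describes where
// byte number k of the interleaved stream ends up in the planar image.
struct Target {
    int channel;  // first index into _IMG (counter % _NCH in the lambda)
    int row;      // vertically flipped row (_H-1-ih in the lambda)
    int col;      // column (iw in the lambda)
};

inline Target target_of(std::size_t k, int nch, int width, int height) {
    // Every nch consecutive bytes belong to one pixel.
    const std::size_t pixel = k / static_cast<std::size_t>(nch);
    Target t;
    t.channel = static_cast<int>(k % static_cast<std::size_t>(nch));
    t.col = static_cast<int>(pixel % static_cast<std::size_t>(width));
    // The first row of the stream is stored in the last row of the image.
    t.row = height - 1 - static_cast<int>(pixel / static_cast<std::size_t>(width));
    return t;
}
```

Written this way, it is visible that consecutive bytes land in different channel planes, and consecutive stream rows land in separately allocated rows, so the writes are scattered across memory.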

Thanks for your time on my post!

Comments (1)

恰似旧人归 2025-02-19 04:54:10

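You can use blockwise I/O, repeating the read until the stream is exhausted:

char buf[4096];
// the gcount() test keeps the final, partially filled block
while (inflated.read(buf, sizeof(buf)) || inflated.gcount() > 0)
    std::for_each(buf, buf + inflated.gcount(), _fill_);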

However, I also think considerable time might be wasted in _fill_ where some dimensions are reshaped. That feels arbitrary.
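As a self-contained illustration of the blockwise read, here is a minimal sketch that works on any std::istream, which would include a std::istream wrapped around the filtering_streambuf. The function name for_each_byte_blockwise is hypothetical, and the 4096-byte block size is simply carried over from the snippet above, not a tuned value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads `in` to the end in fixed-size blocks and calls
// consume(c) once per byte. Returns the number of bytes that were read.
template <typename Consume>
std::size_t for_each_byte_blockwise(std::istream& in, Consume consume) {
    char buf[4096];
    std::size_t total = 0;
    // read() reports failure on the last, short block, but gcount() still
    // tells us how many bytes of that block are valid.
    while (in.read(buf, sizeof(buf)) || in.gcount() > 0) {
        const std::streamsize n = in.gcount();
        std::for_each(buf, buf + n, consume);
        total += static_cast<std::size_t>(n);
    }
    return total;
}
```

Called as for_each_byte_blockwise(inflated, _fill_), this would replace the std::istreambuf_iterator loop while keeping the lambda unchanged.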

Note that several libraries have features to transparently re-index multi-dimensional data, so you could potentially save time by copying the source data linearly and accessing it through such a view:
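A hand-rolled sketch of that idea, using no particular library: the stream is copied once into a flat buffer, and the (channel, row, column) lookup, including the vertical flip from the question, is done at access time. InterleavedImage and its members are hypothetical names, and the sketch assumes the image dimensions are known before loading, as _W and _H are in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: keeps the bytes exactly in stream order
// (interleaved channels, first stream row first).
class InterleavedImage {
public:
    InterleavedImage(int nch, int width, int height)
        : nch_(nch), width_(width), height_(height),
          data_(static_cast<std::size_t>(nch) * width * height) {}

    // Copies the stream straight into the buffer with a single read().
    // Returns true when exactly the expected number of bytes arrived.
    bool load(std::istream& in) {
        in.read(data_.data(), static_cast<std::streamsize>(data_.size()));
        return static_cast<std::size_t>(in.gcount()) == data_.size();
    }

    // Same addressing as _IMG[channel][row][col] in the question:
    // image row `row` corresponds to stream row height-1-row.
    char& at(int channel, int row, int col) {
        const std::size_t stream_row = static_cast<std::size_t>(height_ - 1 - row);
        return data_[(stream_row * width_ + col) * nch_ + channel];
    }

private:
    int nch_, width_, height_;
    std::vector<char> data_;
};
```

The trade-off is that each access pays for a little index arithmetic, while loading becomes a plain sequential copy with no per-byte callback.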
