A quick and dirty way to profile your code

Posted on 2024-07-04 19:27:31

What method do you use when you want to get performance data about specific code paths?

Comments (8)

烟火散人牵绊 2024-07-11 19:27:31

Since C++11, the <chrono> header provides many useful timing functions, so you can measure a specific section of code and convert the elapsed duration to an appropriate unit, e.g. seconds or milliseconds.

#include <iostream>
#include <chrono>

int main() {
    auto const t0 = std::chrono::steady_clock::now();
    std::cout << "Hello world" << std::endl;
    auto const t1 = std::chrono::steady_clock::now();

    auto const us = std::chrono::duration_cast<std::chrono::microseconds>(t1-t0);
    auto const ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1-t0);
    auto const s = std::chrono::duration_cast<std::chrono::seconds>(t1-t0);

    std::cout << "time taken in s " << s.count() << std::endl;
    std::cout << "time taken in ms " << ms.count() << std::endl;
    std::cout << "time taken in us " << us.count() << std::endl;
}

Output:

Hello world
time taken in s 0
time taken in ms 0
time taken in us 30

森林散布 2024-07-11 19:27:31

I have a quick-and-dirty profiling class that can be used even in the tightest inner loops. The emphasis is on extremely lightweight, simple code. The class allocates a fixed-size two-dimensional array. I then add "checkpoint" calls all over the place. When checkpoint N is reached immediately after checkpoint M, I add the elapsed time (in microseconds) to array item [M,N]. Since this is designed to profile tight loops, I also have a "start of iteration" call that resets the "last checkpoint" variable. At the end of the test, the dumpResults() call produces a list of all pairs of checkpoints that followed each other, together with the total time accounted for and unaccounted for.

林空鹿饮溪 2024-07-11 19:27:31

I wrote a simple cross-platform class called nanotimer for this reason. The goal was to be as lightweight as possible, so as not to interfere with actual code performance by adding too many instructions and thereby influencing the instruction cache. It is capable of microsecond accuracy across Windows, Mac and Linux (and probably some Unix variants).

Basic usage:

#include "plf_nanotimer.h"

plf::nanotimer timer;
timer.start();

// stuff

double elapsed = timer.get_elapsed_ns(); // Get nanoseconds

start() also restarts the timer when necessary. "Pausing" the timer can be achieved by storing the elapsed time, then restarting the timer when "unpausing" and adding to the stored result the next time you check elapsed time.

浅唱ヾ落雨殇 2024-07-11 19:27:31

Well, I have two code snippets. In pseudocode they look like this (a simplified version; I actually use QueryPerformanceFrequency):

First snippet:

Timer timer = new Timer
timer.Start

Second snippet:

timer.Stop
show elapsed time

A bit of hot-key kung fu, and I can tell how much time this piece of code stole from my CPU.

没有心的人 2024-07-11 19:27:31

The article Code profiler and optimizations has lots of information about C++ code profiling and also has a free download link to a program/class that will show you a graphic presentation for different code paths/methods.

秋叶绚丽 2024-07-11 19:27:31

Note, the following is all written specifically for Windows.

I also have a timer class that I wrote for quick-and-dirty profiling. It uses QueryPerformanceCounter() to get high-precision timings, but with a slight difference: my timer class doesn't dump the elapsed time when the Timer object falls out of scope. Instead, it accumulates the elapsed times into a collection. I added a static member function, Dump(), which creates a table of elapsed times, sorted by timing category (specified in Timer's constructor as a string), along with some statistical analysis such as mean elapsed time, standard deviation, max and min. I also added a static member function that clears the collection and lets you start over (it is named Reset() in the listing below).

How to use the Timer class (pseudocode):

int CInsertBuffer::Read(char* pBuf)
{
       // TIMER NOTES: Avg Execution Time = ~1 ms
       Timer timer("BufferRead");
       :      :
       return -1;
}

Sample output:

Timer Precision = 418.0095 ps

=== Item               Trials    Ttl Time  Avg Time  Mean Time StdDev    ===
    AddTrade           500       7 ms      14 us     12 us     24 us
    BufferRead         511       1:19.25   0.16 s    621 ns    2.48 s
    BufferWrite        516       511 us    991 ns    482 ns    11 us
    ImportPos Loop     1002      18.62 s   19 ms     77 us     0.51 s
    ImportPosition     2         18.75 s   9.38 s    16.17 s   13.59 s
    Insert             515       4.26 s    8 ms      5 ms      27 ms
    recv               101       18.54 s   0.18 s    2603 ns   1.63 s

File Timer.inl:

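#include <windows.h>   // QueryPerformanceCounter, LARGE_INTEGER, clipboard API
#include <algorithm>   // std::transform, std::sort, std::count_if
#include <functional>  // std::unary_function, std::binary_function (removed in C++17)
#include <iterator>
#include <map>
#include <numeric>
#include <set>
#include <string>
#include <vector>
#include <math.h>
#include "x:\utils\stlext\stringext.h"
#include "x:\utils\stlext\algorithmext.h"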

    class Timer
    {
    public:
        Timer(const char* name)
        {
            label = std::safe_string(name);
            QueryPerformanceCounter(&startTime);
        }

        virtual ~Timer()
        {
            QueryPerformanceCounter(&stopTime);
            __int64 clocks = stopTime.QuadPart-startTime.QuadPart;
            double elapsed = (double)clocks/(double)TimerFreq();
            TimeMap().insert(std::make_pair(label,elapsed));
        };

        static std::string Dump(bool ClipboardAlso=true)
        {
            static const std::string loc = "Timer::Dump";

            if( TimeMap().empty() )
            {
                return "No trials\r\n";
            }

            std::string ret = std::formatstr("\r\n\r\nTimer Precision = %s\r\n\r\n", format_elapsed(1.0/(double)TimerFreq()).c_str());

            // get a list of keys
            typedef std::set<std::string> keyset;
            keyset keys;
            std::transform(TimeMap().begin(), TimeMap().end(), std::inserter(keys, keys.begin()), extract_key());

            size_t maxrows = 0;

            typedef std::vector<std::string> strings;
            strings lines;

            // int rather than size_t: printf-style '*' width/precision arguments must be int
            static const int tabWidth = 9;

            std::string head = std::formatstr("=== %-*.*s %-*.*s %-*.*s %-*.*s %-*.*s %-*.*s ===", tabWidth*2, tabWidth*2, "Item", tabWidth, tabWidth, "Trials", tabWidth, tabWidth, "Ttl Time", tabWidth, tabWidth, "Avg Time", tabWidth, tabWidth, "Mean Time", tabWidth, tabWidth, "StdDev");
            ret += std::formatstr("\r\n%s\r\n", head.c_str());
            if( ClipboardAlso ) 
                lines.push_back("Item\tTrials\tTtl Time\tAvg Time\tMean Time\tStdDev\r\n");
            // dump the values for each key
            {for( keyset::iterator key = keys.begin(); keys.end() != key; ++key )
            {
                time_type ttl = 0;
                ttl = std::accumulate(TimeMap().begin(), TimeMap().end(), ttl, accum_key(*key));
                size_t num = std::count_if( TimeMap().begin(), TimeMap().end(), match_key(*key));
                if( num > maxrows ) 
                    maxrows = num;
                time_type avg = ttl / num;

                // compute the median (this is what the "Mean Time" column reports)
                std::vector<time_type> sortedTimes;
                std::transform_if(TimeMap().begin(), TimeMap().end(), std::inserter(sortedTimes, sortedTimes.begin()), extract_val(), match_key(*key));
                std::sort(sortedTimes.begin(), sortedTimes.end());
                size_t mid = (size_t)floor((double)num/2.0);
                // even count: average the two middle values; odd count: take the middle value
                double mean = ( (num % 2) == 0 ) ? (sortedTimes[mid-1]+sortedTimes[mid])/2.0 : sortedTimes[mid];
                // compute variance: sum of squared deviations from the arithmetic mean (avg)
                double sum = 0.0;
                if( num > 1 )
                {
                    for( std::vector<time_type>::iterator timeIt = sortedTimes.begin(); sortedTimes.end() != timeIt; ++timeIt )
                        sum += pow(*timeIt-avg,2.0);
                }
                // compute std dev
                double stddev = num > 1 ? sqrt(sum/((double)num-1.0)) : 0.0;

                // num is a size_t, so cast it to match the %u conversion
                ret += std::formatstr("    %-*.*s %-*.*s %-*.*s %-*.*s %-*.*s %-*.*s\r\n", tabWidth*2, tabWidth*2, key->c_str(), tabWidth, tabWidth, std::formatstr("%u",(unsigned)num).c_str(), tabWidth, tabWidth, format_elapsed(ttl).c_str(), tabWidth, tabWidth, format_elapsed(avg).c_str(), tabWidth, tabWidth, format_elapsed(mean).c_str(), tabWidth, tabWidth, format_elapsed(stddev).c_str());
                if( ClipboardAlso )
                    lines.push_back(std::formatstr("%s\t%s\t%s\t%s\t%s\t%s\r\n", key->c_str(), std::formatstr("%u",(unsigned)num).c_str(), format_elapsed(ttl).c_str(), format_elapsed(avg).c_str(), format_elapsed(mean).c_str(), format_elapsed(stddev).c_str()));

            }
            }
            ret += std::formatstr("%s\r\n", std::string(head.length(),'=').c_str());

            if( ClipboardAlso )
            {
                // dump header row of data block
                lines.push_back("");
                {
                    std::string s;
                    for( keyset::iterator key = keys.begin(); key != keys.end(); ++key )
                    {
                        if( key != keys.begin() )
                            s.append("\t");
                        s.append(*key);
                    }
                    s.append("\r\n");
                    lines.push_back(s);
                }

                // blow out the flat map of time values to a separate vector of times for each key
                typedef std::map<std::string, std::vector<time_type> > nodematrix;
                nodematrix nodes;
                for( Times::iterator time = TimeMap().begin(); time != TimeMap().end(); ++time )
                    nodes[time->first].push_back(time->second);

                // dump each data point
                for( size_t row = 0; row < maxrows; ++row )
                {
                    std::string rowDump;
                    for( keyset::iterator key = keys.begin(); key != keys.end(); ++key )
                    {
                        if( key != keys.begin() )
                            rowDump.append("\t");
                        if( nodes[*key].size() > row )
                            rowDump.append(std::formatstr("%f", nodes[*key][row]));
                    }
                    rowDump.append("\r\n");
                    lines.push_back(rowDump);
                }

                // dump to the clipboard
                std::string dump;
                for( strings::iterator s = lines.begin(); s != lines.end(); ++s )
                {
                    dump.append(*s);
                }

                OpenClipboard(0);
                EmptyClipboard();
                HGLOBAL hg = GlobalAlloc(GMEM_MOVEABLE, dump.length()+1);
                if( hg != 0 )
                {
                    char* buf = (char*)GlobalLock(hg);
                    if( buf != 0 )
                    {
                        std::copy(dump.begin(), dump.end(), buf);
                        buf[dump.length()] = 0;
                        GlobalUnlock(hg);
                        SetClipboardData(CF_TEXT, hg);
                    }
                }
                CloseClipboard();
            }

            return ret;
        }

        static void Reset()
        {
            TimeMap().clear();
        }

        static std::string format_elapsed(double d) 
        {
            if( d < 0.00000001 )
            {
                // show in ps with 4 digits
                return std::formatstr("%0.4f ps", d * 1000000000000.0);
            }
            if( d < 0.00001 )
            {
                // show in ns
                return std::formatstr("%0.0f ns", d * 1000000000.0);
            }
            if( d < 0.001 )
            {
                // show in us
                return std::formatstr("%0.0f us", d * 1000000.0);
            }
            if( d < 0.1 )
            {
                // show in ms
                return std::formatstr("%0.0f ms", d * 1000.0);
            }
            if( d <= 60.0 )
            {
                // show in seconds
                return std::formatstr("%0.2f s", d);
            }
            if( d < 3600.0 )
            {
                // show in min:sec
                // %05.2f zero-pads the seconds field, e.g. 1:09.25 rather than 1:9.25
                return std::formatstr("%01.0f:%05.2f", floor(d/60.0), fmod(d,60.0));
            }
            // show in h:min:sec
            return std::formatstr("%01.0f:%02.0f:%05.2f", floor(d/3600.0), floor(fmod(d,3600.0)/60.0), fmod(d,60.0));
        }

    private:
        static __int64 TimerFreq()
        {
            static __int64 freq = 0;
            static bool init = false;
            if( !init )
            {
                LARGE_INTEGER li;
                QueryPerformanceFrequency(&li);
                freq = li.QuadPart;
                init = true;
            }
            return freq;
        }
        LARGE_INTEGER startTime, stopTime;
        std::string label;

        typedef std::string key_type;
        typedef double time_type;
        typedef std::multimap<key_type, time_type> Times;
//      static Times times;
        static Times& TimeMap()
        {
            static Times times_;
            return times_;
        }

        struct extract_key : public std::unary_function<Times::value_type, key_type>
        {
            std::string operator()(Times::value_type const & r) const
            {
                return r.first;
            }
        };

        struct extract_val : public std::unary_function<Times::value_type, time_type>
        {
            time_type operator()(Times::value_type const & r) const
            {
                return r.second;
            }
        };
        struct match_key : public std::unary_function<Times::value_type, bool>
        {
            match_key(key_type const & key_) : key(key_) {};
            bool operator()(Times::value_type const & rhs) const
            {
                return key == rhs.first;
            }
        private:
            match_key& operator=(match_key&) { return * this; }
            const key_type key;
        };

        struct accum_key : public std::binary_function<time_type, Times::value_type, time_type>
        {
            accum_key(key_type const & key_) : key(key_), n(0) {};
            time_type operator()(time_type const & v, Times::value_type const & rhs) const
            {
                if( key == rhs.first )
                {
                    ++n;
                    return rhs.second + v;
                }
                return v;
            }
        private:
            accum_key& operator=(accum_key&) { return * this; }
            const Times::key_type key;
            mutable size_t n;
        };
    };
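
For readers who do not have the stringext.h helper, the unit-selection idea behind format_elapsed() can be sketched with std::snprintf alone. This is only a minimal sketch, not the class's own code: the name format_elapsed_portable is hypothetical, and the cut-offs are an assumption (one unit switch per factor of 1000), so they differ from the thresholds used above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a duration given in seconds, picking the
// largest unit that keeps the printed value at or above 1.
inline std::string format_elapsed_portable(double seconds)
{
    assert(seconds >= 0.0);  // elapsed times are never negative

    char buf[64];
    if (seconds < 1e-6)
        std::snprintf(buf, sizeof buf, "%.0f ns", seconds * 1e9);
    else if (seconds < 1e-3)
        std::snprintf(buf, sizeof buf, "%.0f us", seconds * 1e6);
    else if (seconds < 1.0)
        std::snprintf(buf, sizeof buf, "%.0f ms", seconds * 1e3);
    else if (seconds < 60.0)
        std::snprintf(buf, sizeof buf, "%.2f s", seconds);
    else if (seconds < 3600.0)
        // min:sec, with the seconds field zero-padded to two integer digits
        std::snprintf(buf, sizeof buf, "%.0f:%05.2f",
                      std::floor(seconds / 60.0), std::fmod(seconds, 60.0));
    else
        // h:min:sec
        std::snprintf(buf, sizeof buf, "%.0f:%02.0f:%05.2f",
                      std::floor(seconds / 3600.0),
                      std::floor(std::fmod(seconds, 3600.0) / 60.0),
                      std::fmod(seconds, 60.0));
    return buf;
}
```

A fixed 64-byte buffer is enough here because every branch prints a short, bounded string; snprintf truncates rather than overflows if that assumption is ever wrong.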

File stringext.h (provides the formatstr() function):

namespace std
{
    /*  ---

    Formatted Print

        template<class C>
        int strprintf(basic_string<C>* pString, const C* pFmt, ...);

        template<class C>
        int vstrprintf(basic_string<C>* pString, const C* pFmt, va_list args);

    Returns :

        # characters printed to output


    Effects :

        Writes formatted data to a string.  strprintf() works exactly the same as sprintf(); see your
        documentation for sprintf() for details of operation.  vstrprintf() also works the same as sprintf(), 
        but instead of accepting a variable parameter list it accepts a va_list argument.

    Requires :

        pString is a pointer to a basic_string<>

    --- */

    template<class char_type> int vprintf_generic(char_type* buffer, size_t bufferSize, const char_type* format, va_list argptr);

    template<> inline int vprintf_generic<char>(char* buffer, size_t bufferSize, const char* format, va_list argptr)
    {
#       ifdef SECURE_VSPRINTF
        return _vsnprintf_s(buffer, bufferSize-1, _TRUNCATE, format, argptr);
#       else
        return _vsnprintf(buffer, bufferSize-1, format, argptr);
#       endif
    }

    template<> inline int vprintf_generic<wchar_t>(wchar_t* buffer, size_t bufferSize, const wchar_t* format, va_list argptr)
    {
#       ifdef SECURE_VSPRINTF
        return _vsnwprintf_s(buffer, bufferSize-1, _TRUNCATE, format, argptr);
#       else
        return _vsnwprintf(buffer, bufferSize-1, format, argptr);
#       endif
    }

    template<class Type, class Traits>
    inline int vstringprintf(basic_string<Type,Traits> & outStr, const Type* format, va_list args)
    {
        // prologue
        static const size_t ChunkSize = 1024;
        size_t curBufSize = 0;
        outStr.erase(); 

        if( !format )
        {
            return 0;
        }

        // keep trying to write the string to an ever-increasing buffer until
        // either we get the string written or we run out of memory
        while( bool cont = true )
        {
            // allocate a local buffer
            curBufSize += ChunkSize;
            std::ref_ptr<Type> localBuffer = new Type[curBufSize];
            if( localBuffer.get() == 0 )
            {
                // we ran out of memory -- nice goin'!
                return -1;
            }
            // format output to local buffer; the size argument is a character count, not a byte count
            int i = vprintf_generic(localBuffer.get(), curBufSize, format, args);
            if( -1 == i )
            {
                // the buffer wasn't big enough -- try again
                continue;
            }
            else if( i < 0 )
            {
                // something weird happened -- bail
                return i;
            }
            // if we get to this point the string was written completely -- stop looping
            outStr.assign(localBuffer.get(),i);
            return i;
        }
        // unreachable code
        return -1;
    }

    // provided for backward-compatibility
    template<class Type, class Traits>
    inline int vstrprintf(basic_string<Type,Traits> * outStr, const Type* format, va_list args)
    {
        return vstringprintf(*outStr, format, args);
    }

    template<class Char, class Traits>
    inline int stringprintf(std::basic_string<Char, Traits> & outString, const Char* format, ...)
    {
        va_list args;
        va_start(args, format);
        int retval = vstringprintf(outString, format, args);
        va_end(args);
        return retval;
    }

    // old function provided for backward-compatibility
    template<class Char, class Traits>
    inline int strprintf(std::basic_string<Char, Traits> * outString, const Char* format, ...)
    {
        va_list args;
        va_start(args, format);
        int retval = vstringprintf(*outString, format, args);
        va_end(args);
        return retval;
    }

    /*  ---

    Inline Formatted Print

        string formatstr(const char* Format, ...);

    Returns :

        Formatted string


    Effects :

        Writes formatted data to a string.  formatstr() works the same as sprintf(); see your
        documentation for sprintf() for details of operation.  

    --- */

    template<class Char>
    inline std::basic_string<Char> formatstr(const Char * format, ...)
    {
        // use the same character type as the return value so the wchar_t version compiles
        std::basic_string<Char> outString;

        va_list args;
        va_start(args, format);
        vstringprintf(outString, format, args);
        va_end(args);
        return outString;
    }
};
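
The growing-buffer loop above depends on the MSVC-specific `_vsnprintf` returning -1 on truncation, and it reuses the same `va_list` on every retry. With a C++11 standard library, `std::vsnprintf` reports the length it needs, so two passes are enough. The following is a sketch under those assumptions, for `char` only; `format_string` is a hypothetical name, not part of stringext.h.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical char-only counterpart of formatstr().
inline std::string format_string(const char* format, ...)
{
    assert(format != nullptr);

    va_list args;
    va_start(args, format);

    // A va_list may only be consumed once, so keep a copy for the second pass.
    va_list args_copy;
    va_copy(args_copy, args);

    // First pass: a null buffer of size 0 makes vsnprintf return the length needed.
    const int needed = std::vsnprintf(nullptr, 0, format, args);
    va_end(args);

    std::string out;
    if (needed > 0) {
        // +1 for the terminating null that vsnprintf always writes.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), format, args_copy);
        out.assign(buffer.data(), static_cast<std::size_t>(needed));
    }
    va_end(args_copy);
    return out;
}
```

A call such as `format_string("%0.2f s", 1.5)` then plays the role that `std::formatstr` plays in format_elapsed(), without adding anything to namespace std.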

File algorithmext.h (provides the transform_if() function):

/*  ---

Transform
25.2.3

    template<class InputIterator, class OutputIterator, class UnaryOperation, class Predicate>
        OutputIterator transform_if(InputIterator first, InputIterator last, OutputIterator result, UnaryOperation op, Predicate pred)

    template<class InputIterator1, class InputIterator2, class OutputIterator, class BinaryOperation, class Predicate>
        OutputIterator transform_if(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, OutputIterator result, BinaryOperation binary_op, Predicate pred)

Requires:   

    op, binary_op and pred have no side effects

Effects :

    For every iterator i in the range [first1, last1) for which pred(*i) is true, assigns through result, in order, a new value equal to one of:
        1:  op(*i)
        2:  binary_op(*i, *(first2 + (i - first1)))

Returns :

    An iterator one past the last element written, that is, result + (number of elements for which pred was true)

Complexity :

    Exactly last1 - first1 applications of pred, and at most last1 - first1 applications of op or binary_op

--- */

template<class InputIterator, class OutputIterator, class UnaryFunction, class Predicate>
OutputIterator transform_if(InputIterator first, 
                            InputIterator last, 
                            OutputIterator result, 
                            UnaryFunction f, 
                            Predicate pred)
{
    for (; first != last; ++first)
    {
        if( pred(*first) )
            *result++ = f(*first);
    }
    return result; 
}

template<class InputIterator1, class InputIterator2, class OutputIterator, class BinaryOperation, class Predicate>
OutputIterator transform_if(InputIterator1 first1, 
                            InputIterator1 last1, 
                            InputIterator2 first2, 
                            OutputIterator result, 
                            BinaryOperation binary_op, 
                            Predicate pred)
{
    for (; first1 != last1 ; ++first1, ++first2)
    {
        if( pred(*first1) )
            *result++ = binary_op(*first1,*first2);
    }
    return result;
}
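
As a usage sketch, the unary transform_if() overload can pull every value recorded under one label out of a multimap shaped like the Times typedef above. The helper name values_for is made up for illustration, and the two lambdas stand in for the extract_val and match_key functors (lambdas need C++11).

```cpp
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Unary overload, repeated from algorithmext.h so the example compiles on its own.
template<class InputIterator, class OutputIterator, class UnaryFunction, class Predicate>
OutputIterator transform_if(InputIterator first, InputIterator last,
                            OutputIterator result, UnaryFunction f, Predicate pred)
{
    for (; first != last; ++first)
    {
        if (pred(*first))
            *result++ = f(*first);
    }
    return result;
}

typedef std::multimap<std::string, double> Times;

// Hypothetical helper: collect every timing stored under `key`.
inline std::vector<double> values_for(const Times& times, const std::string& key)
{
    std::vector<double> out;
    transform_if(times.begin(), times.end(), std::back_inserter(out),
                 // plays the role of extract_val
                 [](const Times::value_type& r) { return r.second; },
                 // plays the role of match_key
                 [&key](const Times::value_type& r) { return r.first == key; });
    return out;
}
```

Because the output goes through std::back_inserter, the result vector only grows by the number of matching entries.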
小巷里的女流氓 2024-07-11 19:27:31

I do my profiles by creating two classes: cProfile and cProfileManager.

cProfileManager will hold all the data that resulted from cProfile.

cProfile has the following requirements:

  • cProfile has a constructor which records the current time.
  • cProfile has a destructor which sends the total time the object was alive to cProfileManager.

To use these profile classes, I first make an instance of cProfileManager. Then, I put the code block, which I want to profile, inside curly braces. Inside the curly braces, I create a cProfile instance. When the code block ends, cProfile will send the time it took for the block of code to finish to cProfileManager.

Example Code
Here's an example of the code (simplified):

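// Simplified: GetTime() and ProfileManager are assumed to be defined elsewhere.
class cProfile
{
public:
    // Record the start time when the object is created.
    cProfile()
    {
        TimeStart = GetTime();
    }

    // Report the object's lifetime when it goes out of scope.
    ~cProfile()
    {
        ProfileManager->AddProfile(GetTime() - TimeStart);
    }

private:
    float TimeStart;
};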

To use cProfile, I would do something like this:

int main()
{
    printf("Start test");
    {
        cProfile Profile;
        Calculate();
    }
    ProfileManager->OutputData();
}

or this:

void foobar()
{
    cProfile ProfileFoobar;

    foo();
    {
        cProfile ProfileBarCheck;
        while (bar())
        {
            cProfile ProfileSpam;
            spam();
        }
    }
}

Technical Note

This code deliberately exploits the way scoping, constructors and destructors work in C++ (the idiom usually called RAII). A cProfile object exists only inside the block scope (the code block we want to test). Once the program leaves the block scope, the destructor runs and cProfile records the result.
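
The scope-based idea can be sketched in portable C++11 with std::chrono. This is an illustration under assumptions rather than the classes described above: ProfileLog and ScopedProfile are hypothetical stand-ins for cProfileManager and cProfile, and the log is passed in by reference instead of being reached through a global pointer.

```cpp
#include <chrono>
#include <vector>

// Hypothetical stand-in for cProfileManager: stores each duration in seconds.
class ProfileLog
{
public:
    void add(double seconds) { samples_.push_back(seconds); }
    const std::vector<double>& samples() const { return samples_; }

private:
    std::vector<double> samples_;
};

// Hypothetical stand-in for cProfile: times its own lifetime.
class ScopedProfile
{
public:
    explicit ScopedProfile(ProfileLog& log)
        : log_(log), start_(std::chrono::steady_clock::now()) {}

    // Runs when the enclosing block is left, however it is left.
    ~ScopedProfile()
    {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        log_.add(elapsed.count());
    }

    // Copying would report the same scope twice, so forbid it.
    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;

private:
    ProfileLog& log_;
    std::chrono::steady_clock::time_point start_;
};
```

Usage mirrors the examples above: open a block, declare `ScopedProfile p(log);` as its first statement, and the sample is appended to the log when the block ends. steady_clock is used because it never jumps backwards, unlike the wall clock.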

Additional Enhancements

  • You can add a string parameter to the constructor so you can do something like this:
    cProfile Profile("Profile for complicated calculation");

  • You can use a macro to make the code look cleaner (be careful not to overuse this; unlike our other abuses of the language, macros can be dangerous).

    Example:

    #define START_PROFILE { cProfile Profile;
    #define END_PROFILE }

  • cProfileManager can check how many times a block of code is called, but you would need an identifier for the block of code. The first enhancement can help identify the block. This can be useful in cases where the code you want to profile is inside a loop (like the second example above). You can also add the average, fastest and longest execution time the code block took.

  • Don't forget to add a check to skip profiling if you are in debug mode.
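
The enhancements listed above (a label per block, call counts, average, fastest and slowest times, and a switch to skip profiling) might be combined roughly as follows. Every name here is hypothetical, including the ENABLE_PROFILING switch; it is a sketch of one possible design, not the cProfileManager described in this answer.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-label statistics; all times are in seconds.
struct ProfileStats
{
    std::size_t calls = 0;
    double total = 0.0;
    double fastest = 0.0;
    double slowest = 0.0;

    double average() const { return calls ? total / calls : 0.0; }
};

// Hypothetical manager that groups samples by the label given to each profile.
class LabeledProfileManager
{
public:
    void add(const std::string& label, double seconds)
    {
        ProfileStats& s = stats_[label];
        s.fastest = s.calls ? std::min(s.fastest, seconds) : seconds;
        s.slowest = s.calls ? std::max(s.slowest, seconds) : seconds;
        s.total += seconds;
        ++s.calls;
    }

    // Throws std::out_of_range if nothing was recorded under `label`.
    const ProfileStats& stats(const std::string& label) const { return stats_.at(label); }

private:
    std::map<std::string, ProfileStats> stats_;
};

// Scope timer that reports its lifetime under a label.
class LabeledProfile
{
public:
    LabeledProfile(LabeledProfileManager& manager, const std::string& label)
        : manager_(manager), label_(label), start_(std::chrono::steady_clock::now()) {}

    ~LabeledProfile()
    {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        manager_.add(label_, elapsed.count());
    }

    LabeledProfile(const LabeledProfile&) = delete;
    LabeledProfile& operator=(const LabeledProfile&) = delete;

private:
    LabeledProfileManager& manager_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Compiles to nothing unless ENABLE_PROFILING (a made-up switch) is defined.
// The fixed variable name allows only one PROFILE_SCOPE per block.
#ifdef ENABLE_PROFILING
#define PROFILE_SCOPE(manager, label) LabeledProfile profile_scope_guard_((manager), (label))
#else
#define PROFILE_SCOPE(manager, label) ((void)0)
#endif
```

Keying the statistics by label is what makes the loop case workable: every iteration that constructs a profile with the same label adds to the same entry, so the call count and average come for free.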

守望孤独 2024-07-11 19:27:31

This method has several limitations, but I still find it very useful. I'll list the limitations I know of up front and let whoever wants to use it do so at their own risk.

  1. The original version I posted over-reported time spent in recursive calls (as pointed out in the comments to the answer).
  2. It's not thread safe. It wasn't thread safe before I added the code to ignore recursion, and it's even less thread safe now.
  3. Although it's very efficient even when called many times (millions), it has a measurable effect on the outcome, so scopes you measure will take longer than those you don't.

I use this class when the problem at hand doesn't justify profiling all my code, or when I get some data from a profiler that I want to verify. Basically, it sums up the time spent in a specific block and, at the end of the program, writes it to the debug stream (viewable with DbgView), including how many times the code was executed (and, of course, the average time spent).

#pragma once
#include <tchar.h>
#include <windows.h>
#include <sstream>
#include <boost/noncopyable.hpp>

namespace scope_timer {
    class time_collector : boost::noncopyable {
        __int64 total;
        LARGE_INTEGER start;
        size_t times;
        const TCHAR* name;

        double cpu_frequency()
        { // cache the performance-counter frequency, which is fixed at system boot.
            static double ret = 0; // store as double so division later on is floating point and not truncating
            if (ret == 0) {
                LARGE_INTEGER freq;
                QueryPerformanceFrequency(&freq);
                ret = static_cast<double>(freq.QuadPart);
            }
            return ret;
        }
        bool in_use;

    public:
        time_collector(const TCHAR* n)
            : total(0)
            , start(LARGE_INTEGER())
            , times(0)
            , name(n)
            , in_use(false)
        {
        }

        ~time_collector()
        {
            std::basic_ostringstream<TCHAR> msg;
            msg << _T("scope_timer> ") <<  name << _T(" called: ");

            double seconds = total / cpu_frequency();
            double average = times ? seconds / times : 0.0; // avoid dividing by zero if nothing was recorded

            msg << times << _T(" times total time: ") << seconds << _T(" seconds  ")
                << _T(" (avg ") << average <<_T(")\n");
            OutputDebugString(msg.str().c_str());
        }

        void add_time(__int64 ticks)
        {
            total += ticks;
            ++times;
            in_use = false;
        }

        bool aquire()
        {
            if (in_use)
                return false;
            in_use = true;
            return true;
        }
    };

    class one_time : boost::noncopyable {
        LARGE_INTEGER start;
        time_collector* collector;
    public:
        one_time(time_collector& tc)
        {
            if (tc.aquire()) {
                collector = &tc;
                QueryPerformanceCounter(&start);
            }
            else
                collector = 0;
        }

        ~one_time()
        {
            if (collector) {
                LARGE_INTEGER end;
                QueryPerformanceCounter(&end);
                collector->add_time(end.QuadPart - start.QuadPart);
            }
        }
    };
}

// Usage: TIME_THIS_SCOPE(XX); where XX is pasted onto an identifier prefix, so it may contain only letters, digits and underscores (it can begin with a digit)
#define TIME_THIS_SCOPE(name) \
    static scope_timer::time_collector st_time_collector_##name(_T(#name)); \
    scope_timer::one_time st_one_time_##name(st_time_collector_##name)
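
The recursion and thread-safety limitations listed in this answer could be approached as in the sketch below, which swaps QueryPerformanceCounter for std::chrono::steady_clock and the in_use flag for a per-thread depth counter. It is an assumption-laden illustration, not the author's class: scope_collector, scope_guard and TIME_THIS_SCOPE_PORTABLE are hypothetical names, it needs C++11 (atomics, thread_local), and it leaves out the report to the debug stream.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Accumulates time and call count; atomics make add() safe to call from several threads.
class scope_collector
{
public:
    explicit scope_collector(const char* name) : name_(name), total_ns_(0), calls_(0) {}

    void add(std::int64_t ns)
    {
        total_ns_.fetch_add(ns, std::memory_order_relaxed);
        calls_.fetch_add(1, std::memory_order_relaxed);
    }

    const char* name() const { return name_; }
    std::int64_t total_ns() const { return total_ns_.load(std::memory_order_relaxed); }
    std::int64_t calls() const { return calls_.load(std::memory_order_relaxed); }

private:
    const char* name_;
    std::atomic<std::int64_t> total_ns_;
    std::atomic<std::int64_t> calls_;
};

// Times one pass through a scope. Only the outermost guard on a thread records,
// so recursive calls are not counted twice.
class scope_guard
{
public:
    scope_guard(scope_collector& collector, int& depth)
        : collector_(collector), depth_(depth), outermost_(depth == 0),
          start_(std::chrono::steady_clock::now())
    {
        ++depth_;
    }

    ~scope_guard()
    {
        --depth_;
        if (outermost_) {
            const auto elapsed = std::chrono::steady_clock::now() - start_;
            collector_.add(
                std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
        }
    }

    scope_guard(const scope_guard&) = delete;
    scope_guard& operator=(const scope_guard&) = delete;

private:
    scope_collector& collector_;
    int& depth_;
    bool outermost_;
    std::chrono::steady_clock::time_point start_;
};

// One collector per call site, one depth counter per call site and thread.
#define TIME_THIS_SCOPE_PORTABLE(name) \
    static scope_collector st_collector_##name(#name); \
    static thread_local int st_depth_##name = 0; \
    scope_guard st_guard_##name(st_collector_##name, st_depth_##name)
```

The macro is used the same way as TIME_THIS_SCOPE: place `TIME_THIS_SCOPE_PORTABLE(parse);` as the first statement of the scope to measure. The relaxed atomics keep the totals consistent but do not remove the measurement overhead noted in limitation 3.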