What do you think is making this C++ code slow? (It loops through an ADODB recordset, converts COM types to strings, and fills an ostringstream)

Posted on 2024-07-08 08:45:33

This loop is slower than I would expect, and I'm not sure where yet. See anything?

I'm reading an Access DB, using client-side cursors. With 127,000 rows of 20 columns, this loop takes about 10 seconds. The 20 columns are string, int, and date types. All the types get converted to ANSI strings before they are put into the ostringstream buffer.

void LoadRecordsetIntoStream(_RecordsetPtr& pRs, ostringstream& ostrm)
{
    ADODB::FieldsPtr pFields = pRs->Fields;
    char buf[80];
    ::SYSTEMTIME sysTime;
    _variant_t var;

    while(!pRs->EndOfFile) // loop through rows
    {
        for (long i = 0L; i < nColumns; i++)  // loop through columns
        {

            var = pFields->GetItem(i)->GetValue();

            if (V_VT(&var) == VT_BSTR)
            {
                ostrm << (const char*) (_bstr_t) var;   
            }
            else if (V_VT(&var) == VT_I4
                  || V_VT(&var) == VT_UI1
                  || V_VT(&var) == VT_I2
                  || V_VT(&var) == VT_BOOL)
            {
                ostrm << itoa(((int)var),buf,10);
            }
            else if (V_VT(&var) == VT_DATE)
            {
                ::VariantTimeToSystemTime(var,&sysTime);
                _stprintf(buf, _T("%4d-%02d-%02d %02d:%02d:%02d"),
                          sysTime.wYear, sysTime.wMonth, sysTime.wDay,
                          sysTime.wHour, sysTime.wMinute, sysTime.wSecond);

                ostrm << buf;
            }
        }

        pRs->MoveNext();
    }
}

EDIT: After more experimentation...

I know now that about half the time is used by this line:
var = pFields->GetItem(i)->GetValue();

If I bypass the Microsoft generated COM wrappers, will my code be faster? My guess is no.

The other half of the time is spent in the statements that convert the data and stream it into the ostringstream.

As I write this, I don't yet know whether the conversions or the streaming take more of that time.

Would it be faster if I didn't use ostringstream and instead managed my own buffer, with my own logic to grow the buffer (re-alloc, copy, delete)? Would it be faster if my logic made a pessimistic guesstimate and reserved a lot of space for the ostringstream buffer up front? These might be experiments worth trying.
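A minimal sketch of the "reserve a lot of space up front" experiment, not a measured result. As far as I know std::ostringstream offers no reserve call (at least before C++20), so this sketch swaps in a plain std::string as the buffer. JoinCells, rows, and bytesPerCellGuess are hypothetical names; rows stands in for values that have already been converted to text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends every cell to one pre-reserved std::string.
// `bytesPerCellGuess` is the pessimistic per-cell size estimate.
std::string JoinCells(const std::vector<std::vector<std::string>>& rows,
                      std::size_t bytesPerCellGuess)
{
    assert(bytesPerCellGuess > 0);

    std::size_t cellCount = 0;
    for (const auto& row : rows)
        cellCount += row.size();

    std::string out;
    out.reserve(cellCount * bytesPerCellGuess); // one allocation if the guess holds

    for (const auto& row : rows)
        for (const auto& cell : row)
            out += cell; // reallocates only if the guess was too small
    return out;
}
```

If the guess is too small the string still grows on its own, so a wrong estimate costs time rather than correctness. Whether this beats ostringstream for this workload is something to measure, not assume.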

Finally, the conversions themselves. None of the three stand out in my timings as being bad. One answer says that my itoa might be slower than an alternative. Worth checking out.
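One itoa alternative worth timing, assuming a C++17 compiler, is std::to_chars, which is standard and does no locale work. This is only a sketch; AppendInt is a hypothetical helper name, and it appends to a std::string instead of the ostringstream used above.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Appends the decimal text of `value` to `out` without itoa or a stream.
void AppendInt(std::string& out, long value)
{
    char buf[24]; // room for any 64-bit signed value including the sign
    const std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    assert(r.ec == std::errc()); // cannot overflow a buffer this large
    out.append(buf, r.ptr);      // r.ptr points one past the last digit written
}
```

For example, calling AppendInt(s, -1234) on an empty string s leaves s holding "-1234".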

Comments (8)

白况 2024-07-15 08:45:33

I can't tell from looking at your code; someone more familiar with COM/ATL may have a better answer.

By trial and error, I would find the slow code by commenting out inner-loop operations until you see a jump in performance. Then you have your culprit and should focus on that.

沩ん囻菔务 2024-07-15 08:45:33

I assume V_VT is a function - if so, then for each date value, V_VT(&var) is called 6 times. A simple optimisation is to store the value of V_VT(&var) locally, saving up to 5 calls to this function each time around the loop.

If you haven't already done so, re-order the if tests for the types to put the most common column types first - this reduces the number of tests required.
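A small sketch of reading the type tag once and dispatching on it. CellType, Cell, and AppendCell are hypothetical stand-ins, not the real VARIANT API; in the actual loop the tag would be V_VT(&var) and the cases VT_BSTR, VT_I4, and so on. As far as I know V_VT is a macro that just reads the vt field, so the saving may turn out to be small.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the VARIANT type tags.
enum class CellType { Text, Int, Other };

// Hypothetical cell: a type tag plus the payload for that tag.
struct Cell {
    CellType type;
    std::string text;
    long number;
};

// Reads the tag once and dispatches with a switch instead of an if/else chain.
// Returns false for tags it does not handle.
bool AppendCell(std::string& out, const Cell& cell)
{
    const CellType type = cell.type; // single read, reused by the switch
    switch (type) {
    case CellType::Text:
        out += cell.text;
        return true;
    case CellType::Int:
        out += std::to_string(cell.number);
        return true;
    default:
        return false; // unhandled types are skipped, as in the original loop
    }
}
```

With a switch the compiler is free to pick its own dispatch strategy, so the order of the cases matters less than the order of tests in an if/else chain.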

柳若烟 2024-07-15 08:45:33

Try commenting out the code in the for loop and comparing the time. Once you have a reading, start uncommenting various sections until you hit the bottleneck.

和我恋爱吧 2024-07-15 08:45:33

A good part of it is that Access is not a server database: all the file reads/writes, locking, cursor handling, etc. take place within the client application (across a network, right?), and that has to be so if other users have the database open at the same time.

If they don't, you would probably be able to drop the cursor settings and open the database read-only.

羁绊已千年 2024-07-15 08:45:33

As a basic idea, you should measure the speed of the code with only the VT_BSTR conversion enabled, then with VT_DATE, and finally with the other types, to see which takes the most time.

The only observation I have is that itoa is not standard C. The implementation may be very slow as you can see from this article.

疯狂的代价 2024-07-15 08:45:33

Try profiling. If you don't have a profiler, a simple way is to wrap each call in your loop that you think may take some time in something like the following:

#define TIME_CALL(x) \
do { \
  const DWORD t1 = timeGetTime();\
  x;\
  const DWORD t2 = timeGetTime();\
  std::cout << "Call to '" << #x << "' took " << (t2 - t1) << " ms.\n";\
}while(false)

So now you can say:

TIME_CALL(var = pFields->GetItem(i)->GetValue());
TIME_CALL(ostrm << (const char*) (_bstr_t) var);

and so on...
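One caveat with timing each call separately: 127,000 rows of 20 columns in about 10 seconds works out to roughly 4 microseconds per cell, well below the millisecond granularity of timeGetTime, so most individual readings would show 0 and the printing itself would dominate. A sketch of a variant that adds up the time across the whole loop and reports once, using std::chrono instead of the Windows timer; Stopwatch is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: accumulates the time spent in one kind of call.
struct Stopwatch {
    std::chrono::steady_clock::duration total{};
    long calls = 0;

    // Runs `f` once and adds its elapsed time to the running total.
    template <typename F>
    void Time(F&& f)
    {
        const auto t0 = std::chrono::steady_clock::now();
        f();
        total += std::chrono::steady_clock::now() - t0;
        ++calls;
    }

    double Milliseconds() const
    {
        return std::chrono::duration<double, std::milli>(total).count();
    }

    double AverageMicroseconds() const
    {
        assert(calls > 0); // an average needs at least one sample
        return std::chrono::duration<double, std::micro>(total).count() / calls;
    }
};
```

In the loop this would look like getValueTimer.Time([&] { var = pFields->GetItem(i)->GetValue(); }); with one Stopwatch per statement of interest, printing Milliseconds() after the loop ends. The clock reads add their own overhead, so the totals are best compared against each other, not taken as absolute numbers.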

別甾虛僞 2024-07-15 08:45:33

You don't need the itoa - you're writing to a stream.

握住我的手 2024-07-15 08:45:33

To answer your new question: I think you should use the fact that the stream can format your data for you, instead of formatting it into a string and then passing that string to the stream, e.g.:

_stprintf(buf, _T("%4d-%02d-%02d %02d:%02d:%02d"),
                  sysTime.wYear, sysTime.wMonth, sysTime.wDay, 
                  sysTime.wHour, sysTime.wMinute, sysTime.wSecond);

ostrm << buf;

Turns into:

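ostrm.fill('0');                 // the fill character stays set until changed
ostrm.width(4);                  // width applies only to the next insertion
ostrm << sysTime.wYear << '-';   // narrow char, since ostrm is a narrow stream
ostrm.width(2);
ostrm << sysTime.wMonth;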

And so on...
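Spelled out in full, the stream version might look like the sketch below. DateTimeParts is a hypothetical stand-in that mirrors the SYSTEMTIME fields used above so the example does not need the Windows headers, and WriteDate is a hypothetical helper name. Unlike the %4d in the original format string, this pads the year with zeros instead of spaces. Whether it is actually faster than the sprintf version is something to measure.

```cpp
#include <cassert>
#include <iomanip>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the SYSTEMTIME fields used in the question.
struct DateTimeParts {
    unsigned short wYear, wMonth, wDay, wHour, wMinute, wSecond;
};

// Writes "YYYY-MM-DD HH:MM:SS" straight into the stream.
void WriteDate(std::ostream& os, const DateTimeParts& t)
{
    assert(t.wMonth >= 1 && t.wMonth <= 12); // basic sanity check on the input

    const char oldFill = os.fill('0'); // fill is sticky, so save and restore it
    os << std::setw(4) << t.wYear   << '-'
       << std::setw(2) << t.wMonth  << '-'
       << std::setw(2) << t.wDay    << ' '
       << std::setw(2) << t.wHour   << ':'
       << std::setw(2) << t.wMinute << ':'
       << std::setw(2) << t.wSecond;
    os.fill(oldFill);
}
```

Each std::setw applies only to the number that follows it, which is why it is repeated before every field, while the fill character has to be restored by hand.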
