Improving the performance of System.String to std::wstring conversion?

Posted on 2024-10-20 03:28:47

I'm currently evaluating ADO.NET for a C++ application that uses plain old ADO today. Since we're redoing the whole database interaction anyway, we'd like to determine whether moving to the more modern and actively developed ADO.NET would be beneficial.

After some measurements it appears that for certain test queries that retrieve a lot of rows with few columns that all contain strings, ADO.NET is actually about 20% slower for us than using plain ADO. Our profiler suggests that the conversion of System.String results into the std::wstring used by the application is one of the bottlenecks. I can't switch any of the upper layers of the application to using System.String, so we are stuck with this particular conversion.

A rough outline of the code looks like this:

System::Data::SqlClient::SqlCommand^ sqlCmd =
  gcnew System::Data::SqlClient::SqlCommand(cmd, m_DBConnection.get());
System::Data::SqlClient::SqlDataReader^ reader = sqlCmd->ExecuteReader();
if (reader->HasRows)
{
    using namespace msclr::interop;
    while (reader->Read())
    {
      std::vector<std::wstring> results;
      for (int i=0; i < reader->FieldCount; ++i)
      {
        std::wstring col_data;
        TypeCode type = Type::GetTypeCode(reader->GetFieldType(i));
        switch (type)
        {
           // ... omit lots of different types
        case TypeCode::String:
          {
            System::String^ tmp = reader->GetString(i);
            col_data = marshal_as<std::wstring>(tmp);
          }
          break;
          // ... more type conversion code removed
        }
        results.push_back(col_data);
      }
      // NOTE: Callback into native result processing code
      ResultsCallback(results);
    }
}

I've spent a lot of time reading up on the various ways of getting a std::wstring out of a System.String and measured most of them. They all perform roughly the same; the differences are fractions of a percent of CPU usage. In the end I simply settled on marshal_as<std::wstring> because it's the most readable and appears to perform as well as the other solutions (i.e., using PtrToStringChars or the method described on MSDN).
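For reference, the native half of the PtrToStringChars variant boils down to a length-based copy. Below is a minimal sketch in plain C++ rather than C++/CLI. It assumes the managed side supplies a pinned pointer (for example via `pin_ptr<const wchar_t> p = PtrToStringChars(tmp)`) together with `tmp->Length`. The names `assign_wide` and `RowBuffer` are hypothetical and not part of the code above, and reusing one row buffer across rows assumes that `ResultsCallback` does not keep or modify the vector it receives.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copy `len` wide characters into an existing std::wstring.
// assign() typically keeps the string's current capacity when it is large enough,
// so refilling the same string row after row can avoid repeated allocations.
inline void assign_wide(std::wstring& out, const wchar_t* chars, std::size_t len) {
    assert(chars != nullptr || len == 0);
    if (len == 0) {
        out.clear();  // covers empty strings without touching the pointer
        return;
    }
    out.assign(chars, len);  // length-based, so embedded L'\0' characters survive
}

// Hypothetical row buffer meant to live outside the row loop.
struct RowBuffer {
    std::vector<std::wstring> cols;

    explicit RowBuffer(std::size_t field_count) : cols(field_count) {}

    // Overwrites column `i` in place; throws std::out_of_range for a bad index.
    void set(std::size_t i, const wchar_t* chars, std::size_t len) {
        assign_wide(cols.at(i), chars, len);
    }
};
```

In the loop above, this would mean constructing one `RowBuffer` before `while (reader->Read())` and calling `set(i, p, tmp->Length)` in the `TypeCode::String` case instead of building a fresh `std::wstring` and copying it into a new vector for every row. Whether that is measurably faster than `marshal_as` is something I would still have to profile.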

Using the DataReader works very well from a conceptual point of view as most of the processing we do on the data is row oriented anyway.

The only other slightly unexpected bottleneck I noticed is the retrieval of the TypeCode for the results columns; I'm already planning to move that outside the main results processing loop and only retrieve the type codes once per query result.
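The planned change could look roughly like the following. This is a sketch in plain C++ rather than C++/CLI: the hypothetical `ColumnKind` enum stands in for `System::TypeCode`, and the `lookup` callable stands in for `Type::GetTypeCode(reader->GetFieldType(i))`. Neither name exists in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for System::TypeCode, reduced to a few cases.
enum class ColumnKind { String, Int32, Other };

// Calls `lookup` exactly once per column and stores the results, so the
// per-row loop can switch on the cached values instead of asking the
// reader for the field type of every cell.
inline std::vector<ColumnKind> cache_column_kinds(
    std::size_t field_count,
    const std::function<ColumnKind(std::size_t)>& lookup) {
    assert(static_cast<bool>(lookup) && "lookup must be callable");
    std::vector<ColumnKind> kinds;
    kinds.reserve(field_count);
    for (std::size_t i = 0; i < field_count; ++i) {
        kinds.push_back(lookup(i));
    }
    return kinds;
}
```

The cache would be built once after `ExecuteReader()` returns, and the inner loop would then use `switch (kinds[i])` in place of the per-cell `GetTypeCode` call.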

After this lengthy introduction: can anybody recommend a less costly way to convert string data from a System.String to a std::wstring, or am I already looking at the optimum performance here? Given that I've already tried all the ordinary approaches, I'm mostly looking for slightly out-of-the-ordinary ones.

EDIT: Looks like I fell into a trap of my own making here. Yes, the code above is about 20% slower than the equivalent plain ADO code in Debug mode. However, after switching to Release mode, the bottleneck is still measurable, but the ADO.NET code above is suddenly almost 50% faster than the older ADO code. So while I'm still a little concerned about the cost of the string conversion, it's not as big in Release mode as it first appeared.
