需要加速 C++涉及 Boost 多索引和对 unordered_multimap 的查找的代码

发布于 2024-10-14 03:36:09 字数 2384 浏览 10 评论 0原文

我正在寻找加速基于代理的模型的策略，该模型基于 Host 类的对象，该模型的指针存储在 Boost 多索引容器中。我使用 Shark 来确定绝大多数时间是由函数 calcSI() 消耗的：

在此处输入图像描述

函数 calcSI() 必须为 Host 类的每个实例计算某些概率，这些概率取决于 Host 类的其他实例的属性。（Host 大约有 10,000-50,000 个实例，每个主机运行这些计算大约 25,600 次。）

如果我正确解释配置文件，则大部分时间都花在 calcSI() 转到 Host::isInfectedZ(int)，它只是对 InfectionMap 类型的 Boost unordered_multimap 中的某些实例进行计数：

struct Infection {
public:
  explicit Infection( double it, double rt ) : infT( it ), recT( rt ) {}
  double infT;
  double recT;
};
typedef boost::unordered_multimap< int, Infection > InfectionMap;

Host 的所有成员都包含 InfectionMap 运输，并且 Host::isInfectedZ(int) 只是计算 Infections 的数量与特定 int 键关联：

int Host::isInfectedZ( int z ) const {
  return carriage.count( z );
}

我无法找到有关 Boost 无序多重映射的 count 函数的成本的信息。我是否应该通过向 Host 添加一个单独的二维数组来跟踪每个键的实例数量（即与每个 相关的 Infections 数量）来增加开销>int)？
我想知道对 Boost 多索引进行更大规模的结构性改革（例如消除一两个不太需要的复合关键索引）是否会更有帮助。多索引的后台维护不会出现在分析器中，这（也许是愚蠢的）让我担心它可能很大。我在多重索引中有 8 个索引，其中大多数是ordered_non_unique。
是否有其他我应该关注但可能不会出现在探查器中的事情，或者我是否遗漏了探查器中的主要结果？

不幸的是，calcSI() 的并行化和多线程不是选项。

更新：了解 InfectionMap 运输 很少有超过 10 对且通常小于 5 对可能会有所帮助。

更新 2： 我尝试了上面 #1 中提出的策略，为每个 Host 提供一个数组 intcarriageSummary[ INIT_NUM_STYPES ]，该数组由z 的可能值（对于大多数模拟，可能的值小于 10 个）。每个条目的值跟踪对carriage所做的更改。 Host::isInfectedZ( int z ) 函数现在显示为：

int Host::isInfectedZ( int z ) const {
  //return carriage.count( z );
  return carriageSummary[ z ];
}

And the time dedicated to this function appears to have dropped substantially--I can't do an exact comparison right now: enter image description here

Obviously, using an array is kind of bulky but okay for small ranges of z. Would some other container (i.e., not an unordered_map) be more efficient for larger ranges?

也希望获得有关更改多索引的任何反馈。

原文

I'm looking for strategies to speed up an agent-based model that's based on objects of class Host, pointers to which are stored in a Boost multi-index container. I've used Shark to determine that the vast majority of the time is consumed by a function calcSI():

enter image description here

Function calcSI() has to compute for every instance of class Host certain probabilities that depend on attributes of other instances of class Host. (There are approximately 10,000-50,000 instances of Host, and these calculations are run for each host approximately 25,600 times.)

If I'm interpreting the profile correctly, the majority of the time spent in calcSI() goes to Host::isInfectedZ(int), which simply counts instances of something in a Boost unordered_multimap of type InfectionMap:

struct Infection {
public:
  explicit Infection( double it, double rt ) : infT( it ), recT( rt ) {}
  double infT;
  double recT;
};
typedef boost::unordered_multimap< int, Infection > InfectionMap;

All members of Host contain InfectionMap carriage, and Host::isInfectedZ(int) simply counts the number of Infections associated with a particular int key:

int Host::isInfectedZ( int z ) const {
  return carriage.count( z );
}

I'm having trouble finding information on how costly the count function is for Boost's unordered multimaps. Should I increase the overhead by adding to Host a separate two-dimensional array to track the number of instances of each key (i.e., the number of Infections associated with each int)?
I'm wondering if a larger structural overhaul of the Boost multi-index, like eliminating one or two less-needed composite key indices, would be more helpful. The background maintenance of the multi-index doesn't appear in the profiler, which (maybe stupidly) makes me worry it might be large. I have 8 indices in the multi-index, most of which are ordered_non_unique.
Are there other things I should be concerned with that might not appear in the profiler, or am I missing a major result from the profiler?

Parallelization and multithreading of calcSI() are unfortunately not options.

Update: It might be helpful to know that InfectionMap carriage rarely has more than 10 pairs and usually has <5.

Update 2: I tried the strategy proposed in #1 above, giving each Host an array int carriageSummary[ INIT_NUM_STYPES ], which is indexed by the possible values of z (for most simulations, there are <10 possible values). The value of each entry tracks changes made to carriage. The Host::isInfectedZ( int z ) function now reads:

int Host::isInfectedZ( int z ) const {
  //return carriage.count( z );
  return carriageSummary[ z ];
}

And the time dedicated to this function appears to have dropped substantially--I can't do an exact comparison right now:
enter image description here
Obviously, using an array is kind of bulky but okay for small ranges of z. Would some other container (i.e., not an unordered_map) be more efficient for larger ranges?

Would love any feedback on changing multi-index too.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

清风挽心 2024-10-21 03:36:09

就像您在 #1 中建议的那样，尝试在 Host::carriage unordered_multimap 旁边维护一个托架计数数组，并使它们保持“同步”。然后，您的 Host::isInfectedZ 将使用（希望）更快的托架计数数组：

int Host::isInfectedZ( int z ) const {
  return carriageCount[ z ];
}

如果可以传递到 isInfected 的整数范围很大，则使用关联数组作为托架计数。

您可以使用 std::map 或 boost::unordered 作为关联数组。对于查找，前者具有对数时间复杂度，后者具有恒定时间复杂度。但由于该关联数组通常非常小，因此 std::map 实际上可能更快。 std::map 也可能占用更少的空间开销。尝试两者并运行你的分析器来查看。我的赌注是 std::map。 :-)

编辑：

在看到您对我的评论的回答后，我建议使用常规的固定大小数组来表示托架计数。忘记关联数组的东西吧。

EDIT2：

您可能想要废弃

typedef boost::unordered_multimap< int, Infection > InfectionMap;

并汇总您自己的手写 InfectionMap 类，因为您正在处理如此小的索引。

对更新 #2 的回应：

很高兴看到您取得了进步。我怀疑您会找到一个比 16 个整数的固定数组“更小”的容器。 STL 和 boost 容器以块的形式分配内存，最终会与固定大小的数组一样大，即使它们的元素很少。

您可能对 boost::array 感兴趣，它在 C 风格的固定数组周围包装了类似 STL 的接口。这将使您更容易将固定大小的数组“交换”为 std::vector 或 std::map。

Like you suggested in #1, try maintaining a carriage count array next to the Host::carriage unordered_multimap and keep them both "synchronised". Your Host::isInfectedZ would then use the (hopefully) faster carriage count array:

int Host::isInfectedZ( int z ) const {
  return carriageCount[ z ];
}

If the range of integers that can be passed into isInfected is large, then use an associative array for your carriage count.

You can use std::map or boost::unordered for the associative array. For lookups, the former has logarithmic temporal complexity and the latter has constant temporal complexity. But since this associative array would be typically very small, std::map might actually be faster. std::map may also take up less space overhead. Try both and run your profiler to see. My bet is on std::map. :-)

EDIT:

Upon seeing your answer to my comment, I would suggest using a regular fixed-size array for the carriage count. Forget about the associative array stuff.

EDIT2:

You might want to scrap

typedef boost::unordered_multimap< int, Infection > InfectionMap;

and roll-up your own hand-written InfectionMap class, since you're dealing with such small indices.

RESPONSE TO UPDATE #2:

Glad to see you've made an improvement. I doubt you'll find an container that is "less bulky" than a fixed array of, say, 16 integers. STL and boost containers allocate memory in chunks and will end up as large as your fixed-size array, even if they have few elements.

You might be interested in boost::array, which wraps an STL-like interface around a C-style fixed array. This will make it easier to "swap-out" your fixed-size array for a std::vector or std::map.

回复收藏 0 原文

~没有更多了~