C++ multithreading optimization
In my code I have 2 to 4 threads performing Monte Carlo simulations. Each of them runs a number of experiments, and they all collect the results into an STL vector.
My question is this: suppose each thread runs 1000 experiments sequentially. Is it better to store the results into the shared vector one at a time, or only every once in a while? If the threads wait until they have a substantial amount of data, each write into the vector will take longer, so I'm not sure whether the second solution is necessarily better than the first one.
PS: each experiment is pure numerical computation, so there are no I/O operations.
Thanks
Comments (3)
If you are going to wait until all the results are computed before you use any of the results, preallocate space for 4,000 results in the vector and have each thread write into one range of elements in the vector. No locking is required because no two threads access the same element in the vector.
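A minimal sketch of the preallocated-range idea, assuming C++11 threads. `run_experiment` and `run_all` are hypothetical names, and the uniform draw merely stands in for a real experiment. The vector is sized before any thread starts, so it never reallocates while threads are writing, and each thread touches only its own slice.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for one Monte Carlo experiment: a uniform draw.
double run_experiment(std::mt19937_64& rng) {
    return std::uniform_real_distribution<double>(0.0, 1.0)(rng);
}

// Runs num_threads * per_thread experiments. Thread t owns the slots
// [t * per_thread, (t + 1) * per_thread) of the shared vector.
std::vector<double> run_all(std::size_t num_threads, std::size_t per_thread) {
    assert(num_threads > 0);
    // Size the vector up front so it never reallocates while threads write.
    std::vector<double> results(num_threads * per_thread);

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&results, t, per_thread] {
            std::mt19937_64 rng(1234 + t);  // per-thread generator, arbitrary seed
            const std::size_t begin = t * per_thread;
            for (std::size_t i = 0; i < per_thread; ++i) {
                results[begin + i] = run_experiment(rng);  // no lock: disjoint slots
            }
        });
    }
    for (auto& w : workers) w.join();  // after join(), all writes are visible
    return results;
}
```

Calling `run_all(4, 1000)` would give the 4,000-element vector mentioned above. The results must not be read before the threads have been joined.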
If you want to use the results as they are computed, use some sort of a concurrent queue data structure instead of a vector.
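One possible shape for such a queue, sketched with a mutex and a condition variable. `ResultQueue` and `consume_while_computing` are hypothetical names, not standard library types, and the pushed value `1.0` is a placeholder for a real result. Production code might prefer an existing concurrent queue from a library.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal blocking queue, safe for many producers and consumers.
template <typename T>
class ResultQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();  // wake one waiting consumer
    }

    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};

// Starts num_threads producers that each push per_thread values, consumes
// them on the calling thread as they arrive, and returns their sum.
double consume_while_computing(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    ResultQueue<double> queue;
    std::vector<std::thread> producers;
    for (int t = 0; t < num_threads; ++t) {
        producers.emplace_back([&queue, per_thread] {
            for (int i = 0; i < per_thread; ++i) queue.push(1.0);  // placeholder result
        });
    }
    double sum = 0.0;
    for (int i = 0; i < num_threads * per_thread; ++i) sum += queue.pop();
    for (auto& p : producers) p.join();
    return sum;
}
```

Here the consumer knows in advance how many results to expect. If it did not, the queue would also need some way to signal that the producers are finished.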
If you're only putting 2000 to 4000 elements in the vector, I doubt it would make much of a difference either way.
Do whatever is most natural for the algorithm. If that doesn't work well enough, look into doing it the other way.
After thinking about it for a bit, it might serve both purposes (simplicity and speed) to have each thread store results to a local vector then copy the contents of the local vector to the 'global' vector (protected by a lock) when the thread is done. Of course, that's as long as whatever's waiting for the results can wait until a thread is fully finished before getting an update.
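The local-then-merge approach could look roughly like this. It is a sketch assuming C++11 threads, `run_and_merge` is a hypothetical name, and the uniform draw stands in for a real experiment. Each thread takes the lock exactly once, when it is done.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

// Each thread fills a private vector, then appends it to the shared
// vector in a single step while holding the lock.
std::vector<double> run_and_merge(std::size_t num_threads, std::size_t per_thread) {
    assert(num_threads > 0);
    std::vector<double> global;
    global.reserve(num_threads * per_thread);
    std::mutex global_mutex;

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&global, &global_mutex, t, per_thread] {
            std::mt19937_64 rng(42 + t);  // per-thread generator, arbitrary seed
            std::uniform_real_distribution<double> dist(0.0, 1.0);

            std::vector<double> local;  // private to this thread, no lock needed
            local.reserve(per_thread);
            for (std::size_t i = 0; i < per_thread; ++i) {
                local.push_back(dist(rng));  // placeholder for one experiment
            }

            // One short critical section per thread instead of one per result.
            std::lock_guard<std::mutex> lock(global_mutex);
            global.insert(global.end(), local.begin(), local.end());
        });
    }
    for (auto& w : workers) w.join();
    return global;
}
```

The order of the per-thread blocks in the merged vector depends on which thread finishes first, so this fits when the results are order-independent.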
A singly linked list may be a better choice than a vector here.
If there is only one thread reading and one thread writing to a FIFO, you don't need any locking. The trick is to always keep at least one "dummy" element in the list, so that the FIFO is empty when head == tail. Push only moves the tail pointer and pop only moves the head pointer, so the two threads never modify the same pointer.
Using this, you can make several queues, none of which needs a lock.
If new/delete is taking time, you can have queues that hold reusable elements.
Best of luck.
Remember: exactly one reader and exactly one writer per queue, no more, no less.
The trick is to create a lot of queues like this, including queues to recycle objects, and
you won't need any thread-locking machinery.
If your queues do run empty, all you need is some sleep()/wakeup() functionality.
And in case I haven't already said it: exactly one reader and exactly one writer.
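A sketch of the dummy-node FIFO described above, with hypothetical names (`SpscQueue`, `spsc_round_trip`). One caveat: under the C++11 memory model the link between nodes still has to be a `std::atomic` with release/acquire ordering, so this is lock-free rather than literally free of synchronization. This version also detects "empty" by checking whether the dummy node has a successor, which avoids having both threads read each other's pointer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical single-producer/single-consumer FIFO on a singly linked list.
// Safe only with exactly one reader thread and exactly one writer thread.
template <typename T>
class SpscQueue {
    struct Node {
        T value{};
        std::atomic<Node*> next{nullptr};
    };

public:
    SpscQueue() : head_(new Node), tail_(head_) {}  // start with the dummy node
    ~SpscQueue() {
        while (head_ != nullptr) {
            Node* next = head_->next.load(std::memory_order_relaxed);
            delete head_;
            head_ = next;
        }
    }
    SpscQueue(const SpscQueue&) = delete;
    SpscQueue& operator=(const SpscQueue&) = delete;

    // Writer thread only: link a new node after the current tail.
    void push(T value) {
        Node* node = new Node;
        node->value = std::move(value);
        tail_->next.store(node, std::memory_order_release);  // publish to reader
        tail_ = node;
    }

    // Reader thread only: returns false if the queue is currently empty.
    bool try_pop(T& out) {
        Node* next = head_->next.load(std::memory_order_acquire);
        if (next == nullptr) return false;  // only the dummy node is left
        out = std::move(next->value);
        delete head_;  // retire the old dummy
        head_ = next;  // the node just read becomes the new dummy
        return true;
    }

private:
    Node* head_;  // touched only by the reader
    Node* tail_;  // touched only by the writer
};

// Pushes 0..count-1 from a writer thread and pops them on the calling
// thread; returns true if every value arrived in order.
bool spsc_round_trip(int count) {
    assert(count >= 0);
    SpscQueue<int> q;
    std::thread writer([&q, count] {
        for (int i = 0; i < count; ++i) q.push(i);
    });
    bool in_order = true;
    for (int expected = 0; expected < count;) {
        int v = 0;
        if (q.try_pop(v)) {
            in_order = in_order && (v == expected);
            ++expected;
        } else {
            std::this_thread::yield();  // queue empty: let the writer run
        }
    }
    writer.join();
    return in_order;
}
```

For the original question, one such queue per worker thread would respect the one-reader, one-writer rule, with the collecting thread polling each queue in turn.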