多线程黑客

发布于 2024-09-15 15:02:12 字数 1045 浏览 4 评论 0原文

我正在编写一个数值程序，应该有点快，而且也是多线程的。我有一个代表数字的类，我想在其中使用随机数生成器。现在，我并不真的要求我的 RNG 是真正的 RNG，我只需要它生成在 0 和 NMAX 之间均匀分布的整数。

所以我在我的课堂上有这个：

// use just an int here and forget about multithreading.
static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }

现在，在单线程世界中，这完全适合我的目的。

由于这是多线程的，我假设唯一可能发生的坏事是偶尔（例如 <1% 的时间）更新会被丢弃。这意味着两个线程读取 rand，将其更新到寄存器中，返回更新后的值，然后使用相同的值写入两次。这完全没问题。

我的问题是还会发生比这更糟糕的事情吗？我完全同意每个线程使用自己的 rand 变量，但这只是一个巨大的痛苦。我绝对不能做的是让类的每个实例都使用自己的 rand 变量，因为那样会使用太多内存。

更新：

那么，我为什么要这样做呢？完整的故事是它的浮点类使用 1 或 2 个字节。所以它必须要快，这似乎是最好的方法。事实上，我想我会将其从 ( rand + 1 ) % NMAX 更新为 ( rand + [some prime] ) % NMAX 因为它似乎工作得更好。这是其中一种情况的示例，其中更健壮的解决方案需要更多代码，使事物变得不那么通用并且更加依赖，使代码不太清晰并且更容易破坏，而所有这些都是为了“应该使用适当的同步”的想法。

主要是我担心编译器可能会做一些奇怪的优化，这样对 rand 的更新不仅会被删除，而且 rand 会变成完全垃圾。现在我想了一下，即使也没关系（这个数字的使用方式），因为下一次使用 GetRand 无论如何都会 %NMAX 它，该错误最多只会导致一次使用GetRand 超出给定范围 [0,NMAX)。感谢您的任何答复。

原文

I am writing a numerical program that should be somewhat fast, and is also multithreaded. I have a class that represents a number, and I want to use in it a random number generator. Now, I don't really require my RNG to be a true RNG, I just need it to generate integers with even distribution between 0 and NMAX.

So I have this in my class:

// use just an int here and forget about multithreading.
static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }

Now, in a single threaded world, this is totally fine for my purposes.

Since this is multithreaded, I'm assuming that the only possible bad thing that could happen is that occasionally (like <1% of the time) the update gets dropped. Meaning two threads read rand, update it in a register, return the updated value, and then write it, twice, with the same value. This is totally fine.

My question is could anything worse than that happen? I'm totally fine with each thread using its own rand variable but that's just a huge pain to make happen. What I definitely can't do is make it so that each instance of the class uses its own rand variable, as that would use way too much memory.

UPDATE:

So, why do I want to do this? Full story is its a floating point class that uses 1 or 2 bytes. So it has to be fast and such, and this just seems the best way. In fact I think I'll update it from ( rand + 1 ) % NMAX to something like ( rand + [some prime] ) % NMAX since it seems to work better. This is an example of one of those cases where a more robust solution would require more code, make things less generic and more dependency ridden, make code less clear and easier to break, and all for the idea that "proper synchronization should be used".

Mostly I'm worried of some weird optimization that the compiler might do, so that an update to rand is not just dropped, but rand becomes total garbage. Now that I think about it though, even that would be okay (the way this number is used), since the next use of GetRand would %NMAX it anyway, the error would only cause at most one use of GetRand to be outside the given range of [0,NMAX). Thanks for any answers.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

小苏打饼 2024-09-22 15:02:12

我对每个线程都非常满意
使用自己的 rand 变量，但那就是
只是要实现这一点是一个巨大的痛苦。

这样做并不一定那么困难。某些编译器（例如 GCC）支持线程本地存储 ，它允许每个线程拥有自己的给定变量的副本。

话虽如此，除了你提到的问题之外，我只能想到你目前的做法存在一个问题。如果每个线程在不同的核心上运行，并且随机值存储在每个核心的非共享缓存中，则更新可能无法在无限期的时间内在核心之间传播。您可以通过使用内存屏障（可以通过使用锁创建）来避免这种情况，但这可能会损害性能。

回复收藏 0 原文

梦在深巷 2024-09-22 15:02:12

为了讨论的目的，我们假设以下实现：

使用 Mersenne Twister (mt19937)，它每次调用都会生成 624 个随机数的批次。
类的每个实例（仅在单个线程内使用）从批次中读取一个数字并递增全局索引计数器，以便下一次调用（来自任何实例）将获取批次中的下一个数字。当全局索引到达数组末尾时，RNG将被锁定，并且将生成新一批624个随机数，之后全局索引被重置。

我的改进建议是每个实例一次获取 16 个数字。这 16 个数字不必存储（复制）在实例内部：您只需将全局索引增加 16（使它们对其他实例不可用），以便实例可以一一使用它们。

回复收藏 0 原文

恍梦境° 2024-09-22 15:02:12

您可以使用 TLS，以便每个线程都有“自己的”变量。

：

__declspec(thread) static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }

在您的特定情况下，很容易修复代码以使其线程安全。

static long rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return InterlockedIncrement(&rand) % NMAX; }

You may use TLS so that each thread will have "its own" variable.

__declspec(thread) static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }

In your particular case it's very easy to fix the code to make it thread-safe.

static long rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return InterlockedIncrement(&rand) % NMAX; }

回复收藏 0 原文

笛声青案梦长安 2024-09-22 15:02:12

如果两个线程同时调用 GetRand，则可能会发生经典的不同步错误。例如，rand = 10。在两个线程调用 GetRand 后，rand 预计为 12，但实际上可能是 11。如果这对您来说没问题，则可以不更改此代码。但我认为最好使用同步，因为没有同步，代码和结果看起来都有点奇怪。另一种编程可能会认为这是错误。

编辑。

rand = ( rand + 1 ) % NMAX;

最坏的情况：两个或多个线程从内存中读取相同的 rand 变量 rand。每个线程在本地进行计算 ( rand + 1 ) % NMAX。然后所有线程将相同的结果放回内存。这不会破坏 rand 变量值，也不会导致该变量超出范围，并且数字生成器将继续计算有效数字。

If two threads call GetRand at the same time, the classic unsynchronized error may happen. For example, rand = 10. After two threads call GetRand, rand is expected to be 12, but actually it may be 11. If this OK for you, you can leave this code without changes. But I think it is better to use synchronization, because without it both the code and its result look a bit strange. Another programming can think that is is bug.

Edit.

rand = ( rand + 1 ) % NMAX;

Worst case: two or more threads read the same rand variable rand from the memory. Each thread locally makes calculation ( rand + 1 ) % NMAX. Then all threads put the same result back to the memory. This doesn't corrupt the rand variable value, this doesn't cause this variable to be out of scope and number generator will continue to compute valid numbers.

回复收藏 0 原文

~没有更多了~