Using Python multiprocessing with a different random seed for each process

Posted 2025-01-04 02:48:38


I wish to run several instances of a simulation in parallel, but with each simulation having its own independent data set.

Currently I implement this as follows:

import multiprocessing as mp

P = mp.Pool(ncpus)  # generate a pool of worker processes
for j in range(nrun):  # submit one task per simulation run
    sim = MDF.Simulation(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, savetemp)
    lattice = MDF.Lattice(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, kb, ks, kbs, a, p, q, massL, randinit, initvel, parangle, scaletemp, savetemp)
    adatom1 = MDF.Adatom(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, ra, massa, amorse, bmorse, r0, z0, name, lattice, samplerate, savetemp)
    P.apply_async(run, (j, sim, lattice, adatom1), callback=After)  # run simulation and ISF analysis in each process
P.close()  # no more tasks will be submitted
P.join()   # wait for all workers to finish

where sim, adatom1 and lattice are objects passed to the function run which initiates the simulation.

However, I recently discovered that each batch of simultaneous runs (that is, each group of ncpus runs out of the total nrun simulation runs) gives exactly the same results.

Can someone here enlighten how to fix this?
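The symptom can be reproduced with a minimal sketch (function names here are illustrative, not from the original code). On Unix, child processes created with the `fork` start method inherit the parent's global NumPy RNG state, so their first draws coincide:

```python
import multiprocessing as mp

import numpy as np

def first_draw(q):
    # Each forked child inherits the parent's global NumPy RNG state,
    # so the first value drawn is identical in every child.
    q.put(np.random.rand())

def draws_from_forked_children(nworkers=3):
    ctx = mp.get_context("fork")  # "fork" is the default start method on Linux
    q = ctx.Queue()
    procs = [ctx.Process(target=first_draw, args=(q,)) for _ in range(nworkers)]
    for p in procs:
        p.start()
    values = [q.get() for _ in procs]
    for p in procs:
        p.join()
    return values

print(draws_from_forked_children())  # three identical floats
```

The same inherited-state effect applies to Python's built-in `random` module, not just NumPy.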

3 Answers

寄居人 2025-01-11 02:48:38


Just thought I would add an actual answer to make it clear for others.

Quoting the answer from aix in this question:

What happens is that on Unix every worker process inherits the same
state of the random number generator from the parent process. This is
why they generate identical pseudo-random sequences.

Use the random.seed() method (or the scipy/numpy equivalent) to set the seed properly. See also this numpy thread.
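A concrete sketch of that advice (names such as `reseed_worker` are illustrative): a `Pool` initializer runs once in each worker process, so it can overwrite the inherited RNG state with a per-process seed, for example derived from the worker's PID:

```python
import multiprocessing as mp
import os

import numpy as np

def reseed_worker():
    # Runs once in each worker after fork: replace the inherited RNG
    # state with a per-process seed (PIDs of live workers always differ).
    np.random.seed(os.getpid())

def draw(_):
    # Successive draws within a worker continue its own seeded stream.
    return np.random.rand()

def independent_draws(ntasks=4, nworkers=2):
    ctx = mp.get_context("fork")
    with ctx.Pool(nworkers, initializer=reseed_worker) as pool:
        return pool.map(draw, range(ntasks))
```

With this, values differ both across workers (different seeds) and within a worker (consecutive draws from one stream). For new code, NumPy also provides `np.random.SeedSequence.spawn()` to derive statistically independent child seeds.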

煞人兵器 2025-01-11 02:48:38


This is an unresolved problem. Try generating a unique seed for each process. You can add the code below to the beginning of your function to overcome the issue.

import os
import time

import numpy as np

np.random.seed((os.getpid() * int(time.time())) % 123456789)
一腔孤↑勇 2025-01-11 02:48:38


A solution to the problem was to use scipy.random.seed() in the function run, which assigns a new seed to the random functions called from run.

A similar problem (from which I obtained the solution) can be found in multiprocessing.Pool seems to work in Windows but not in ubuntu?
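Note that `scipy.random` was historically just an alias for `numpy.random` and is no longer available in recent SciPy versions, so the equivalent of this fix today is to reseed NumPy at the top of `run`. Calling `np.random.seed()` with no argument pulls fresh entropy from the OS each time, discarding the state inherited over fork (the body of `run` below is a placeholder, not the original simulation):

```python
import multiprocessing as mp

import numpy as np

def run(j):
    # Reseed from OS entropy at the start of every task, so the RNG
    # state inherited from the parent is discarded before any draws.
    np.random.seed()
    return j, np.random.rand()

def run_all(nrun=4, ncpus=2):
    ctx = mp.get_context("fork")
    with ctx.Pool(ncpus) as pool:
        return dict(pool.map(run, range(nrun)))
```

Unlike the initializer approach, this reseeds on every task rather than once per worker, which is simpler but makes individual runs non-reproducible; recording the chosen seed per run restores reproducibility.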
