Python multiprocessing stalls as memory maxes out

Posted 2025-01-19 12:33:14

I'm running this code:

    # Get big (0.5GB) list of data
    all_ftrajs = get_feature_trajs(traj_top_paths, hp_dict)
    # Function I want to bootstrap
    bs_func = partial(bs_func, all_ftrajs=all_ftrajs)

    rng = get_rng(seed)
    n_workers = min(n_cores, bs_samples)
    results = []
    if n_workers > 1:
        with Pool(n_workers) as pool:

            for i in range(bs_samples):
                # bootstrap list indices 
                _, bs_ix = sample_trajectories(all_ftrajs, rng, bs_samples > 1)
                
                # accumulate results
                results.append(pool.apply_async(func=bs_func,
                                                args=(hp_dict, bs_ix, seed,
                                                      bs_dir.joinpath(f"{i}.pkl"), hp_idx),
                                                kwds=kwargs))
            # Get results
            for r in results:
                r.get()

            # close off pool
            pool.close()
            pool.join()

This particular code bootstraps some analysis 100 times (bs_samples = 100) by analysing a list of numpy arrays, all_ftrajs, after bootstrapping its elements:

    def bs_func(hp_dict: Dict[str, List[Union[str, int]]],
                bs_ix: np.ndarray, seed: Union[int, None],
                out_dir: Path, hp_idx: int,
                lags: List[int], all_ftrajs: List[np.ndarray]):
        # Bootstrap the list of numpy arrays
        feat_trajs = [all_ftrajs[i] for i in bs_ix]

        # do the analysis
        try:
            tica, kmeans = discretize_trajectories(hp_dict, feat_trajs, seed)
            disc_trajs = kmeans.dtrajs
            mods_by_lag = estimate_msms(disc_trajs, lags)
            outputs = score_msms(mods_by_lag)
            outputs.ix = hp_idx
            write_outputs(outputs, out_dir)
        except Exception as e:
            logging.info(e)
        return True
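For readers unfamiliar with the resampling step, here is a minimal sketch of drawing bootstrap indices with replacement. It is not the actual sample_trajectories (whose body is not shown here): the function name `sample_with_replacement` and the toy data are hypothetical, and the standard-library `random` module is used only to keep the sketch dependency-free, whereas the real code appears to produce a NumPy index array.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sample_with_replacement(items: Sequence[T],
                            rng: random.Random) -> Tuple[List[T], List[int]]:
    """Draw len(items) indices uniformly with replacement.

    Returns the resampled items together with the indices used, so the
    indices alone can be handed to a worker instead of the data.
    """
    n = len(items)
    ix = [rng.randrange(n) for _ in range(n)]
    return [items[i] for i in ix], ix


# Hypothetical stand-in for the list of feature trajectories
trajs = ["traj_a", "traj_b", "traj_c", "traj_d"]
rng = random.Random(42)  # seeded so the resample is reproducible
resampled, ix = sample_with_replacement(trajs, rng)
```

Only the index list needs to travel to the worker; the worker can then rebuild the resampled list itself, which is what the first line of bs_func does.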

I do this bootstrapping on a series of different experiments (trials). After a handful of trials the program freezes:

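All processes are still active. I have 12 logical cores on my machine and I'm using a pool of size 2.
Swap and RAM are both maxed out (2 GB and 16 GB respectively).
All CPU usage is at 0.0%.

When I run this in serial, the single process uses no more than 10% of memory according to htop.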
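The htop figures above could also be logged from inside the code. A minimal sketch, assuming Linux (where `ru_maxrss` is reported in KiB; other platforms use different units) and using only the standard-library `resource` module. The helper name `peak_rss_mib` and the test allocation are hypothetical.

```python
import resource


def peak_rss_mib() -> float:
    """Peak resident set size of the current process in MiB.

    Assumes Linux, where ru_maxrss is given in KiB.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


before = peak_rss_mib()
# Allocate roughly 50 MB of non-zero bytes so the pages are actually written
block = bytes(range(256)) * 200_000
after = peak_rss_mib()

print(f"peak RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

Calling such a helper at the start and end of bs_func, and in the parent after each trial, would show which process is growing between trials.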

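I have tried:

Passing the big data list as a parameter to bs_func. I use the partial construction, inspired by the answer to this question: Shared-memory objects in multiprocessing
Running it in serial (this works)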
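One thing worth checking with the partial approach: arguments bound into a `functools.partial` are pickled together with it, and `Pool.apply_async` pickles the callable for every task it submits. The sketch below only measures pickled sizes and does not claim to reproduce the freeze. The builtin `len` is a hypothetical stand-in for bs_func (chosen so the snippet pickles wherever it is run), and a list of about 1 MB stands in for all_ftrajs.

```python
import pickle
from functools import partial

# Hypothetical stand-in for all_ftrajs: 10 distinct blocks of 100 kB each
big_data = [bytes(100_000) for _ in range(10)]

# Same construction as in the post, with len standing in for bs_func
bound = partial(len, big_data)

plain_size = len(pickle.dumps(len))    # the bare callable: a few dozen bytes
bound_size = len(pickle.dumps(bound))  # the callable plus the bound data

print(f"bare function: {plain_size} bytes")
print(f"partial with data bound: {bound_size} bytes")
```

If the same holds for the real code, each of the 100 queued tasks would carry its own serialized copy of the 0.5 GB list, which would be consistent with memory filling up while the workers sit idle.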

I'm running Python 3.8 on Ubuntu 20.04.4 LTS, 64-bit. I have 15.5 GiB of memory and an AMD Ryzen 5 3600 6-core processor x 12 (copied from my 'About' section).

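Questions:

Is there an obvious fix to my code that would make it work?
Are there other solutions that don't require much code refactoring?

Thanks and best wishes,

Rob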
