I'm trying to write a custom vectorized environment for a DRL algorithm so I can monitor some information I care about.
I initialize a multiprocessing pool when the class CustomVectorEnv is constructed, with one worker process per environment.
However, I've hit a bug. While step_wrapper runs, everything looks normal and the environments interact correctly with my algorithm, but as soon as the line results = self.pool.starmap(step_wrapper, zip(self.envs, self._actions)) in step finishes, every environment is back in the state it had right after initialization; the states are never updated.
I have checked that reset is never called and that no new environments are created; the only environments are the ones built when CustomVectorEnv is initialized. They advance normally inside step_wrapper, yet appear freshly reset immediately after the line of code mentioned above.
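For example, the way I checked for hidden resets was roughly the following (ResetSpy is my own debugging wrapper, temporarily wrapped around each env in make_env_single; it is a sketch, not part of the code below):

import gymnasium as gym

class ResetSpy(gym.Wrapper):
    # log every reset() call so any hidden re-initialization would show up
    def reset(self, **kwargs):
        print(f"reset() called on env {id(self.env)}")
        return super().reset(**kwargs)

The log stays silent while stepping, so reset() itself is never invoked.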
What may be the problem? I would appreciate any information you could provide. Here is my (trimmed) code:
import multiprocessing as mp

import gymnasium as gym
import numpy as np
from gymnasium.spaces import Space
from gymnasium.vector.utils import concatenate, create_empty_array, iterate
from gymnasium.vector.vector_env import VectorEnv

def step_wrapper(env, action):
    # step a single environment; actions arrive as torch tensors, so move them to CPU
    results = env.step(action.cpu())
    return results

def make_env_single(env_id, idx, render_mode=None, FD=None):
    env = gym.make(env_id, render_mode=render_mode, FD=FD)
    env = gym.wrappers.FlattenObservation(env)  # deal with dm_control's Dict observation space
    env = gym.wrappers.RecordEpisodeStatistics(env)
    env = gym.wrappers.ClipAction(env)
    return env

class CustomVectorEnv(VectorEnv):
    def __init__(
        self,
        env_id, capture_video, run_name, gamma, num_envs=1,
        observation_space: Space = None,
        action_space: Space = None,
        copy: bool = True,
    ):
        self.num_envs = num_envs
        self.envs = [make_env_single(env_id, i) for i in range(num_envs)]
        ...
        # one worker process per environment
        self.pool = mp.Pool(processes=self.num_envs)
    def step(self, actions):
        self._actions = iterate(self.action_space, actions)
        observations, rewards, terminateds, truncateds, infos = [], [], [], [], {}
        # the problematic line: after starmap returns, every env in self.envs looks freshly initialized
        results = self.pool.starmap(step_wrapper, zip(self.envs, self._actions))
        ...
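
For completeness, this is roughly how I drive it (the env id "MyCustomEnv-v0" is a placeholder, and the stacked return values of reset and step come from the parts of my code elided above):

import numpy as np
import torch

venv = CustomVectorEnv("MyCustomEnv-v0", capture_video=False, run_name="debug", gamma=0.99, num_envs=2)
obs, info = venv.reset()

actions = torch.as_tensor(venv.action_space.sample(), dtype=torch.float32)
obs1, *_ = venv.step(actions)
obs2, *_ = venv.step(actions)

# symptom: the second step starts from the initial state again,
# so identical actions produce identical observations
print(np.allclose(obs1, obs2))  # unexpectedly True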