I'm trying to write a custom vectorized environment for a DRL algorithm so I can monitor some information I care about.
I initialize a multiprocessing pool when the class CustomVectorEnv is constructed, with one worker process per environment.
However, I've hit a bug. While step_wrapper runs, everything looks normal and the environments interact correctly with my algorithm, but as soon as the line results = self.pool.starmap(step_wrapper, zip(self.envs, self._actions)) in step finishes, every environment is back in the state it had right after initialization; the states are never updated.
I have checked that reset is never called and that no new environments are created; the only environments are the ones built when CustomVectorEnv is initialized. They advance normally inside step_wrapper, yet appear freshly reset immediately after the line of code mentioned above.
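For example, the way I checked for hidden resets was roughly the following (ResetSpy is my own debugging wrapper, temporarily wrapped around each env in make_env_single; it is a sketch, not part of the code below):

import gymnasium as gym

class ResetSpy(gym.Wrapper):
    # log every reset() call so any hidden re-initialization would show up
    def reset(self, **kwargs):
        print(f"reset() called on env {id(self.env)}")
        return super().reset(**kwargs)

The log stays silent while stepping, so reset() itself is never invoked.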
What may be the problem? I would appreciate any information you could provide. Here is my (trimmed) code:
import multiprocessing as mp

import gymnasium as gym
import numpy as np
from gymnasium.spaces import Space
from gymnasium.vector.utils import concatenate, create_empty_array, iterate
from gymnasium.vector.vector_env import VectorEnv

def step_wrapper(env, action):
    # step a single environment; actions arrive as torch tensors, so move them to CPU
    results = env.step(action.cpu())
    return results

def make_env_single(env_id, idx, render_mode=None, FD=None):
    env = gym.make(env_id, render_mode=render_mode, FD=FD)
    env = gym.wrappers.FlattenObservation(env)  # deal with dm_control's Dict observation space
    env = gym.wrappers.RecordEpisodeStatistics(env)
    env = gym.wrappers.ClipAction(env)
    return env

class CustomVectorEnv(VectorEnv):
    def __init__(
        self,
        env_id, capture_video, run_name, gamma, num_envs=1,
        observation_space: Space = None,
        action_space: Space = None,
        copy: bool = True,
    ):
        self.num_envs = num_envs
        self.envs = [make_env_single(env_id, i) for i in range(num_envs)]
        ...
        # one worker process per environment
        self.pool = mp.Pool(processes=self.num_envs)
    def step(self, actions):
        self._actions = iterate(self.action_space, actions)
        observations, rewards, terminateds, truncateds, infos = [], [], [], [], {}
        # the problematic line: after starmap returns, every env in self.envs looks freshly initialized
        results = self.pool.starmap(step_wrapper, zip(self.envs, self._actions))
        ...
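
For completeness, this is roughly how I drive it (the env id "MyCustomEnv-v0" is a placeholder, and the stacked return values of reset and step come from the parts of my code elided above):

import numpy as np
import torch

venv = CustomVectorEnv("MyCustomEnv-v0", capture_video=False, run_name="debug", gamma=0.99, num_envs=2)
obs, info = venv.reset()

actions = torch.as_tensor(venv.action_space.sample(), dtype=torch.float32)
obs1, *_ = venv.step(actions)
obs2, *_ = venv.step(actions)

# symptom: the second step starts from the initial state again,
# so identical actions produce identical observations
print(np.allclose(obs1, obs2))  # unexpectedly True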