Weights & Biases not logging gradients properly with Stable-Baselines3


I am training a reinforcement learning model on a custom environment and logging with Weights & Biases. Everything seems to log properly except the gradient and parameter histograms: no matter how frequently I log, the graphs are constant over all timesteps. Strangely, the x-axis also does not start at 0.

[Screenshot: gradient histograms, flat across all timesteps]

However, I expect them to look something like this, where the gradients vary over time:

[Screenshot: example gradient histograms that vary over time]

My model learns, and its behavior changes greatly over time. I have tested many different hyperparameters (learning rate, number of optimization epochs per rollout, gradient clipping, and more) and trained for over 100,000 episodes comprising millions of total steps. So I don't believe the gradients are actually identical at every timestep for literally every layer.
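As a sanity check, one could bypass WandbCallback entirely and log gradient norms to wandb from a custom SB3 callback. This is a minimal sketch, not my actual code: GradNormCallback is a hypothetical name, and it assumes the policy is exposed as model.policy (a torch module), as in the standard SB3 algorithms.

    import wandb
    from stable_baselines3.common.callbacks import BaseCallback

    class GradNormCallback(BaseCallback):
        """Hypothetical callback: log per-parameter gradient norms to wandb."""

        def _on_step(self) -> bool:
            # Throttle: logging every env step for every layer is expensive.
            if self.n_calls % 1000 == 0:
                for name, param in self.model.policy.named_parameters():
                    # Gradients are None until the first optimization pass.
                    if param.grad is not None:
                        wandb.log(
                            {f"grad_norm/{name}": param.grad.norm().item()},
                            step=self.num_timesteps,
                        )
            return True

If these norms vary while the WandbCallback histograms stay flat, that would point to a logging problem rather than genuinely constant gradients.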

This should be the relevant part of my code:

    import wandb
    from wandb.integration.sb3 import WandbCallback
    from stable_baselines3.common.callbacks import CallbackList

    config = {
        "total_timesteps": model_parameters["n_steps"] * num_cpu * 5,
        "log_interval": 1,
    }

    run = wandb.init(
        project="MyProject",
        config=config,          # record the run configuration
        sync_tensorboard=True,  # auto-upload sb3's tensorboard metrics
        save_code=True,         # optional
        name=run_name,          # optional
    )

    wandbCb = WandbCallback(
        gradient_save_freq=1,
        model_save_path=f"models/{run_name}",
        verbose=2,
    )
    # RewardCallback is a custom callback defined elsewhere in my code.
    RewardCb = RewardCallback(eval_freq=model_parameters["n_steps"] * num_cpu)
    callbacks = CallbackList([
        wandbCb,
        RewardCb,
    ])

    print("Learning...")
    model.learn(
        total_timesteps=config["total_timesteps"],
        log_interval=config["log_interval"],
        progress_bar=True,
        callback=callbacks,
    )
    run.finish()
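If I understand the integration correctly, gradient_save_freq in WandbCallback is essentially a wrapper around wandb.watch on the policy network, so the watch call can also be made directly. A minimal sketch, assuming model.policy is the torch module being optimized (note that log_freq counts backward passes rather than environment steps, and 100 is just a placeholder value):

    # Instead of (or in addition to) WandbCallback's gradient hook:
    wandb.watch(
        model.policy,   # the torch module SB3 optimizes
        log="all",      # record both gradient and parameter histograms
        log_freq=100,   # hypothetical value; counts backward passes
    )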

Does anyone have insight into why these gradients don't change? Thank you.
