How to define an MDP as a Python function?


I’m interested in defining a Markov Decision Process (MDP) as a Python function. It needs to interface with the PyTorch API for reinforcement learning, in whatever way that constraint shapes the function’s form, inputs, and outputs.

For context, my problem involves optimally placing items in a warehouse without knowing the value of items that might arrive later. Anticipating these arrivals would limit the algorithm’s greedy behavior, effectively reserving some high-value locations for the high-value items that the RL model has learned might arrive.

How can I best define such a function? (I’m not asking about the business logic, but about the requirements on its form, inputs, outputs, etc.) What does PyTorch expect of an MDP?


There is 1 best solution below

jbuddy_13 On
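  1. Use CleanRL for the PyTorch side: its training scripts run against Gymnasium-style environments, so PyTorch itself never sees the MDP directly.
  2. Write your MDP as a custom environment using Gymnasium: https://gymnasium.farama.org/tutorials/gymnasium_basics/environment_creation.html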
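
To make the second point concrete, here is a minimal sketch of the shape a Gymnasium-style environment takes: `reset()` returns `(observation, info)` and `step(action)` returns `(observation, reward, terminated, truncated, info)`. Everything specific to the warehouse (the `WarehouseEnv` name, `slot_values`, `max_item_value`, the reward rule, the penalty for an occupied slot) is a made-up assumption for illustration, not the asker's business logic. It is written in plain Python so it runs without extra packages; a real Gymnasium environment would additionally subclass `gymnasium.Env` and declare `observation_space` and `action_space`, as the linked tutorial describes.

```python
import random


class WarehouseEnv:
    """Toy warehouse-placement MDP with Gymnasium-style reset() and step().

    All names and numbers here are hypothetical placeholders.
    """

    def __init__(self, slot_values=(3.0, 2.0, 1.0), max_item_value=5):
        self.slot_values = tuple(slot_values)    # assumed worth of each location
        self.max_item_value = max_item_value     # items are valued 1..max_item_value
        self.n_actions = len(self.slot_values)   # action = index of the slot to fill
        self._rng = random.Random()
        self._occupied = [0] * self.n_actions
        self._item = 0

    def _obs(self):
        # State: one occupancy flag per slot, then the value of the waiting item.
        return tuple(self._occupied) + (self._item,)

    def reset(self, seed=None):
        # Start a new episode with an empty warehouse and one waiting item.
        if seed is not None:
            self._rng = random.Random(seed)
        self._occupied = [0] * self.n_actions
        self._item = self._rng.randint(1, self.max_item_value)
        return self._obs(), {}

    def step(self, action):
        if self._occupied[action]:
            # Assumed rule: placing into a full slot is penalised and ends the episode.
            return self._obs(), -1.0, True, False, {"invalid": True}
        # Assumed reward: slot worth times item value, so good slots suit good items.
        reward = self.slot_values[action] * self._item
        self._occupied[action] = 1
        terminated = all(self._occupied)
        # The next item is only revealed after the current one has been placed.
        self._item = 0 if terminated else self._rng.randint(1, self.max_item_value)
        return self._obs(), reward, terminated, False, {}


def run_random_episode(env, seed=0):
    """Roll out one episode with a random policy over free slots; return total reward."""
    policy_rng = random.Random(seed)
    obs, _ = env.reset(seed=seed)
    total, done = 0.0, False
    while not done:
        free = [i for i in range(env.n_actions) if obs[i] == 0]
        obs, reward, terminated, truncated, _ = env.step(policy_rng.choice(free))
        total += reward
        done = terminated or truncated
    return total
```

The loop in `run_random_episode` is the part an RL library drives: it only ever calls `reset()` and `step()`, so the agent's network (in PyTorch or anything else) stays decoupled from how the MDP is implemented.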