I've got my HPA like this. The pods scale up fine, but they are not scaling down even though memory usage is below the threshold. The applications are not releasing memory: I have waited for the past 2 hours, but the apps still hold on to the memory and the pods are not coming down. How can I overcome this issue?
Currently there is not much load on memory, and there are no running queries in Trino.
After restarting the pods, the memory was released. How can the Trino application release memory within a few minutes?
I don't know anything about Trino, but lots of applications do not release memory, especially Java applications. Java applications (and fundamentally any application with a garbage collector) manage their own memory, so releasing memory back to the operating system is difficult and generally unproductive.
In general, I find HPA based on memory to be the wrong decision unless you really know what you are doing. I'd autoscale based on CPU if you need to autoscale.
EDIT: Some quick investigation does seem to show that Trino is a Java application, so in practice it will hold on to the memory it has claimed rather than hand it back. Memory-based HPA is not going to work here.
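To illustrate why the pods can stay up even when memory is "below threshold", here is a rough sketch of the replica calculation the HPA documentation describes: desired replicas = ceil(current replicas * current metric / target metric), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The function name, the 80% target, and the sample numbers are my own assumptions for illustration, not values from your HPA, and the real controller also applies things like the scale-down stabilization window that this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, tolerance=0.1):
    """Hypothetical helper approximating the documented HPA formula."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise round up, and never go below the configured minimum.
    return max(min_replicas, math.ceil(current_replicas * ratio))


# Assumed scenario: 6 pods, target memory utilization of 80%.
print(desired_replicas(6, 75, 80))  # inside tolerance: stays at 6
print(desired_replicas(6, 70, 80))  # ceil(6 * 0.875) = ceil(5.25) = 6
print(desired_replicas(6, 30, 80))  # ceil(6 * 0.375) = ceil(2.25) = 3
```

So with these assumed numbers, being a little under the target is not enough: the average has to fall far enough that the rounded-up result is a smaller number. If the JVM keeps its heap committed after the queries finish, the reported memory hardly moves and the count stays where it is.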