I have a deployed endpoint on Vertex AI with auto-scaling enabled, but I want to manually adjust the minimum and maximum replica counts for the deployed endpoint. How can I do this?
I was using this command from the console, but the "--min-replica-count" and "--max-replica-count" flags aren't available for the update command:
gcloud ai endpoints update #{model_name} --update-deployment #{model_name} --min-replica-count=#{new_node_count} --max-replica-count=#{new_node_count}