I'm currently porting my app from GAE Python 2 to Python 3. I'd like the process/threading and scaling characteristics in Python 3 to match the Python 2 behavior. Specifically, I want the number of processes, the number of threads, and the 60-second timeout to match.
In app.yaml I set:
entrypoint: gunicorn -b :$PORT main:app -t 60 -w 1 --threads 8
As shown, the timeout is 60 seconds.
I also set 1 worker with multiple threads, because multiple workers cause out-of-memory errors on requests, which did not occur in the Python 2 runtime. Furthermore, the Python 2 docs suggest that the runtime may simply have used 1 worker and multiple threads: https://cloud.google.com/appengine/docs/standard/python/config/appref
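For reference, the same settings could be kept in a gunicorn config file instead of on the command line. This is only a sketch: the file name `gunicorn.conf.py` and the 8080 fallback port are my assumptions, and the values simply mirror the flags in the entrypoint above.

```python
# gunicorn.conf.py (assumed file name; pass it to gunicorn with -c)
import os

# App Engine sets PORT at runtime; the 8080 fallback is an assumption
# that only matters when running locally.
bind = ":" + os.environ.get("PORT", "8080")

# One worker process, matching "-w 1".
workers = 1

# Eight threads in that worker, matching "--threads 8".
threads = 8

# Worker timeout in seconds, matching "-t 60".
timeout = 60
```

The entrypoint would then become `entrypoint: gunicorn -c gunicorn.conf.py main:app`. Behavior should be the same as with the inline flags; the file just makes the process, thread, and timeout choices easier to see and change.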
Am I on track here? Did GAE Python 2 in fact use 1 process and many threads in threadsafe mode?
I believe Python 2.7 supported concurrent requests, based on this documentation.
Regarding your comment that multiple workers cause out-of-memory errors: each gunicorn worker is a separate process with its own memory, so if you increase the number of workers, you will probably have to use a higher instance class (see documentation).