How to share in-memory resources between different workers when using Flask with Gunicorn


I have a Flask app, and I am creating 4 workers with Gunicorn. On running this, I will have 4 Flask instances in total: 4 workers, each with its own PID. I want to share a Python dictionary across all of these workers.

I have tried using a list from a multiprocessing Manager, but it only works within the same PID; it doesn't work across different PIDs.

import multiprocessing
manager = multiprocessing.Manager()
shared_list = manager.list()

Every time the request lands on a worker with a different PID, I get a different result.
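As far as I can tell, each worker that imports the module starts its own Manager server, so each worker's list is a genuinely different object. A minimal sketch that makes this visible (assuming the default, non-preloaded Gunicorn setup):

import multiprocessing
import os

manager = multiprocessing.Manager()  # each importing worker starts its own Manager process
shared_list = manager.list()         # so every worker gets an independent list

# every worker prints its own PID and a *different* Manager address
print(f'worker pid={os.getpid()}, manager address={manager.address}')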

I would prefer to do this without using a database, Redis, or anything similar.

Answered by Danila Ganchar:

You can use multiprocessing.Manager() together with Gunicorn's --preload flag. With --preload, Gunicorn imports the application module once in the master process before forking the workers, so every worker inherits a proxy pointing at the same Manager server process. Here is an example:

# debug.py
import multiprocessing
import os
import random

from flask import Flask, jsonify

app = Flask(__name__)
shared_list = multiprocessing.Manager().list()  # created once in the master thanks to --preload; dict() etc. also work


@app.route('/generate')
def generate():
    # generate random number and append to shared list
    value = random.randint(1, 9999)
    shared_list.append(value)
    return jsonify(dict(
        pid=os.getpid(),
        value=value,
    ))


@app.route('/status')
def status():
    # return the sum of the shared list, tagged with this worker's PID
    return jsonify(dict(
        pid=os.getpid(),
        total=sum(shared_list),
    ))
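Since the question asks for a dictionary: the Manager exposes dict() as well, and the same --preload pattern applies. A minimal sketch (assuming the same setup as debug.py; shared_dict and remember are illustrative names):

import multiprocessing

shared_dict = multiprocessing.Manager().dict()  # proxy to a dict living in the Manager process


def remember(key, value):
    # with --preload, every worker talks to the same Manager,
    # so this write is visible from all worker PIDs
    shared_dict[key] = value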

Run the app: gunicorn -w 4 --preload debug:app. Let's check the counters:

# test.py
import requests
import trio
import httpx


async def generate(ix: int):
    async with httpx.AsyncClient() as client:
        print(f'start api request {ix}')
        response = await client.get('http://localhost:8000/generate')
        print(f'api request {ix} done. {response.json()}')


async def run():
    async with trio.open_nursery() as nursery:
        for ix in range(100):
            nursery.start_soon(generate, ix)


# fire 100 concurrent requests to /generate
trio.run(run)

# poll /status until we've heard back from all 4 worker PIDs
stats = {}
while len(stats) < 4:
    data = requests.get('http://localhost:8000/status').json()
    pid = data['pid']
    stats[pid] = data['total']

print('_' * 100)
print(stats)

Run python test.py. You'll see that the total value is always the same across the different workers / PIDs:

...
api request 99 done. {'pid': 827939, 'value': 78}
api request 97 done. {'pid': 827940, 'value': 7324}
api request 96 done. {'pid': 827937, 'value': 8816}
____________________________________________________________________________________________________
{827938: 46634762, 827940: 46634762, 827939: 46634762, 827937: 46634762}
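One caveat: each .append() above is a single proxy call and is serialized by the Manager, but a read-modify-write sequence (read a value, compute, write it back) is not atomic across workers. A hedged sketch of guarding such updates with a Manager lock (shared, lock, and increment are illustrative names, not part of the answer):

import multiprocessing

manager = multiprocessing.Manager()
shared = manager.dict()
lock = manager.Lock()  # a lock proxy usable from every worker process


def increment(key):
    # serialize the get + set so concurrent workers don't lose updates
    with lock:
        shared[key] = shared.get(key, 0) + 1

Note also that preloading generally doesn't play well with Gunicorn's code-reloading during development, so this pattern is best suited to production-style runs.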