Why does Celery base Task have same instance while using execution pool as gevent?

29 Views Asked by At

Assume I have these celery tasks

Class BaseExampleTask(Task): 
    def __call__(self, *args, **kwargs): 
        self.task_name = self.name
        ctx = kwargs.get('ctx', {})
        self.xxx = ctx.get('XXX', None)


@app.task(name='discovery_task', queue='discovery_queue', base=BaseExampleTask, bind=True)
def discovery_task(self, ctx={}):
    // do some stuffs
    print(self.xxx)

I have several of these celery tasks that are all I/O intensive, so I start them by

celery -A django_app worker --concurrency=50 --pool=gevent --loglevel=info -Q discovery_queue -n discovery_worker

The problem is, I when I run multiple task, I always get the same instance of self.xxx in discovery_task(), to be precise, I always get the self instance of currently executing celery task. These multiple tasks are called using group() to run in parallel as below

grouped_tasks = []
for i in range(1, 10):
    kwargs = {}
    task = initiate_scan.si(**kwargs)
    grouped_tasks.append(task)
celery_group = group(grouped_scans)
celery_group.apply_async()

Why is that all discovery_tasks that are currently running have same instance of BaseExampleTask if --pool=gevent is used. when I remove --pool=gevent, they all have difference instance of BaseExampleTask thereby having different self.

How do I fix this, and since my tasks are I/O bound, how can I still use gevent and have different instance of BaseExampleTask?

I am quite confused tho, when I remove --pool=gevent, they all have difference instance of BaseExampleTask thereby having different self, which is what I am expecting.

0

There are 0 best solutions below