Assume I have these Celery tasks:

    from celery import Task

    class BaseExampleTask(Task):
        def __call__(self, *args, **kwargs):
            # Per-call values are stored on the task instance itself.
            self.task_name = self.name
            ctx = kwargs.get('ctx', {})
            self.xxx = ctx.get('XXX', None)
            # Run the actual task body.
            return super().__call__(*args, **kwargs)

    @app.task(name='discovery_task', queue='discovery_queue', base=BaseExampleTask, bind=True)
    def discovery_task(self, ctx={}):
        # do some stuff
        print(self.xxx)

I have several of these Celery tasks, all of them I/O intensive, so I start the worker with:

    celery -A django_app worker --concurrency=50 --pool=gevent --loglevel=info -Q discovery_queue -n discovery_worker

The problem is that when I run multiple tasks, I always get the same self.xxx in discovery_task(). To be precise, self is always the same instance: the one belonging to the currently executing Celery task. The tasks are launched with group() to run in parallel, as below:

    from celery import group

    grouped_tasks = []
    for i in range(1, 10):
        kwargs = {}
        task = initiate_scan.si(**kwargs)
        grouped_tasks.append(task)
    # Build the group from the list populated above.
    celery_group = group(grouped_tasks)
    celery_group.apply_async()

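
To make the symptom concrete, here is a minimal sketch that reproduces the same kind of overwriting without Celery or gevent. It assumes, as the Celery documentation describes, that a task class is instantiated once per worker process and reused for every call. `SharedTask` and `run_interleaved` are hypothetical names, and a generator `yield` stands in for the point where gevent would switch greenlets during I/O.

```python
class SharedTask:
    """Hypothetical stand-in for a task instance that is reused for every call."""

    def __call__(self, ctx):
        # Per-call state stored on the shared instance, as in BaseExampleTask.
        self.xxx = ctx.get("XXX")
        yield  # stand-in for an I/O wait where another greenlet gets to run
        return self.xxx


def run_interleaved(task, contexts):
    """Start every call, then resume each one, mimicking cooperative switching."""
    calls = [task(ctx) for ctx in contexts]
    for call in calls:
        next(call)  # run each call up to its simulated I/O wait
    results = []
    for call in calls:
        try:
            next(call)  # resume after the simulated I/O wait
        except StopIteration as done:
            results.append(done.value)
    return results


task = SharedTask()
results = run_interleaved(task, [{"XXX": "a"}, {"XXX": "b"}, {"XXX": "c"}])
print(results)  # ['c', 'c', 'c']
```

Every simulated call ends up reading the value written by the last call to start, which matches what I see in discovery_task() under --pool=gevent.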
Why do all currently running discovery_task invocations share the same instance of BaseExampleTask when --pool=gevent is used? When I remove --pool=gevent, each one gets a different instance of BaseExampleTask, and therefore a different self, which is what I expect.
How do I fix this? Since my tasks are I/O bound, how can I keep using gevent and still have a different instance of BaseExampleTask for each running task?
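
One possible direction, sketched as a plain-Python simulation rather than against real Celery and gevent: keep per-invocation values in local variables instead of on `self`, so that a shared instance no longer matters. `LocalStateTask` and `run_interleaved` are hypothetical names, and the generator `yield` is only an assumed stand-in for a greenlet switch.

```python
class LocalStateTask:
    """Hypothetical stand-in for a task instance shared by every call."""

    def __call__(self, ctx):
        # Keep per-call state in a local variable instead of on self.
        xxx = ctx.get("XXX")
        yield  # stand-in for an I/O wait where another greenlet gets to run
        return xxx


def run_interleaved(task, contexts):
    """Start every call, then resume each one, mimicking cooperative switching."""
    calls = [task(ctx) for ctx in contexts]
    for call in calls:
        next(call)  # run each call up to its simulated I/O wait
    results = []
    for call in calls:
        try:
            next(call)  # resume after the simulated I/O wait
        except StopIteration as done:
            results.append(done.value)
    return results


results = run_interleaved(LocalStateTask(), [{"XXX": "a"}, {"XXX": "b"}, {"XXX": "c"}])
print(results)  # ['a', 'b', 'c']
```

Applied to the tasks above, this would mean reading `ctx.get('XXX')` inside `discovery_task` rather than in `BaseExampleTask.__call__`. Whether that fits depends on what else relies on `self.xxx`.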