I am trying to run several spiders from a cron-based task.
I want to use scrapyd arguments so that the same spider can run with different settings.
What I am trying to do is:
I am trying to set RETRY_HTTP_CODES = [500, 503, 504, 400, 403, 404, 408] and a custom Spidermon monitor, SPIDERMON_SPIDER_CLOSE_MONITORS = ('crawler.monitors.SpiderCloseMonitorSuite',), within my scrapyd schedule curl call.
Somehow the custom settings are not picked up, though.
I played around with escaping and other bash quoting tricks, but nothing worked, so I am starting to wonder whether this is possible at all. This is the call:
curl http://localhost:6800/schedule.json -d project=M0 -d spider=m_pp -d setting=LOG_LEVEL='DEBUG' -d setting=RETRY_HTTP_CODES=[500,503,504,400,403,404,408] -d setting=SPIDERMON_SPIDER_CLOSE_MONITORS="('crawler.monitors.SpiderCloseMonitorSuite',)" -d _version="r857-M360-416-disable-c"
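For what it's worth, my understanding (an assumption, not verified against my setup) is that Scrapy's getlist() splits a plain comma-separated string on commas, so a list setting may need to be sent without brackets at all. A minimal sketch of how the request body would then look, built with Python's stdlib:

```python
# Sketch: build the schedule.json POST body, passing the list setting
# as a bare comma-separated string (assumption: Scrapy splits it itself).
from urllib.parse import urlencode

params = [
    ("project", "M0"),
    ("spider", "m_pp"),
    ("setting", "LOG_LEVEL=DEBUG"),
    # No brackets, no spaces around the commas.
    ("setting", "RETRY_HTTP_CODES=500,503,504,400,403,404,408"),
    ("_version", "r857-M360-416-disable-c"),
]

body = urlencode(params)
print(body)
```

Repeated ("setting", ...) tuples mirror repeated -d setting=... flags in curl; scrapyd should receive them as multiple values for the same key.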
Any help or workaround is welcome.
This method requires Python ≥ 3.8.
The approach I use is to pass the flag as a spider argument instead of a setting:
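The code snippet for this step appears to be missing from the post. As a hedged reconstruction: scrapyd forwards any schedule.json parameter it does not recognize (here test=True) to the spider as a keyword argument, always as a string, so the spider can branch on it in __init__. A minimal sketch, with FakeSpider standing in for scrapy.Spider so the snippet runs without Scrapy installed, and the allowed_domains toggle purely hypothetical:

```python
# Sketch only: FakeSpider is a stand-in for scrapy.Spider; in a real
# project, MySpider would subclass scrapy.Spider instead.
class FakeSpider:
    def __init__(self, **kwargs):
        # Scrapy stores extra spider arguments as instance attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)


class MySpider(FakeSpider):
    name = "my_crawler"

    def __init__(self, test="False", **kwargs):
        super().__init__(**kwargs)
        # The argument arrives as the string "True", not a boolean.
        self.test = test == "True"
        if self.test:
            # Hypothetical example: restrict the crawl during test runs.
            self.allowed_domains = ["example.com"]


spider = MySpider(test="True")
print(spider.test)
```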
Then I run:
curl http://localhost:6800/schedule.json -d test=True -d project=mycrawler -d spider=my_crawler