Let's say that I need to do complex calculations for 100 users. My current configuration looks like this:
producer
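class Producer
  class << self
    # Publishes options as JSON to the fanout exchange for the given target.
    def publish(target, options = {})
      connection = Bunny.new(some_params).start
      channel = connection.create_channel
      exchange = channel.fanout("#{target}_exchange", durable: true)
      exchange.publish(options.to_json)
    ensure
      # Close the connection so repeated publishes do not leak connections.
      connection&.close
    end
  end
end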
MassComplexCalculations worker
module UsersWorkers
  class MassComplexCalculations
    include Sneakers::Worker
    from_queue "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)
      ActiveRecord::Base.connection_pool.with_connection do
        User.where(id: parsed_options['ids']).each do |user|
          ::Services::Users::ComplexCalculations.call(user)
        end
      end
      ack!
    end
  end
end
run worker
Producer.publish("#{ENV['RAILS_ENV']}.users.mass_complex_calculations", ids: User.limit(100).ids)
I don't quite understand how AMQP allocates resources to perform tasks, or how I can help it do so. Would it be better to run each calculation in a separate worker? For example:
CHANGED MassComplexCalculations worker
module UsersWorkers
  class MassComplexCalculations
    include Sneakers::Worker
    from_queue "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)
      ActiveRecord::Base.connection_pool.with_connection do
        parsed_options['ids'].each do |id|
          Producer.publish("#{ENV['RAILS_ENV']}.users.personal_complex_calculations", id: id)
        end
      end
      ack!
    end
  end
end
NEW PersonalComplexCalculations worker
module UsersWorkers
  class PersonalComplexCalculations
    include Sneakers::Worker
    from_queue "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)
      ActiveRecord::Base.connection_pool.with_connection do
        # Load the user inside the block so the query uses the checked-out connection.
        user = User.find(parsed_options['id'])
        ::Services::Users::ComplexCalculations.call(user)
      end
      ack!
    end
  end
end
In my understanding, there may be two options:
- the first implementation may be slower because it calls the service for each user sequentially, while the second gives us 100 workers running simultaneously and doing their jobs in parallel
- there is no difference
So which approach is better? Or maybe even one of them is completely wrong?
Thanks in advance.
Neither of your assumptions holds. You are not guaranteed to get 100 parallel workers, because Sneakers has a default thread pool size that you are not necessarily overriding:
https://github.com/jondot/sneakers/blob/master/lib/sneakers/worker.rb#L20
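As a rough sketch of overriding that default for a single worker: from_queue accepts threads and prefetch options (both appear in the Sneakers README). The value 25 below is purely illustrative, not a recommendation, and the fragment is meant to live inside the Rails app from the question rather than run on its own.

```ruby
require "sneakers"
require "json"

module UsersWorkers
  class PersonalComplexCalculations
    include Sneakers::Worker

    # threads:  size of this worker's thread pool.
    # prefetch: how many unacknowledged messages the broker may deliver
    #           to this worker at once.
    # 25 is an illustrative value only.
    from_queue "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_exchange",
               threads: 25,
               prefetch: 25

    def work(options)
      parsed_options = JSON.parse(options)
      ActiveRecord::Base.connection_pool.with_connection do
        user = User.find(parsed_options['id'])
        ::Services::Users::ComplexCalculations.call(user)
      end
      ack!
    end
  end
end
```

Even with these options set, the number of messages processed at the same time is capped by the thread count per process, not by the number of messages published.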
Also, unless your ActiveRecord connection pool is configured with at least 100 connections, your threads will block and wait for a free connection (resource starvation) inside with_connection.
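The starvation effect can be illustrated without Rails. The sketch below is a simulation using only the Ruby standard library: TinyPool and peak_concurrency are hypothetical names invented for this example, and the pool is just a queue of tokens standing in for database connections. In a real Rails app the pool size comes from the pool key in config/database.yml.

```ruby
# Hypothetical stand-in for a connection pool: a queue holding `size` tokens.
class TinyPool
  def initialize(size)
    @tokens = Queue.new
    size.times { |i| @tokens << i }
  end

  # Blocks until a token is free, much like checking out a DB connection.
  def with_connection
    token = @tokens.pop
    yield token
  ensure
    @tokens << token if token
  end
end

# Starts thread_count threads that all want a connection and returns the
# highest number of threads that held one at the same time.
def peak_concurrency(pool_size:, thread_count:)
  pool = TinyPool.new(pool_size)
  lock = Mutex.new
  active = 0
  peak = 0

  threads = Array.new(thread_count) do
    Thread.new do
      pool.with_connection do
        lock.synchronize do
          active += 1
          peak = active if active > peak
        end
        sleep 0.01 # pretend to run the calculation
        lock.synchronize { active -= 1 }
      end
    end
  end
  threads.each(&:join)
  peak
end

puts "peak concurrent holders: #{peak_concurrency(pool_size: 5, thread_count: 20)}"
```

However many threads are started, the peak never exceeds the pool size; the remaining threads simply wait their turn.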
In general, doing this sort of work in parallel is likely to be faster most of the time, but this is not guaranteed.
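One way to check rather than assume is to time both shapes on a representative workload. The sketch below uses only the standard library; fake_calculation is a hypothetical stand-in for the real service and sleeps to mimic waiting on I/O, so the numbers it prints say nothing about your actual calculations.

```ruby
require "benchmark"

# Hypothetical stand-in for the real service: sleeps to mimic waiting on I/O.
def fake_calculation(id)
  sleep 0.01
  id * 2
end

# Processes every id one after another, like the single mass worker.
def run_sequential(ids)
  ids.map { |id| fake_calculation(id) }
end

# Processes ids with a fixed number of threads pulling from a shared queue.
# Results come back in completion order, not input order.
def run_threaded(ids, threads:)
  jobs = Queue.new
  ids.each { |id| jobs << id }
  jobs.close # pop returns nil once the queue is closed and empty
  results = Queue.new

  workers = Array.new(threads) do
    Thread.new do
      while (id = jobs.pop)
        results << fake_calculation(id)
      end
    end
  end
  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

ids = (1..40).to_a
sequential = Benchmark.realtime { run_sequential(ids) }
threaded   = Benchmark.realtime { run_threaded(ids, threads: 8) }
puts format("sequential: %.3fs, threaded: %.3fs", sequential, threaded)
```

Keep in mind that in CRuby, threads within one process do not execute Ruby code in parallel, so a CPU-bound calculation will gain much less from extra threads than this sleep-based stand-in does.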