I'm writing a background job to process a large number of users and create a UserFlag for each one. I have 18,000 users to process. I currently have the user ids, and I've written the job like so:
class BatchProcessUsersJob < ApplicationJob
  queue_as :default

  def perform
    user_ids = [
      # 18,000 ids
    ]

    UserFlag.add_flag_to_users(user_ids, UserFlag::PRODUCT_TYPE_2)
  end
end
And on the UserFlag model:
def self.add_flag_to_users(user_ids, flag)
  users_with_flag = UserFlag.where(user_id: user_ids).where(flag: flag).pluck(:user_id)
  users_without_flags = user_ids - users_with_flag
  return if users_without_flags.empty?

  users_without_flags.uniq.each do |user_id|
    UserFlag.find_or_create_by(user_id: user_id, flag: flag)
  end
end
My question is whether the job can safely handle 18,000 users this way. It doesn't seem safe, and it feels susceptible to timing out, or something else. It also feels very messy to process that many users in one shot. Is there a different way I should be approaching this?
You can split your 18k-id array into multiple groups and pass each group to the `add_flag_to_users` method for batch processing. To split the `user_ids` array you can use either Ruby's `each_slice` method or Rails' `in_groups_of` method.
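For example, using `each_slice` with groups of 500 ids (the batch size is arbitrary; tune it for your workload), the job could be rewritten along these lines. This is a sketch: the `ApplicationJob` and `UserFlag` definitions here are minimal stand-ins so it runs outside Rails, and the stubbed `add_flag_to_users` just records the batches it receives.

```ruby
# Minimal stand-ins so this sketch runs outside Rails; in the real app
# these are ActiveJob's ApplicationJob and your UserFlag model.
class ApplicationJob
  def self.queue_as(*); end
end

class UserFlag
  PRODUCT_TYPE_2 = :product_type_2

  def self.batches_processed
    @batches_processed ||= []
  end

  # Stub: records the size of each batch instead of touching the DB.
  def self.add_flag_to_users(user_ids, flag)
    batches_processed << user_ids.length
  end
end

class BatchProcessUsersJob < ApplicationJob
  queue_as :default

  BATCH_SIZE = 500

  def perform
    user_ids = (1..18_000).to_a # stand-in for the real list of 18,000 ids

    # Split the ids into groups of 500 and process each group separately,
    # so no single call to add_flag_to_users handles all 18,000 at once.
    user_ids.each_slice(BATCH_SIZE) do |batch|
      UserFlag.add_flag_to_users(batch, UserFlag::PRODUCT_TYPE_2)
    end
  end
end

BatchProcessUsersJob.new.perform
puts UserFlag.batches_processed.length # 36 batches of 500 ids each
```

If you prefer `in_groups_of`, note that it pads the last group with `nil` by default; pass `false` as the second argument (`user_ids.in_groups_of(500, false)`) to avoid passing nil ids into your query.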