I need to be able to read the first (header) row of a big xlsx file (350k x 12 cells, ~30MB) very fast in a Ruby on Rails app. I am using the Roo gem at the moment, which is fine for smaller files, but for files this big it takes 3-4 minutes. Is there a way to do this in seconds?
xlsx = Roo::Spreadsheet.open(file_path)
sheet = xlsx.sheet(0)
header = sheet.row(1)
Edit:
- I tried other gems:
  - rubyXL took several minutes
  - creek was the fastest at 30s, but that is still unusable in a controller
Edit2:
- I ended up using creek in a job and polling for the result in the controller. Thanks to Tom Lord for suggesting creek.
The Ruby gem roo does not support file streaming; it reads the whole file into memory. As you say, that works fine for smaller files, but not so well for reading small sections of huge files.
You need to use a different library/approach. For example, you can use the gem creek, which describes itself as a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
Taking the example from the project's README, it's pretty straightforward to translate the code you wrote for roo into code that uses creek.
Note: A quick google of your StackOverflow question title led me to this blog post as the top search result. It's always worth searching on Google first.
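For the translation from roo to creek mentioned above, a minimal sketch might look like the following. It assumes the creek gem is installed and relies on the Creek::Book.new, sheets and rows calls shown in creek's README; the helper names first_row_values and xlsx_header are hypothetical and only there for illustration.

```ruby
# Sketch only: assumes the creek gem is available (gem 'creek' in the Gemfile).
# first_row_values and xlsx_header are hypothetical helper names.

# Returns the cell values of the first row yielded by `rows`.
# `rows` is any Enumerator of Hashes mapping cell reference => value,
# e.g. { "A1" => "id", "B1" => "name" }. Enumerator#first stops after
# one element, so the remaining rows are never parsed.
def first_row_values(rows)
  first = rows.first
  first ? first.values : []
end

# Opens the workbook with creek and reads only the header row of the
# first sheet.
def xlsx_header(file_path)
  require 'creek' # required here so the helper above works without the gem
  book = Creek::Book.new(file_path)
  sheet = book.sheets[0]
  first_row_values(sheet.rows)
end
```

Usage would be header = xlsx_header(file_path). Only the first row is parsed, but opening the archive still has a cost of its own (the question reports about 30s for this file), so the call may still belong in a background job rather than in a controller action.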