I'm trying to get spreadsheet data from zipped .xlsx files. I'm using rubyzip to access the contents of the zip file:
Zip::File.open(file_path) do |zip_file|
  zip_file.each do |entry|
    # process entry
  end
end
My problem is that rubyzip gives me a Zip::Entry object, which I can't get to work with gems like roo or creek.
I've done something similar, but with a .csv file. That was as simple as CSV.parse(entry.get_input_stream.read). However, reading an .xlsx entry the same way just gives me a string of encoded gibberish.
I've looked around and the closest answer I got was temporarily extracting the files, but I want to avoid doing this since the files can get pretty large.
Does anyone have any suggestions? Thanks in advance.
So what you need to do is convert the stream into an IO object that Roo can understand.
To determine whether the object passed to Roo::Spreadsheet.open is a "stream", Roo checks whether it responds to seek.
Since a Zip::InputStream does not respond to seek, you cannot use this object directly. To get around this, we simply need an object that does respond to seek (like a StringIO).
We can just read the input stream into the StringIO directly, with StringIO.new(entry.get_input_stream.read).
Or the Zip library also provides a method to copy a Zip::InputStream to another IO object through the IOExtras module (IOExtras.copy_stream), which I think reads fairly nicely as well.
Knowing all of the above, we can implement this by wrapping each entry's input stream in a StringIO inside the loop from the question and passing that StringIO to Roo::Spreadsheet.open.
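A minimal sketch of that approach, assuming the rubyzip and roo gems are installed. The helper names seekable_copy and each_spreadsheet_in_zip are made up for illustration and are not part of either library, and the sketch uses the plain read variant rather than IOExtras:

```ruby
require 'stringio'

# Hypothetical helper (not part of rubyzip or Roo): reads a non-seekable
# stream fully into memory and returns a StringIO, which responds to seek.
def seekable_copy(input_stream)
  StringIO.new(input_stream.read)
end

# Hypothetical helper: yields the entry name and a Roo spreadsheet for
# every .xlsx entry in the archive.
def each_spreadsheet_in_zip(file_path)
  # Required lazily so seekable_copy stays usable without the gems installed.
  require 'zip' # rubyzip
  require 'roo'

  Zip::File.open(file_path) do |zip_file|
    zip_file.each do |entry|
      # Skip directories and anything that is not an .xlsx file.
      next unless entry.file? && File.extname(entry.name).casecmp?('.xlsx')

      stream = seekable_copy(entry.get_input_stream)
      # A stream has no file name, so tell Roo the format explicitly.
      yield entry.name, Roo::Spreadsheet.open(stream, extension: :xlsx)
    end
  end
end

# Usage (the archive name is a placeholder):
# each_spreadsheet_in_zip('reports.zip') do |name, workbook|
#   puts "#{name}: #{workbook.sheets.inspect}"
# end
```

Note that this holds each entry's full contents in memory while it is being processed, so for very large workbooks memory use grows with the size of the entry.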