I have a project where I need to copy the contents of the .xlsx file I received in Web API Controller (in the form of the Stream from MultipartReader) to SQL Server Database. I'm using SqlBulkCopy for copying itself (I already did a similar task for .csv files), but all of the solutions I was able to find suffer from one or more of the following problems:
- Require saving the file to the disk first (not possible in my case)
- Don't have any way of reading the file asynchronously
- Load entire file into memory first (I'm expecting to deal with fairly large files, so this is not acceptable for me)
- Are commercially licensed
Are there any ways of doing this?
Jeroen is correct, in that it is not possible to handle Excel files in a purely streaming manner. While it might require loading the entire .xlsx file in memory, the efficiency of the library can have an even larger impact on the memory usage than the file size. I say this as the author of the most efficient Excel reader for .NET: Sylvan.Data.Excel.
In benchmarks comparing it to other libraries, you can see that not only is it significantly faster than other implementations, but it also uses only a tiny fraction of the memory that other libraries consume.
With the exception of "Load entire file into memory first", it should satisfy all of your requirements. It can process data out of a
MemoryStream, it doesn't need to write to disk. It implementsDbDataReaderwhich providesReadAsync. TheReadAsyncimplementation defaults to the base DbDataReader implementation which defers to the synchronousRead()method, but when the file is buffered in aMemoryStreamthis doesn't present a problem, and allows theSqlBulkCopy.WriteToServerAsyncto process it asynchronously. Finally, it is MIT licensed, so you can do whatever you want with it.The most complicated part of this is probably the "schema provider" which is used to provide header name mappings and define the column types, which are required for SqlBulkCopy to operate correctly.
I also maintain the Sylvan.Data.Csv library, which provides very similar capabilities for CSV files, and is a fully asynchronous streaming CSV reader impelementation. The API it provides is nearly identical to the Sylvan ExcelDataReader. It is also the fastest CSV reader for .NET.
If you end up trying these libraries and have any troubles, open an issue in the github repo and I can take a look.