I'm programming an application in vb.net (.Net framework >= 4.5) that retrieves data from a database through a web service (as a datatable object) and parses it in an Excel file (unfortunately) with a specific format through the interop library.
The application was working fine but recently I started to get this OutOfMemory error:
Exception from HRESULT: 0x8007000E (E_OUTOFMEMORY)
The error happens when I parse the data from a datatable to an excel.range object (rng) using:
rng.value = ConvertDataTableTo2DArray(DT)
The function ConvertDataTableTo2DArray takes a datatable object and returns a 2d array with the information. I've been doing this for a while because it seems to be the fastest way to do this kind of operation (from what I've researched online).
Public Shared Function ConvertDataTableTo2DArray(dataTable As DataTable) As Object(,)
Dim rows As Integer = dataTable.Rows.Count
Dim columns As Integer = dataTable.Columns.Count
Dim dataArray(rows - 1, columns - 1) As Object
For i As Integer = 0 To rows - 1
For j As Integer = 0 To columns - 1
dataArray(i, j) = dataTable.Rows(i)(j)
Next
Next
Return dataArray
End Function
Consulting other questions posted here and there, I've found out that Excel.Range objects have a limited size, so my initial approach to solving the issue was to divide the datatable by batches of x amount of rows and declare again a new range for each batch. This does not solve the issue at all, so either I'm not clearing the data previously used correctly or the cause of the error is not exactly what I thought it was.
Some additional information:
- I'm working with rows of 22 columns.
- The error seems to happen around row 120.
- I've tried batches of 100, 50 and 20 rows with the same error in the same group of rows (i.e. for 20 row/batch it happened on the 6th batch).
- I'm using a Dell laptop with an I7 and 32GB of RAM. I've checked and the process fails with more than 10 GB of unused RAM.
My code (unfortunately names are in spanish):
Public Shared Sub CopiarDataTableAExcel(dataTable As DataTable, excelHoja As Excel.Worksheet, inicioCelda As String)
' Obtener el rango de inicio en base a la celda especificada
Dim rangoInicio As Excel.Range = excelHoja.Range(inicioCelda)
' Obtener el rango final en base al tamaño de la DataTable
Dim rangoFin As Excel.Range = excelHoja.Cells(rangoInicio.Row + dataTable.Rows.Count - 1, rangoInicio.Column + dataTable.Columns.Count - 1)
' Obtener el rango completo que abarca desde el inicio hasta el fin
Dim rangoCompleto As Excel.Range = excelHoja.Range(rangoInicio, rangoFin)
Try
' Copiar los datos de la DataTable al rango completo en Excel
rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable)
Catch ex As Exception
If ex.Message.Contains("E_OUTOFMEMORY") Then
Dim filas_por_particion As Int16 = 20
Dim particiones As Int16 = Math.Ceiling(dataTable.Rows.Count / filas_por_particion)
For prt = 1 To particiones
Dim num_filas As Integer = filas_por_particion
If prt = particiones Then
num_filas = dataTable.Rows.Count - (prt * filas_por_particion) ' numero de filas menos las filas ya escritas
End If
'Dim nuevo_rango_inicio As Excel.Range = excelHoja.Range("A" & ((prt - 1) * num_filas + 2).ToString)
Dim nuevo_rango_inicio As Excel.Range = excelHoja.Cells((prt - 1) * num_filas + 2, 1)
Dim nuevo_rango_fin As Excel.Range = excelHoja.Cells(nuevo_rango_inicio.Row + num_filas, nuevo_rango_inicio.Column + dataTable.Columns.Count - 1)
Dim nuevo_rango As Excel.Range = excelHoja.Range(nuevo_rango_inicio, nuevo_rango_fin)
Dim auxDt As DataTable = dataTable.Clone
For fila = (prt - 1) * num_filas To prt * num_filas - 1
auxDt.ImportRow(dataTable.Rows(fila))
Next
nuevo_rango.Value = ConvertDataTableTo2DArray(auxDt)
Next
Else
MsgBox("Se ha producido el siguiente error: " & Chr(13) & ex.Message)
End If
End Try
End Sub
In the first section of the code up to rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable), you can find my initial approach. In the catch statement, I introduced this "batched" copy that still fails. From my understanding, each declaration of nuevo_rango* variables should clear the previous memory used so the new excel.range, nuevo_rango, should be using a new range with only x (num_filas) amount of rows and 22 columns. while debugging I've checked that my batched data is taking the expected size, that is why I don't understand the OutOfMemory error.
UPDATE:
Following @djv's advice and the information in the link he provided, I tried this:
Public Shared Sub CopiarDataTableAExcel(dataTable As DataTable, excelHoja As Excel.Worksheet, inicioCelda As String)
' Obtener el rango de inicio en base a la celda especificada
Dim rangoInicio As Excel.Range = excelHoja.Range(inicioCelda)
' Obtener el rango final en base al tamaño de la DataTable
Dim rangoFin As Excel.Range = excelHoja.Cells(rangoInicio.Row + dataTable.Rows.Count - 1, rangoInicio.Column + dataTable.Columns.Count - 1)
' Obtener el rango completo que abarca desde el inicio hasta el fin
Dim rangoCompleto As Excel.Range = excelHoja.Range(rangoInicio, rangoFin)
Try
' Copiar los datos de la DataTable al rango completo en Excel
rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable)
Catch ex As Exception
If ex.Message.Contains("E_OUTOFMEMORY") Then
Dim filas_por_particion As Int16 = 100
Dim particiones As Int16 = Math.Ceiling(dataTable.Rows.Count / filas_por_particion)
For prt = 1 To particiones
Dim num_filas As Integer = filas_por_particion
If prt = particiones Then
num_filas = (prt * filas_por_particion) - dataTable.Rows.Count ' numero de filas menos las filas ya escritas
End If
DT2Excel(excelHoja, num_filas, prt, dataTable)
GC.Collect()
GC.WaitForPendingFinalizers()
Next
Else
MsgBox("Se ha producido el siguiente error: " & Chr(13) & ex.Message)
End If
End Try
End Sub
Public Shared Function DT2Excel(exl As Excel.Worksheet, num_filas As Int16, num_iter As Int16, dt As DataTable) As Boolean
Try
Dim rng_inicio As New Tuple(Of Int16, Int16)((num_iter - 1) * num_filas + 2, 1)
Dim nuevo_rango_inicio As Excel.Range = exl.Cells(rng_inicio.Item1, rng_inicio.Item2)
Dim rng_fin As New Tuple(Of Int16, Int16)(nuevo_rango_inicio.Row + num_filas, nuevo_rango_inicio.Column + dt.Columns.Count - 1)
Dim nuevo_rango_fin As Excel.Range = exl.Cells(rng_fin.Item1, rng_fin.Item2)
Dim nuevo_rango As Excel.Range = exl.Range(nuevo_rango_inicio, nuevo_rango_fin)
Dim auxDt As DataTable = dt.Clone
For fila = (num_iter - 1) * num_filas To num_iter * num_filas - 1
auxDt.ImportRow(dt.Rows(fila))
Next
nuevo_rango.Value = ConvertDataTableTo2DArray(auxDt)
nuevo_rango = Nothing
Return True
Catch ex As Exception
Return False
End Try
End Function
It no longer throws an error but it is still not copying after row 104. I even encapsulated the part using the ranges inside another function to avoid the noted debugging errors but I feel like that will only work if I encapsulate all the Excel objects, which would defeat the purpose of the function.
It seems like the Excel.range object is still not freeing memory... I guess I could instantiate the workbook and open the file in every iteration but that defeats the purpose of doing the operation fast.
Try to release the ranges inside the function
It may not matter. Exiting the function should release the references and the GC should handle it properly, but it's worth a try.
Also inside the method
CopiarDataTableAExcelin the For loop, do the same. Use aFinallyblock to clear the references and GC. It's a little odd though, that you are seemingly doing the actual work in an exception handler after an out of memory exception.