How to avoid OutOfMemory errors while using excel.range in Interop.Excel

124 Views Asked by At

I'm programming an application in vb.net (.Net framework >= 4.5) that retrieves data from a database through a web service (as a datatable object) and parses it in an Excel file (unfortunately) with a specific format through the interop library.

The application was working fine but recently I started to get this OutOfMemory error:

Exception from HRESULT: 0x8007000E (E_OUTOFMEMORY)

The error happens when I parse the data from a datatable to an excel.range object (rng) using:

rng.value = ConvertDataTableTo2DArray(DT)

The function ConvertDataTableTo2DArray takes a datatable object and returns a 2d array with the information. I've been doing this for a while because it seems to be the fastest way to do this kind of operation (from what I've researched online).

        Public Shared Function ConvertDataTableTo2DArray(dataTable As DataTable) As Object(,)
            Dim rows As Integer = dataTable.Rows.Count
            Dim columns As Integer = dataTable.Columns.Count
            Dim dataArray(rows - 1, columns - 1) As Object

            For i As Integer = 0 To rows - 1
                For j As Integer = 0 To columns - 1
                    dataArray(i, j) = dataTable.Rows(i)(j)
                Next
            Next

            Return dataArray
        End Function

Consulting other questions posted here and there, I've found out that Excel.Range objects have a limited size, so my initial approach to solving the issue was to divide the datatable by batches of x amount of rows and declare again a new range for each batch. This does not solve the issue at all, so either I'm not clearing the data previously used correctly or the cause of the error is not exactly what I thought it was.

Some additional information:

  • I'm working with rows of 22 columns.
  • The error seems to happen around row 120.
  • I've tried batches of 100, 50 and 20 rows with the same error in the same group of rows (i.e. for 20 row/batch it happened on the 6th batch).
  • I'm using a Dell laptop with an I7 and 32GB of RAM. I've checked and the process fails with more than 10 GB of unused RAM.

My code (unfortunately names are in spanish):

        Public Shared Sub CopiarDataTableAExcel(dataTable As DataTable, excelHoja As Excel.Worksheet, inicioCelda As String)
            ' Obtener el rango de inicio en base a la celda especificada
            Dim rangoInicio As Excel.Range = excelHoja.Range(inicioCelda)

            ' Obtener el rango final en base al tamaño de la DataTable
            Dim rangoFin As Excel.Range = excelHoja.Cells(rangoInicio.Row + dataTable.Rows.Count - 1, rangoInicio.Column + dataTable.Columns.Count - 1)

            ' Obtener el rango completo que abarca desde el inicio hasta el fin
            Dim rangoCompleto As Excel.Range = excelHoja.Range(rangoInicio, rangoFin)
            Try
                ' Copiar los datos de la DataTable al rango completo en Excel
                rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable)
            Catch ex As Exception
                If ex.Message.Contains("E_OUTOFMEMORY") Then
                    Dim filas_por_particion As Int16 = 20
                    Dim particiones As Int16 = Math.Ceiling(dataTable.Rows.Count / filas_por_particion)
                    For prt = 1 To particiones
                        Dim num_filas As Integer = filas_por_particion
                        If prt = particiones Then
                            num_filas = dataTable.Rows.Count - (prt * filas_por_particion) ' numero de filas menos las filas ya escritas
                        End If
                        'Dim nuevo_rango_inicio As Excel.Range = excelHoja.Range("A" & ((prt - 1) * num_filas + 2).ToString)
                        Dim nuevo_rango_inicio As Excel.Range = excelHoja.Cells((prt - 1) * num_filas + 2, 1)
                        Dim nuevo_rango_fin As Excel.Range = excelHoja.Cells(nuevo_rango_inicio.Row + num_filas, nuevo_rango_inicio.Column + dataTable.Columns.Count - 1)

                        Dim nuevo_rango As Excel.Range = excelHoja.Range(nuevo_rango_inicio, nuevo_rango_fin)
                        Dim auxDt As DataTable = dataTable.Clone

                        For fila = (prt - 1) * num_filas To prt * num_filas - 1
                            auxDt.ImportRow(dataTable.Rows(fila))
                        Next
                        nuevo_rango.Value = ConvertDataTableTo2DArray(auxDt)
                    Next
                Else
                    MsgBox("Se ha producido el siguiente error: " & Chr(13) & ex.Message)
                End If
            End Try
        End Sub

In the first section of the code up to rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable), you can find my initial approach. In the catch statement, I introduced this "batched" copy that still fails. From my understanding, each declaration of nuevo_rango* variables should clear the previous memory used so the new excel.range, nuevo_rango, should be using a new range with only x (num_filas) amount of rows and 22 columns. while debugging I've checked that my batched data is taking the expected size, that is why I don't understand the OutOfMemory error.


UPDATE:

Following @djv's advice and the information in the link he provided, I tried this:

        Public Shared Sub CopiarDataTableAExcel(dataTable As DataTable, excelHoja As Excel.Worksheet, inicioCelda As String)
            ' Obtener el rango de inicio en base a la celda especificada
            Dim rangoInicio As Excel.Range = excelHoja.Range(inicioCelda)

            ' Obtener el rango final en base al tamaño de la DataTable
            Dim rangoFin As Excel.Range = excelHoja.Cells(rangoInicio.Row + dataTable.Rows.Count - 1, rangoInicio.Column + dataTable.Columns.Count - 1)

            ' Obtener el rango completo que abarca desde el inicio hasta el fin
            Dim rangoCompleto As Excel.Range = excelHoja.Range(rangoInicio, rangoFin)
            Try
                ' Copiar los datos de la DataTable al rango completo en Excel
                rangoCompleto.Value = ConvertDataTableTo2DArray(dataTable)
            Catch ex As Exception
                If ex.Message.Contains("E_OUTOFMEMORY") Then
                    Dim filas_por_particion As Int16 = 100
                    Dim particiones As Int16 = Math.Ceiling(dataTable.Rows.Count / filas_por_particion)
                    For prt = 1 To particiones
                        Dim num_filas As Integer = filas_por_particion
                        If prt = particiones Then
                            num_filas = (prt * filas_por_particion) - dataTable.Rows.Count ' numero de filas menos las filas ya escritas
                        End If
                        DT2Excel(excelHoja, num_filas, prt, dataTable)
                        GC.Collect()
                        GC.WaitForPendingFinalizers()
                    Next
                Else
                    MsgBox("Se ha producido el siguiente error: " & Chr(13) & ex.Message)
                End If
            End Try
        End Sub


        Public Shared Function DT2Excel(exl As Excel.Worksheet, num_filas As Int16, num_iter As Int16, dt As DataTable) As Boolean

            Try
                Dim rng_inicio As New Tuple(Of Int16, Int16)((num_iter - 1) * num_filas + 2, 1)
                Dim nuevo_rango_inicio As Excel.Range = exl.Cells(rng_inicio.Item1, rng_inicio.Item2)
                Dim rng_fin As New Tuple(Of Int16, Int16)(nuevo_rango_inicio.Row + num_filas, nuevo_rango_inicio.Column + dt.Columns.Count - 1)
                Dim nuevo_rango_fin As Excel.Range = exl.Cells(rng_fin.Item1, rng_fin.Item2)

                Dim nuevo_rango As Excel.Range = exl.Range(nuevo_rango_inicio, nuevo_rango_fin)
                Dim auxDt As DataTable = dt.Clone

                For fila = (num_iter - 1) * num_filas To num_iter * num_filas - 1
                    auxDt.ImportRow(dt.Rows(fila))
                Next
                nuevo_rango.Value = ConvertDataTableTo2DArray(auxDt)
                nuevo_rango = Nothing
                Return True
            Catch ex As Exception
                Return False
            End Try

        End Function

It no longer throws an error but it is still not copying after row 104. I even encapsulated the part using the ranges inside another function to avoid the noted debugging errors but I feel like that will only work if I encapsulate all the Excel objects, which would defeat the purpose of the function.

It seems like the Excel.range object is still not freeing memory... I guess I could instantiate the workbook and open the file in every iteration but that defeats the purpose of doing the operation fast.

2

There are 2 best solutions below

3
djv On

Try to release the ranges inside the function

Public Shared Function DT2Excel(exl As Excel.Worksheet, num_filas As Int16, num_iter As Int16, dt As DataTable) As Boolean
    Dim nuevo_rango_inicio As Excel.Range
    Dim nuevo_rango_fin As Excel.Range
    Dim nuevo_rango As Excel.Range
    Try
        Dim rng_inicio As New Tuple(Of Int16, Int16)((num_iter - 1) * num_filas + 2, 1)
        nuevo_rango_inicio = exl.Cells(rng_inicio.Item1, rng_inicio.Item2)

        Dim rng_fin As New Tuple(Of Int16, Int16)(nuevo_rango_inicio.Row + num_filas, nuevo_rango_inicio.Column + dt.Columns.Count - 1)
        nuevo_rango_fin = exl.Cells(rng_fin.Item1, rng_fin.Item2)

        nuevo_rango = exl.Range(nuevo_rango_inicio, nuevo_rango_fin)

        Dim auxDt As DataTable = dt.Clone

        For fila = (num_iter - 1) * num_filas To num_iter * num_filas - 1
            auxDt.ImportRow(dt.Rows(fila))
        Next
        nuevo_rango.Value = ConvertDataTableTo2DArray(auxDt)
        
        Return True
    Catch ex As Exception
        Return False
    Finally
        nuevo_rango = Nothing
        nuevo_rango_inicio = Nothing
        nuevo_rango_fin = Nothing
        GC.Collect()
        GC.WaitForPendingFinalizers()
    End Try

End Function

It may not matter. Exiting the function should release the references and the GC should handle it properly, but it's worth a try.

Also inside the method CopiarDataTableAExcel in the For loop, do the same. Use a Finally block to clear the references and GC. It's a little odd though, that you are seemingly doing the actual work in an exception handler after an out of memory exception.

1
Jouni K On

I had a similar situation and the same error. It turned out some of the data elements started with the = character, and this was causing the out of memory error. I fixed it by adding an apostrophe ' to the start of those strings before writing them to excel. E.g. =sample_text becomes '=sample_text. This way Excel doesn't attempt to interpret them as formulas, but as normal text strings instead.