C# open xml unable to read string values

53 Views Asked by At

Hi I am using open xml to read excel in my .Net application. I have method which accepts row and columns as input parameters and returns value for that particular cell. Below is my implementation.

public List<ServicePortDto> GetServicePorts(IFormFile formFile, Dictionary<string, int> starRowForPorts)
{
    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(formFile.OpenReadStream(), false))
    {
        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        List<int> portRows = new();
        int counter = 0;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        foreach (var sheet in sheets)
        {
            int startRowPort = starRowForPorts.Where(x=>x.Key == sheet.Name.Value).Select(x => x.Value).FirstOrDefault();
            string relationshipId = sheet.Id.Value;
            WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
            Worksheet workSheet = worksheetPart.Worksheet;
            SheetData sheetData = workSheet.GetFirstChild<SheetData>();
            IEnumerable<Row> rows = sheetData.Descendants<Row>();
            rows = rows.Where(x=>x.RowIndex >= startRowPort);
            foreach (Row row in rows)
            {
                if(row.RowIndex >= startRowPort)
                {
                    string portName = GetCell(workSheet, "B", startRowPort).InnerText;

                    if (portName != null)
                    {
                        counter = counter + 1;
                    }
                    else
                    {
                        break;
                    }
                }
            }
        }
    }
    return new List<ServicePortDto>();
}

Below is GetCell method.

private static Cell GetCell(Worksheet worksheet, string columnName, int rowIndex)
{
    Row row = worksheet.GetFirstChild<SheetData>().Elements<Row>().FirstOrDefault(r => r.RowIndex == rowIndex);

    if (row != null)
    {
        return row.Elements<Cell>().FirstOrDefault(c => string.Compare(c.CellReference.Value, columnName + rowIndex, true) == 0);
    }
    return null;
}

The problem is with above method is if there is any string data in cell then it reads as some integer. I am not sure why this is happening. Integer values reads properly. Only issue with String/Text fields. May I know what I am missing here? Can someone please help me with this? Any help would be appreciated. Thanks

2

There are 2 best solutions below

1
FortyTwo On BEST ANSWER

In Open XML, cell values are stored as shared strings or directly as inline strings, and you need to handle them accordingly.

private static string GetCellValue(Cell cell, SharedStringTablePart stringTablePart)
{
    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        if (stringTablePart != null)
        {
            SharedStringItem sharedStringItem = stringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(cell.InnerText));
            return sharedStringItem.Text?.Text;
        }
    }
    else if (cell.CellValue != null)
    {
        return cell.CellValue.Text;
    }
    return null;
}
2
Panagiotis Kanavos On

Instead of the Open XML SDK use one of the high level Excel libraries in NuGet like ExcelDataReader, EPPlus, ClosedXML and many others (some of them a lot faster). The Open XML SDK is very low level, almost at the level of XML elements.

Using EPPlus for example, all this code could be replaced with just :

using var fileStream=formFile.OpenReadStream();
using ExcelPackage package = new ExcelPackage(fileStream);
ExcelWorksheet sheet = package.Workbook.Worksheets[0];

var range=sheet.Cells["b:b"];
foreach(var cell in range)
{
    ...
}

You can use LINQ to count non-empty cells:

var filledCount=sheet.Cells["b:b"].Where(c=>c.Value!=null).Count();

ExcelDataReader creates an IDbDataReader wrapper over an Excel sheet that can be used to read data into a DataTable, sent to a database etc:

using var reader = ExcelReaderFactory.CreateReader(fileStream);
do
{
    while (reader.Read())
    {
        if (!reader.IsDbNull(1))
        {
        ...
        }
    }
}

You can use that reader with SqlBulkCopy to import the data into SQL Server:

using var bcp = new SqlBulkCopy(connection);
bcp.DestinationTableName = "SomeTable"; 
...
bcp.WriteToServer(reader);