how to improve PhpSpreadsheet data retrieve timing from local file?

127 Views Asked by At

I'm using PhpSpreadsheet to read a specific value from a local xlsx file, but it's taking 510ms, which is too slow. I need it to be under 20ms, similar to reading from a CSV file with the same data in a CSV format. Any tips to speed up the code?

<?php
require '/vendor/autoload.php'; // Include PhpSpreadsheet library

use PhpOffice\PhpSpreadsheet\IOFactory;

$xlsxFile = 'local.xlsx'; // path to XLSX file
$targetRowNumber = 302; // row number to access
$targetColumn = 3; // column number (C = 3)

try {
    $trystarttime = microtime(true);
    // Load the XLSX file
    $spreadsheet = IOFactory::load($xlsxFile);

    // Get the value of the specified cell (column 2, row 301)
    $value = $spreadsheet->getActiveSheet()->getCellByColumnAndRow($targetColumn, $targetRowNumber)->getValue();
    
    echo "Value in column $targetColumn, row $targetRowNumber: " . $value . "\n";

    $tryendtime = microtime(true);
    $durationInMilliseconds = (number_format(($tryendtime - $trystarttime) * 1000, 2)) . "ms to fetch the required row value" . PHP_EOL;

    echo $durationInMilliseconds;

} catch (Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
    echo "Trace: " . $e->getTraceAsString() . "\n";
}
?>
2

There are 2 best solutions below

0
Mark Baker On BEST ANSWER

If you expect to be able to load a complex file format like Xlsx as efficiently as you can read a plaintext CSV file, then rethink your assumptions. That being said, if you only need to read a single cell value, then there is a lot that you can do to improve performance.

If you only need data from the file, and not structure or style information, then load it with ReadDataOnly enabled.

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
$reader->setReadDataOnly(true);
$spreadsheet = $reader->load($inputFileName);

If the file has several worksheets, and you're only interested in the data from one of those worksheets, then tell the loader to load only that one worksheet.

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
$reader->setLoadSheetsOnly('Worksheet 1');
$spreadsheet = $reader->load($inputFileName);

If you know the cell Address (or Addresses) that you want to read, then use a Read Filter to read only those cells from the file.

/**  Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter  */
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
    public function readCell($targetColumn, $targetRowNumber, $worksheetName = '') {
        if ($row == $targetRowNumber) {
            if ($columnAddress == $targetColumn) {
                return true;
            }
        }
        return false;
    }
}

/**  Create an Instance of our Read Filter  **/
$filterSubset = new MyReadFilter('C', 103);

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
/**  Tell the Reader that we want to use the Read Filter  **/
$reader->setReadFilter($filterSubset);
/**  Load only the rows and columns that match our filter to Spreadsheet  **/
$spreadsheet = $reader->load($inputFileName);

These features are all documented.

1
Jeff On

If you know the file type you may be able to win some performance by creating (in this case) an XLSX reader directly

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
$spreadsheet = $reader->load($inputFileName);

See also the documentation at readthedocs.io