Sending files in blocks

111 Views Asked by At

I have a problem sending an Excel file in blocks. I have this code that is supposed to do this, and it works fine up to a certain point. However, the issue is that it only writes the first 100 records. After that, the file size remains the same as if it had all 10,000 records, but when opened, it only has 100 records.

@Override
public void generateAndSendExcelChunks(ServletOutputStream outputStream) throws IOException {
    Workbook workbook = new SXSSFWorkbook();

    Sheet sheet = workbook.createSheet("Datos");
    int totalRows = 10000;
    int rowsPerChunk = 100;
    int currentRow = 0;

    while (currentRow < totalRows) {
        int endRow = currentRow + rowsPerChunk;
        endRow = Math.min(endRow, totalRows);
        generateChunkData(sheet, currentRow, endRow);
        workbook.write(outputStream);
        outputStream.flush();
        sheet = workbook.createSheet("Datos");
        currentRow = endRow;
    }

    workbook.close();
}

private void generateChunkData(Sheet sheet, int startRow, int endRow) {
    for (int i = startRow; i < endRow; i++) {
        Row row = sheet.createRow(i);
        Cell cell = row.createCell(0);
        cell.setCellValue("Valor " + (i + 1));
    }
}

The line sheet = workbook.createSheet("Datos"); doesn't work because it tries to create a sheet with a name that already exists. If I try to get a reference to that sheet, it gives a "stream closed" error on this line: workbook.write(outputStream);. The intended functionality is to store all the information in the same sheet without overwriting anything.

2

There are 2 best solutions below

1
Axel Richter On

What you are trying to do is not possible. At least not using Apache POI up to now.

Office Open XML files, such as *.xlsx files too, are ZIP archives containing the data in a special internally file structure. The ZIP archive contains *.xml files and other files too in a special directory structure. The different parts are related to each other. The relations are stored in *.rel files storing special relation-XML. One can see this if one simply unzips the *.xlsx file for example. 7-Zip can do this by default, using other ZIP-Software one needs renaming the file *.zip.

After knowing this and have seen this structure, think about how to fill this chunk by chunk considering all the relations. Conclusion: Very hard to nearly impossible.

Opening a Workbook object of any kind opens that ZIP archive and reads internally stored files into memory. And Workbook.write writes the complete internally ZIP content into the file. So after Workbook.write the workbook object is not more usable until it gets opened again. Thus using it in a loop like: put first content in, write, put second content in, write, ... is not possible. It would must be like: open workbook, put first content in, write, close workbook, open workbook, put second content in, write, close workbook, ... But that contradicts your needs as opening the workbook always reads all content into memory.

But why not simply using SXSSFWorkbook as thought by the programmers? See SXSSF (Streaming Usermodel API). Using the default properties this does exactly what you are trying to do. It only remains 100 rows of sheet data in memory. Other sheet data gets written in a temporary sheet XML file. Only other data and relations remaining in memory permanently. When done SXSSFWorkbook.write composes the ZIP archive using the temporary stored sheet data.

Of course SXSSFWorkbook.write also needs to write the whole *.xlsx-ZIP-archive then. So, if your problem is about delivering a big file to a client, then store the *.xlsx file server-side using SXSSFWorkbook.write. Then think about it as a big binary file to deliver (provide to download) to the client. There should be default approaches to solve that kind of task as that is a default kind of task. Self programming the division into chunks is not necessary and is a bad idea in my opinion.

0
Alexander Mladzhov On

You can use another approach. You can split the excel file into binary blocks and send them. The receiving side can then put them together based on name of each block

Sender:

public class ExcelSplitter {
    public static void main(String[] args) {
        String inputFilePath = "input.xlsx";
        String outputFolderPath = "output";
        int blockSize = 1024; // Size of each binary block in bytes

        try {
            // Load the Excel file
            FileInputStream fileInputStream = new FileInputStream(inputFilePath);
            Workbook workbook = WorkbookFactory.create(fileInputStream);
            
            // Iterate over each sheet in the workbook
            for (int sheetIndex = 0; sheetIndex < workbook.getNumberOfSheets(); sheetIndex++) {
                Sheet sheet = workbook.getSheetAt(sheetIndex);
                String sheetName = sheet.getSheetName();

                // Convert the sheet to binary blocks
                ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
                DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream);
                for (Row row : sheet) {
                    for (Cell cell : row) {
                        // Write cell data as bytes to the output stream
                        dataOutputStream.write(cell.toString().getBytes());
                    }
                }
                dataOutputStream.close();

                // Split the binary data into blocks
                byte[] binaryData = byteArrayOutputStream.toByteArray();
                int numBlocks = (int) Math.ceil((double) binaryData.length / blockSize);
                for (int blockIndex = 0; blockIndex < numBlocks; blockIndex++) {
                    int startIndex = blockIndex * blockSize;
                    int endIndex = Math.min((blockIndex + 1) * blockSize, binaryData.length);
                    byte[] blockData = new byte[endIndex - startIndex];
                    System.arraycopy(binaryData, startIndex, blockData, 0, blockData.length);

                    // Save each binary block as a separate file
                    String outputFilePath = outputFolderPath + "/" + sheetName + "_block" + blockIndex + ".bin";
                    FileOutputStream fileOutputStream = new FileOutputStream(outputFilePath);
                    fileOutputStream.write(blockData);
                    fileOutputStream.close();
                }
            }

            // Close the workbook and input stream
            workbook.close();
            fileInputStream.close();

            System.out.println("Excel file split into binary blocks successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Receiver:

public class ExcelReconstructor {
    public static void main(String[] args) {
        String inputFolderPath = "input_chunks";
        String outputFilePath = "output.xlsx";
        int blockSize = 1024; // Size of each binary block in bytes

        try {
            // List all binary chunk files in the input folder
            File folder = new File(inputFolderPath);
            File[] chunkFiles = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".bin"));
            
            // Sort the chunk files based on the block index
            List<File> sortedChunkFiles = new ArrayList<>();
            for (int i = 0; i < chunkFiles.length; i++) {
                sortedChunkFiles.add(null);
            }
            for (File chunkFile : chunkFiles) {
                String fileName = chunkFile.getName();
                int blockIndex = Integer.parseInt(fileName.substring(fileName.lastIndexOf("_block") + 6, fileName.lastIndexOf(".")));
                sortedChunkFiles.set(blockIndex, chunkFile);
            }

            // Create the output file
            FileOutputStream fileOutputStream = new FileOutputStream(outputFilePath);

            // Concatenate the binary data from each chunk file
            for (File chunkFile : sortedChunkFiles) {
                FileInputStream fileInputStream = new FileInputStream(chunkFile);
                byte[] blockData = new byte[blockSize];
                int bytesRead = fileInputStream.read(blockData);
                while (bytesRead != -1) {
                    // Write the block data to the output file
                    fileOutputStream.write(blockData, 0, bytesRead);
                    bytesRead = fileInputStream.read(blockData);
                }
                fileInputStream.close();
            }

            // Close the output file stream
            fileOutputStream.close();

            System.out.println("Binary chunks reconstructed into Excel file successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}