On improving file uploads with respect to memory and performance - struts 1


We have legacy Struts 1.1 MVC code in a non-Spring J2EE application. The following code is in an action/form bean and is the recipient of the multipart HTTP POST.

private org.apache.struts.upload.FormFile dataFile;
...

byte[] file_bytes = null;
try {
is = this.dataFile.getInputStream();
file_bytes = IOUtils.toByteArray(is);
...
} ...

For some reason, the earlier integrations with document repositories used a byte array. A decade ago, I guess people generally weren't uploading large files through (HTTP) websites that much. In more recent use cases, users may upload documents larger than 1 GB. I do not think keeping a byte array of that size in memory is a good idea. Depending on the need, more than one user could even be uploading at the same time, some of them with large files.

Recently we wrote a Spring Boot REST API application that serves as an adapter to access cloud storage. Its upload endpoint looks like this.

@RequestMapping(value = "/{doctype}/{docid}/upload", method = RequestMethod.POST, consumes = { "multipart/form-data" })
public ResponseEntity uploadFile(
   @PathVariable String doctype,
   @PathVariable String docid, 
   @RequestParam("file") MultipartFile file,
   @RequestParam("metadata") String docMetadata) {    
   ...
}

It is important to note that our Struts 1.1 MVC JEE application would eventually call the above Spring Boot REST endpoint (to upload documents to cloud storage). From reading the endpoint code above, I think we can stream content into it using regular Java code.
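As a rough illustration of what streaming into that endpoint with regular Java code could look like, here is a minimal sketch using only `java.net.HttpURLConnection`. The class `MultipartStreamer` and its method names are hypothetical, not part of the original application. The part names `file` and `metadata` are taken from the endpoint's `@RequestParam` annotations. The sketch assumes Java 7 or later and a server that accepts chunked request bodies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sends a metadata field and one file part as multipart/form-data. */
final class MultipartStreamer {

    private static final String CRLF = "\r\n";

    /** Writes the multipart body to out, copying the file through a fixed 8 KB buffer. */
    static void writeBody(OutputStream out, String boundary, String metadata,
                          String fileName, InputStream fileContent) throws IOException {
        // Text part: corresponds to @RequestParam("metadata") on the endpoint.
        String metadataPart = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"metadata\"" + CRLF
                + "Content-Type: text/plain; charset=UTF-8" + CRLF + CRLF
                + metadata + CRLF;
        out.write(metadataPart.getBytes(StandardCharsets.UTF_8));

        // File part header: corresponds to @RequestParam("file").
        String fileHeader = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: application/octet-stream" + CRLF + CRLF;
        out.write(fileHeader.getBytes(StandardCharsets.UTF_8));

        // Only this buffer is held in memory, regardless of the file size.
        byte[] buffer = new byte[8192];
        int read;
        while ((read = fileContent.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }

        // Closing boundary ends the multipart body.
        out.write((CRLF + "--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    /** Posts the stream to endpointUrl (an assumed, caller-supplied URL) and returns the HTTP status. */
    static int upload(String endpointUrl, String metadata, String fileName,
                      InputStream fileContent) throws IOException {
        String boundary = "----upload" + System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) new URL(endpointUrl).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        // Without a streaming mode, HttpURLConnection buffers the whole body in memory.
        conn.setChunkedStreamingMode(8192);
        try (OutputStream out = conn.getOutputStream()) {
            writeBody(out, boundary, metadata, fileName, fileContent);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

The caller would pass the stream obtained from `dataFile.getInputStream()` (or a stream over a temporary file) as `fileContent`. The call to `setChunkedStreamingMode` is the key design choice here: it lets the body be sent as it is read, so the sender's memory use stays near the buffer size.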

Along these lines, I am considering one of two options.

OPTION 1. Keep the Struts upload as is and change only what is needed to avoid holding the byte array in memory, in one of two ways:

1.1. Keep the input stream (from the request) open until the code that invokes the document repository adapter runs, then stream it. I am not sure whether this would work.

1.2. In the Struts action/form bean, write the stream to disk, and then, when invoking the document repository adapter method, read the file back as a stream and send it.

OPTION 2. Replace the Struts action with a regular servlet that receives the uploaded file using Apache Commons FileUpload, with either the disk-storing or the streaming option, and then stream it into the Spring Boot REST API endpoint.

I am unsure what advantages each of these two options would bring (compared with holding the whole byte array in memory).

Answered by Murali D

The solution to your problem is to handle file uploads without using a large amount of memory, and your question already outlines how. Let me add my options in more detail.

  1. Receive the file and write it to disk. For this, take advantage of the Java IO decorator pattern: HttpRequest --> BufferedInputStream --> FileOutputStream, with a fixed buffer size. Use byte streams here rather than BufferedReader/FileWriter, because character streams can corrupt binary uploads. You can also use Java NIO (channels, buffers, etc.) for more efficient non-blocking IO processing, which helps accept more concurrent uploads with the same configuration. https://examples.javacodegeeks.com/core-java/nio/java-nio-large-file-transfer-tutorial/

  2. Reactive programming. Spring 5 with Project Reactor supports end-to-end (Client --> REST API --> Database) reactive processing in an application, so you can use Spring WebFlux and WebClient for non-blocking, asynchronous streaming of the uploaded file. I suggest this option. https://www.baeldung.com/spring-webclient-upload-file

Take either option, and instead of enhancing the old/legacy Struts code, go with a modern Spring-based endpoint.
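The first option (spooling the upload to disk through a fixed-size buffer) could be sketched roughly as follows with Java NIO channels. This is only an illustration under assumptions: the class `UploadSpooler`, the method name, the 64 KB buffer size, and the temp-file prefix are all hypothetical choices, and the code needs Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: copies an upload stream to a temporary file without loading it into memory. */
final class UploadSpooler {

    /** Copies the upload through one fixed 64 KB buffer and returns the temp file's path. */
    static Path spoolToTempFile(InputStream upload) throws IOException {
        Path target = Files.createTempFile("upload-", ".tmp");
        ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);
        // Closing the channel also closes the underlying upload stream.
        try (ReadableByteChannel in = Channels.newChannel(upload);
             FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
            while (in.read(buffer) != -1) {
                buffer.flip();                 // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    out.write(buffer);         // a single write may not drain the whole buffer
                }
                buffer.clear();                // make the buffer ready for the next read
            }
        }
        return target;
    }
}
```

A caller could later open the returned path with `Files.newInputStream(path)` to stream the content to the repository adapter, and should delete the temporary file when done. Memory use stays near the buffer size no matter how large the upload is.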