I currently call S3's getObject method to obtain a ResponseInputStream for a compressed file, and then process the archive with stream methods, similar to the following code:
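    import java.io.ByteArrayOutputStream;
    import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
    import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

    // s3.getZipIn() is my own helper around the stream returned by getObject
    ZipArchiveInputStream zipIn = s3.getZipIn();
    ZipArchiveEntry entry;
    while ((entry = zipIn.getNextZipEntry()) != null) {
        if (entry.isDirectory()) {
            continue;
        }
        String fileName = entry.getName();
        long curFileSize = entry.getSize();
        // Copies the full content of the current entry into memory
        ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
        zipIn.transferTo(byteOut);
        // do something
    }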
I think this approach ends up downloading the entire compressed file in order to run my logic.
I now have a special requirement: I only need the relative path (file name) of each file in the archive, not its content. I know that archive formats such as ZIP keep this metadata in a dedicated header or trailer section of the file. I have seen many simple ways to read file names from a local file, but I am not sure whether something similar exists for a stream obtained over the network that would let me skip downloading the content, so that I can complete this task without consuming a large amount of network bandwidth.
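To make the idea concrete, here is a minimal sketch of what I have in mind for ZIP, using only the JDK. It relies on the fact that a ZIP file ends with an "end of central directory" record that points to the central directory, which lists every entry name. The RangeReader interface and the ZipNameLister class are hypothetical names I made up for illustration; in my case RangeReader would presumably wrap a getObject call with a Range header, and the object size would come from something like headObject. The sketch assumes UTF-8 entry names and does not handle ZIP64 archives.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ZipNameLister {

    /** Hypothetical abstraction over a ranged read (e.g. a GET with a Range header). */
    public interface RangeReader {
        byte[] read(long offset, int length) throws IOException;
    }

    private static final int EOCD_SIG = 0x06054b50; // end of central directory record
    private static final int CEN_SIG = 0x02014b50;  // central directory file header
    private static final int EOCD_MIN = 22;         // EOCD size without a trailing comment
    private static final int MAX_COMMENT = 65535;   // maximum length of the archive comment

    public static List<String> listNames(RangeReader reader, long size) throws IOException {
        // 1. Fetch only the tail of the archive, where the EOCD record lives.
        int tailLen = (int) Math.min(size, EOCD_MIN + MAX_COMMENT);
        ByteBuffer tail = ByteBuffer.wrap(reader.read(size - tailLen, tailLen))
                .order(ByteOrder.LITTLE_ENDIAN);

        // 2. Scan backwards for the EOCD signature.
        int eocd = -1;
        for (int i = tailLen - EOCD_MIN; i >= 0; i--) {
            if (tail.getInt(i) == EOCD_SIG) {
                eocd = i;
                break;
            }
        }
        if (eocd < 0) {
            throw new IOException("End of central directory record not found");
        }

        int count = tail.getShort(eocd + 10) & 0xFFFF;        // total number of entries
        long cdSize = tail.getInt(eocd + 12) & 0xFFFFFFFFL;   // central directory size
        long cdOffset = tail.getInt(eocd + 16) & 0xFFFFFFFFL; // central directory offset
        if (count == 0xFFFF || cdSize == 0xFFFFFFFFL || cdOffset == 0xFFFFFFFFL) {
            throw new IOException("ZIP64 archives are not handled in this sketch");
        }

        // 3. Fetch the central directory itself with a second ranged read.
        ByteBuffer cd = ByteBuffer.wrap(reader.read(cdOffset, (int) cdSize))
                .order(ByteOrder.LITTLE_ENDIAN);

        // 4. Walk the fixed-size headers and collect the names of non-directory entries.
        List<String> names = new ArrayList<>();
        int pos = 0;
        for (int n = 0; n < count; n++) {
            if (cd.getInt(pos) != CEN_SIG) {
                throw new IOException("Unexpected central directory header at " + pos);
            }
            int nameLen = cd.getShort(pos + 28) & 0xFFFF;
            int extraLen = cd.getShort(pos + 30) & 0xFFFF;
            int commentLen = cd.getShort(pos + 32) & 0xFFFF;
            // Assumption: names are UTF-8 (older archives may use CP437 instead).
            String name = new String(cd.array(), pos + 46, nameLen, StandardCharsets.UTF_8);
            if (!name.endsWith("/")) {
                names.add(name);
            }
            pos += 46 + nameLen + extraLen + commentLen;
        }
        return names;
    }
}
```

With this shape, listing the names would cost two ranged reads (the tail, then the central directory) regardless of how large the entry contents are. Commons Compress's ZipFile can also read from a SeekableByteChannel, so a channel backed by ranged reads might be another route, though I have not tried it.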
I know S3 supports partial downloads. Is there any reasonable solution or library to do this?