Remove embedded files from PDF using PDFBox


I'm trying to remove all embedded files from a PDF using PDFBox inside a Java application (the library API, not the command-line tools).

There are many examples of how to read embedded files using PDFBox, but it wasn't simple to adapt that code to remove the files.

Is there a working solution for that?

There is 1 solution below

Lonzak On

Tilman gave the correct answer in a comment; I'm copying it here because an answer is more visible:

IIRC there are two types of embedded files: document level and annotation level. Removing the annotation-level ones should be easy, i.e. remove the PDAnnotationFileAttachment annotations from each page. Removing the document-level files should be possible with doc.getDocumentCatalog().getNames().setEmbeddedFiles(null); but I don't have much time currently.
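Putting both parts of that comment together, a minimal sketch could look like the following. It assumes PDFBox 2.x (where a document is opened with PDDocument.load; in 3.x that would be Loader.loadPDF instead). The class name RemoveEmbeddedFiles, the helper removeEmbeddedFiles, and the command-line arguments are made up for illustration, and I have not tested it against files whose names tree uses nested Kids nodes or against attachments referenced from other places.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentNameDictionary;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationFileAttachment;

// Hypothetical helper class; assumes PDFBox 2.x on the classpath.
public class RemoveEmbeddedFiles {

    static void removeEmbeddedFiles(PDDocument doc) throws IOException {
        // Document level: drop the EmbeddedFiles name tree.
        // getNames() returns null when the catalog has no Names dictionary.
        PDDocumentNameDictionary names = doc.getDocumentCatalog().getNames();
        if (names != null) {
            names.setEmbeddedFiles(null);
        }

        // Annotation level: keep every annotation except file attachments.
        for (PDPage page : doc.getPages()) {
            List<PDAnnotation> kept = new ArrayList<>();
            for (PDAnnotation annotation : page.getAnnotations()) {
                if (!(annotation instanceof PDAnnotationFileAttachment)) {
                    kept.add(annotation);
                }
            }
            page.setAnnotations(kept);
        }
    }

    // Usage (hypothetical): java RemoveEmbeddedFiles input.pdf output.pdf
    public static void main(String[] args) throws IOException {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            removeEmbeddedFiles(doc);
            doc.save(new File(args[1]));
        }
    }
}
```

The sketch builds a new annotation list and passes it to setAnnotations rather than removing entries while iterating, which avoids modifying the list returned by getAnnotations() in place.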