I am using the following code to detect whether a file is a PDF/A-1b file:
public boolean isPDF_A1BFile(File file) throws IOException {
    PreflightParser parser = new PreflightParser(file);
    parser.parse(Format.PDF_A1B);
    PreflightDocument preflightDocument = parser.getPreflightDocument();
    preflightDocument.validate();
    ValidationResult validationResult = preflightDocument.getResult();
    return validationResult.isValid(); // Returns false in every case
}
But it always returns false, regardless of whether the file is PDF/A-1b or not. I am testing with this PDF/A-1b file. I validated it with the Preflight tool in Acrobat, which reports that the file is PDF/A-1b compliant. I am sharing a screenshot of that result.
Can anyone please tell me what's wrong in my code, or what I am missing?
Also, is there any way to check whether a file is PDF/A-2b compliant?
The file is tolerated by some PDF applications, since many of them silently fix such discrepancies, but PDFBox detects many oddities. I did not spend much time on them, but the messages seemed potentially valid, so the file is potentially non-conformant.
So, on the face of it, I simply rebuilt the file using "clean" in MuPDF and reran the validation in PDFBox.
C:\Apps\PDF\inspectors\Apache\preflight-app-3.0.0-alpha3.jar Doc1-withHelvetica-pdfa1ba.pdf
However, catch-22: now it fails other validations, according to its report.
So I recycled: I removed the PDF/A compatibility to see what was wrong, then regenerated the file as PDF/A. Now the report says there is at least one bad font definition, for Calibri (not surprising, as the file was previously a Word document printout). What is not obvious is that there is a rogue Calibri space character at the end of the line that contains Helvetica Bold. Removing it brought up other problems, so it took another run through the editors. Finally, with all the dross removed, both agree that there are no more problems.
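On the PDF/A-2b part of the question, a cheap first step is to read which level the file itself claims before running a full validation. The sketch below is not PDFBox code and does not validate anything: it only scans the raw bytes for the pdfaid:part and pdfaid:conformance entries in the XMP metadata, which PDF/A requires to be stored unfiltered. The class name PdfaClaim and its method names are made up for illustration, and it assumes the XMP packet is UTF-8 encoded. A file that claims "2B" can still fail validation, exactly as the file above did for 1b.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports the PDF/A level a file CLAIMS in its XMP metadata.
// It does not check that the file actually conforms to that level.
public class PdfaClaim {

    // XMP may write the values as elements (<pdfaid:part>1</pdfaid:part>)
    // or as attributes (pdfaid:part="1"); both patterns accept either form.
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:>|=\\s*[\"'])\\s*(\\d)");
    private static final Pattern CONFORMANCE =
            Pattern.compile("pdfaid:conformance\\s*(?:>|=\\s*[\"'])\\s*([A-Za-z])");

    // Returns e.g. "1B" or "2B", or empty if either entry is missing.
    public static Optional<String> fromText(String text) {
        Matcher part = PART.matcher(text);
        Matcher conformance = CONFORMANCE.matcher(text);
        if (part.find() && conformance.find()) {
            return Optional.of(part.group(1) + conformance.group(1).toUpperCase());
        }
        return Optional.empty();
    }

    // Reads the whole file into memory, so this is meant for reasonably small PDFs.
    public static Optional<String> fromFile(Path pdf) throws IOException {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data in the PDF cannot cause a decoding failure.
        String raw = new String(Files.readAllBytes(pdf), StandardCharsets.ISO_8859_1);
        return fromText(raw);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfaClaim <file.pdf>");
            return;
        }
        System.out.println(fromFile(Paths.get(args[0]))
                .map(level -> "claims PDF/A-" + level.toLowerCase())
                .orElse("no PDF/A claim found"));
    }
}
```

If the claimed level is not the one you are validating against, the mismatch is worth resolving first. The claim is only a label written by whatever produced the file, so the actual answer still has to come from a validator.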