“Content is not allowed in prolog” when parsing XML


I am parsing XML with the following Java code:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();     
XPath xPath = XPathFactory.newInstance().newXPath();
SimpleErrorHandler simpleErrorHandlerObj = new SimpleErrorHandler(RequestDoc);
builder.setErrorHandler(simpleErrorHandlerObj);
InputSource is = new InputSource(new StringReader(CXMLHandlerObj.incoming_cxml));
domdoc = builder.parse(is);

The XML is not parsed; the parser throws an error instead.

This is an excerpt of the XML for which I am getting an error:

  <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.048/cXML.dtd">
<cXML payloadID="x" timestamp="2021-04-15T23:59:59+00:00" version="1.2.048"><Header>
      <From>
         <Credential
            domain="NetworkId">
            <Identity>x-T</Identity>
         </Credential>
         
         
      <Credential
            domain="SystemID"><Identity>x</Identity></Credential></From>
      <To>
         
         <Credential
                domain="NetworkID"><Identity>x-T</Identity></Credential><Correspondent>
            <Contact
                    role="correspondent">
               <Name
                        xml:lang="EN">x</Name>
               <PostalAddress>
                  <Street>x</Street>
                  <City>x</City>
                  <Country
                    isoCountryCode="NL">x</Country>
               </PostalAddress>
               <Email
            name="routing">[email protected]</Email>
            </Contact>
         </Correspondent>
         ... (the XML continues)

When parsing the XML, I get this error:

org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)

It seems the XML contains extra whitespace. In my Java code I already strip whitespace between tags with this line:

content = content.replaceAll(">[\\s\r\n]*<", "><");
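
Note that this pattern only collapses whitespace that sits between a `>` and the following `<`, so anything ahead of the very first `<` is left untouched. A minimal sketch for checking what actually precedes the first tag, assuming the payload is available as a String (`PrologInspector` and `describeLeadingChars` are hypothetical names, not part of the code above):

```java
class PrologInspector {
    /** Lists every code point found before the first '<' in hex notation. */
    static String describeLeadingChars(String xml) {
        int firstTag = xml.indexOf('<');
        // If there is no '<' at all, the whole string counts as leading content.
        String prefix = firstTag < 0 ? xml : xml.substring(0, firstTag);
        StringBuilder sb = new StringBuilder();
        prefix.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: a byte order mark and two spaces before the declaration.
        String sample = "\uFEFF  <?xml version=\"1.0\"?><a/>";
        System.out.println(describeLeadingChars(sample)); // prints "U+FEFF U+0020 U+0020"
    }
}
```

An empty result would mean the string really does start with `<`, in which case the cause probably lies elsewhere.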

The error only occurs when the XML is sent from the original source, SAP Ariba (a web portal). When I download the same XML, copy it with its whitespace intact, and send it via Postman, it is parsed without problems. There may also be whitespace in front of the XML, but I don't know. I am already in contact with SAP Ariba, but is there also a way to fix this in Java code?
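
One possible workaround on the Java side, assuming the failure comes from a byte order mark (U+FEFF) or other stray characters ahead of the XML declaration, is to drop everything before the first `<` before handing the string to the parser. This is only a sketch: `PrologCleaner` and `stripBeforeProlog` are hypothetical names, the sample payload is made up, and validation is left off so the example runs without fetching the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrologCleaner {
    /** Removes any characters (BOM, whitespace, other junk) before the first '<'. */
    static String stripBeforeProlog(String xml) {
        int firstTag = xml.indexOf('<');
        // Leave the input alone if it already starts with '<' or contains no '<' at all.
        return firstTag > 0 ? xml.substring(firstTag) : xml;
    }

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputSource source = new InputSource(new StringReader(stripBeforeProlog(xml)));
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: a BOM and two spaces in front of the declaration.
        String incoming = "\uFEFF  <?xml version=\"1.0\" encoding=\"UTF-8\"?><cXML payloadID=\"x\"/>";
        Document doc = parse(incoming);
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "cXML"
    }
}
```

In the code from the question, the equivalent change would be to pass the cleaned string to the `StringReader` instead of the raw `incoming_cxml` value. If the payload arrives as bytes, it is also worth checking that it is decoded with the charset the sender actually used, since a wrongly decoded BOM shows up as junk characters at the same position.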
