PHP strip_tags doesn't allow <img> images in DOM


I'm trying to make the post description on a WordPress blog show only certain HTML elements, including images (img). To do that, I commented out the_content() in the theme file and replaced it with DOMDocument-based processing. Please see the code below:

// the_content(); 
$dom = new DOMDocument;
@$dom->loadHTML(strip_tags(mb_convert_encoding(get_the_content(), 'HTML-ENTITIES', 'UTF-8'), '<img>|<p>|<div>|<table>|<thead>|<tbody>|<tfoot>|<tr>|<th>|<td>|<ul>|<ol>|<li>|<strong>|<em>|<h3>|<h4>|<h5>|<h6>|<b>|<i>|<span>'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
//removes all attributes
$nodes = $xpath->query('//@*');
foreach ($nodes as $node) {
    $node->parentNode->removeAttribute($node->nodeName);
}
//removes <p>&nbsp;</p>
$nodeList = $xpath->query("//p[normalize-space(.)=\"\xC2\xA0\"]"); # &nbsp;
foreach($nodeList as $node) 
{
    $node->parentNode->removeChild($node);
}

//remove all element attributes except src (needed for img)
foreach ($xpath->query('//@*[not(name()="src")]') as $attr) {
     //@*[not(name()="src" or name()="href")]
     $attr->parentNode->removeAttribute($attr->nodeName);
}

//remove empty html tags
while (($node_list = $xpath->query('//*[not(*) and not(@*) and not(text()[normalize-space()])]')) && $node_list->length) {
    foreach ($node_list as $node) {
       $node->parentNode->removeChild($node);
    }
}
echo wpautop($dom->saveHTML(), true);

The above works, but the images are missing from the output. This is the original post content I'm using as a reference (i.e. the return value of get_the_content()):

  <p>
    <span style="font-size:22px">
      <strong>power supply:Transmitter: 23A 12V, the battery can be used for one year (calculated by 20 times/1 day), the buyer needs to configure it by himself</strong>
    </span>
  </p>
  <p>
    <span style="font-size:22px">
      <strong>Receiver: 2*1.5V (AA battery), need to be equipped by the buyer</strong>
    </span>
  </p>
  <p>
    <br/>
  </p>
  <p>
    <img src="//ae01.alicdn.com/kf/S3f9212e14ba742fda4daffb1af2d4607v.jpg"/>
    <img src="//ae01.alicdn.com/kf/S9dee11d0fc6a4b1f916c3c8c2db8a63e9.jpg"/>
    <img src="//ae01.alicdn.com/kf/Sdedb55f6814f4822a6bf4eefb8b9e2c3s.jpg"/>
    <img src="//ae01.alicdn.com/kf/Sddf41078da704192aeadea2502c87ee6N.jpg"/>
    <img src="//ae01.alicdn.com/kf/S2f14c98509de47508fa16919fd792f99q.jpg"/>
    <img src="//ae01.alicdn.com/kf/Scdd2596168cc4edbbc157e5913c5512de.jpg"/>
  </p>

As you can see, it contains img elements. This is the echoed result:

  <p>
    <span>
      <strong>power supply:Transmitter: 23A 12V, the battery can be used for one year (calculated by 20 times/1 day), the buyer needs to configure it by himself</strong>
    </span>
  </p>
  <p>
    <span>
      <strong>Receiver: 2*1.5V (AA battery), need to be equipped by the buyer</strong>
    </span>
  </p>

Where is the issue here? Everything else seems fine.
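
One likely cause, judging only from the code shown: strip_tags() keeps the img elements, but the first XPath pass (`//@*`) removes every attribute, including src. By the time the "remove empty html tags" loop runs, each img has no children, no attributes and no text, so it matches the query and is deleted. The `<p><br/></p>` block disappears for a similar reason, since br is not in the allowed list. Below is a minimal sketch of one way around this. It assumes the first `//@*` pass is dropped and that img should never count as "empty". The sample HTML and the example.com URL are made up for illustration and stand in for get_the_content(); the input is wrapped in a single div because LIBXML_HTML_NOIMPLIED tends to behave best with one root element.

```php
<?php
// Hypothetical sample input standing in for get_the_content().
$html = '<div>'
      . '<p><span style="font-size:22px"><strong>Receiver</strong></span></p>'
      . '<p><img src="//example.com/a.jpg" alt="a"/></p>'
      . '</div>';

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);

// Single attribute pass: strip everything except src.
// (There is no earlier '//@*' pass, so src survives.)
foreach ($xpath->query('//@*[not(name()="src")]') as $attr) {
    $attr->ownerElement->removeAttribute($attr->nodeName);
}

// Remove empty elements, but never treat <img> as empty,
// even if it somehow ends up without attributes.
$emptyQuery = '//*[not(self::img) and not(*) and not(@*) '
            . 'and not(text()[normalize-space()])]';
while (($nodes = $xpath->query($emptyQuery)) && $nodes->length) {
    foreach ($nodes as $node) {
        $node->parentNode->removeChild($node);
    }
}

$result = $dom->saveHTML();
echo $result;
```

In the theme file, the same idea would amount to deleting the "removes all attributes" loop and adding `not(self::img)` to the empty-element query; the rest of the original code could stay as it is.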
