I am developing a plugin for my WordPress site. I want to select all non-empty paragraph elements.
Here is my code:
function my_php_custom_function($content) {
    // Create a new DOMDocument instance
    $dom = new DOMDocument();
    // Load the HTML content into the DOMDocument
    $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    // Create a DOMXPath object to query the DOM
    $xpath = new DOMXPath($dom);
    // Find all non-empty p elements in the content
    $p_elements = $xpath->query('//p[string-length(normalize-space()) > 0]');
    // A the_content filter must return the content, otherwise the post body is blanked
    return $content;
}
add_filter('the_content', 'my_php_custom_function');
The $p_elements variable also contains the paragraphs that I created just by pressing Enter. When I inspect the DOM, they show up as <p>&nbsp;</p>.
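You're likely using some sort of WYSIWYG editor for your content, which in some cases produces elements containing only &nbsp; (a non-breaking space). XPath's normalize-space() only strips ordinary whitespace (space, tab, carriage return, line feed), so such a paragraph still has a string length of 1.
To get the non-empty P elements while also ignoring P elements that contain only &nbsp;, your XPath needs an extra condition that excludes paragraphs whose text is just that character.
Updated answer:
Apparently, the representation of &nbsp; in the DOMDocument is the decoded character itself, not the entity: passing the node's text through bin2hex() gives c2a0, the UTF-8 encoding of U+00A0. Using this knowledge, we can write the character as its hexadecimal byte escapes (\xC2\xA0) in a double-quoted PHP string and exclude it in the query.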
While not pretty (due to all the escaping), it works in my small tests.
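The approach described above can be sketched as follows. This is a minimal illustration, not necessarily the exact query from the answer: the helper name find_non_empty_paragraphs is hypothetical, and it assumes the non-breaking space is stored as the UTF-8 bytes C2 A0. Instead of comparing the text against the character directly, it uses the XPath 1.0 translate() function to turn every non-breaking space into a regular space, so that normalize-space() can strip it. The XML encoding prefix is one common way to make loadHTML() treat the input as UTF-8 and is also an assumption here.

```php
<?php
// Hypothetical helper (not from the original post): returns the text of
// every <p> that still has content once regular whitespace and
// non-breaking spaces are ignored.
function find_non_empty_paragraphs(string $html): array
{
    $dom = new DOMDocument();

    // Suppress warnings about imperfect HTML fragments.
    libxml_use_internal_errors(true);
    // Assumption: prefixing an XML encoding declaration makes libxml read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);

    // U+00A0 as raw UTF-8 bytes; the escapes only work in a double-quoted string.
    $nbsp = "\xC2\xA0";

    // translate() maps each non-breaking space to a plain space,
    // then normalize-space() trims and collapses the result.
    $query = '//p[normalize-space(translate(., "' . $nbsp . '", " ")) != ""]';

    $texts = [];
    foreach ($xpath->query($query) as $p) {
        $texts[] = $p->textContent;
    }
    return $texts;
}

$result = find_non_empty_paragraphs('<p>Hello</p><p>&nbsp;</p><p></p><p>World</p>');
print_r($result); // only the "Hello" and "World" paragraphs remain
```

Inside the WordPress filter, the same query string could be passed to the existing $xpath->query() call in place of the original expression.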