PHP preg_match_all extract id and name, where id in tag is optional


I have the following code:

<?php
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

preg_match_all('/<div class="block">[\s]+<div class="id">(.*?)<\/div>[\s]+<div class="name">(.*?)<\/div>[\s]+<\/div>/ms', $html, $matches);

print_r($matches);

I want to get an array with id and name, but the second block doesn't have an id, so my pattern skips it. How can I build the array without skipping that block, and print something like [ ... [id => 0 // or null, name => 'second element'] ...]?

Answer by fusion3k

Use DOMDocument for this task; there are many good reasons not to parse HTML with regular expressions.

Assuming your HTML code is stored in the $html variable, create an instance of DOMDocument, load the HTML code, and initialize DOMXPath:

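$dom = new DOMDocument();
// Buffer libxml parse messages internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
// LIBXML_NOBLANKS removes blank nodes while parsing.
$dom->loadHTML($html, LIBXML_NOBLANKS);
// Only affects serialization (e.g. saveXML()); it does not change query results.
$dom->formatOutput = true;
$xpath = new DOMXPath($dom);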
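
Because libxml_use_internal_errors() buffers parser messages rather than printing them, they can be inspected and cleared after loading. The following is a minimal sketch of that step, not part of the original answer; the $sample markup and the variable names are assumptions made for illustration, and which messages libxml reports may vary between versions.

```php
<?php
// Sample markup assumed for illustration; <foo> is not a standard HTML element.
$sample = '<div><foo>bar</foo></div>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // buffer parser messages
$doc->loadHTML($sample);

// Each buffered entry is a LibXMLError object exposing line and message.
foreach (libxml_get_errors() as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

libxml_clear_errors();                 // empty the buffer
libxml_use_internal_errors($previous); // restore the previous setting

// The document is still usable even if messages were reported.
echo trim($doc->textContent), "\n";
```

Clearing the buffer and restoring the previous setting keeps the change local, so later XML or HTML parsing elsewhere in the script behaves as it did before.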

Use DOMXPath to search for all <div> nodes with class "name" and prepare an empty array for the results:

$nodes = $xpath->query('//div[@class="name"]');
$result = array();
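
The query above matches only when the class attribute is exactly "name". If the real markup may carry several classes on one element (an assumption; the sample HTML in the question does not), a common XPath 1.0 idiom tests for a whole class token instead. A minimal sketch, where classToken() and the $sample markup are hypothetical and exist only for this illustration:

```php
<?php
// Hypothetical helper (not part of the original answer): builds an XPath 1.0
// predicate that matches $class as one whole token of the class attribute.
function classToken(string $class): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

// Sample markup assumed for illustration: the second div carries two classes,
// and the third has a class that merely contains the substring "name".
$sample = '<div>'
    . '<div class="name">a</div>'
    . '<div class="name big">b</div>'
    . '<div class="nickname">c</div>'
    . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($sample);
$xp = new DOMXPath($doc);

// Exact attribute comparison versus whole-token comparison.
$exact = $xp->query('//div[@class="name"]');
$token = $xp->query('//div[' . classToken('name') . ']');

echo $exact->length, ' ', $token->length, "\n";
```

Padding both the attribute value and the class name with spaces is what keeps "nickname" from matching, which a plain contains(@class, "name") would not do.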

For each node found, run an additional query to find the optional node with class "id", then add a record to the results array:

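foreach ($nodes as $node) {
    // Look for an optional <div class="id"> under the same parent block.
    $id = $xpath->query('div[@class="id"]', $node->parentNode);

    $result[] = array(
        // Use null when the block has no id node.
        'id' => $id->count() ? $id->item(0)->nodeValue : null,
        'name' => $node->nodeValue
    );
}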

print_r($result);

This is the result:

Array
(
    [0] => Array
        (
            [id] => 10
            [name] => first element
        )

    [1] => Array
        (
            [id] => 
            [name] => second element
        )

    [2] => Array
        (
            [id] => 30
            [name] => third element
        )

)
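
For reuse, the steps above can be gathered into a single function. The sketch below is a variation rather than the answer's exact code: extractBlocks() is a hypothetical name, and it iterates over the div elements with class "block" instead of the name nodes, so a block that lacks a name would still produce a record. It assumes each block holds at most one id and one name as direct children.

```php
<?php
// Hypothetical helper (not from the original answer): returns one record per
// <div class="block">, with 'id' set to null when the block has no id div.
function extractBlocks(string $html): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // buffer parser messages
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier setting

    $xpath = new DOMXPath($dom);
    $result = [];

    foreach ($xpath->query('//div[@class="block"]') as $block) {
        // Relative queries, evaluated with the block as the context node.
        $id   = $xpath->query('div[@class="id"]', $block)->item(0);
        $name = $xpath->query('div[@class="name"]', $block)->item(0);

        $result[] = [
            'id'   => $id ? trim($id->textContent) : null,
            'name' => $name ? trim($name->textContent) : null,
        ];
    }

    return $result;
}

// Same structure as the HTML in the question.
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

$records = extractBlocks($html);
var_export($records);
```

The ids come back as strings because they are read from text nodes; cast them with (int) if numeric values are needed.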