How to extract data from HTML into a PHP array using preg_match or another method


I have an HTML page from an old website that lists some places in the format below.

<p><b>Ado’s Kitchen &amp; Bar&nbsp; </b>1143 13th St., 720-465-9063; <a href="http://www.span-ishatthehill.com">span-ishatthehill.com.</a> Laid back restaurant with global menu. Open for breakfast and lunch daily and dinner Mon.-Sat.</p>
    

<p><strong>Blackbelly Market</strong> 1606 Conestoga St. #3, 303-247-1000; <a href="http://www.blackbelly.com">blackbelly.com</a>. Locavore dining, butchery and bar. Open daily for happy hour and dinner; see website for market hours.</p>

I am going to use this data for a listing page, so I need to get it into the correct format, like this:

$arr = [
'name'=>'', // inside the <b> (or <strong>) tag
'address'=>'', // after the closing </b> tag; the address ends with a comma
'phone'=>'', // after the address; the number ends with a semicolon
'website'=>'', // after the phone number, in the <a> tag
'description'=>'', // after the closing </a> tag
];

I tried to use preg_match, but I cannot extract the content that is not inside a tag, e.g. the address or the phone number.

$htmlContent = 'content here';
preg_match('/<b>(.*?)<\/b>/s', $htmlContent, $match); /* for the name in a <b> tag */
preg_match('/<strong>(.*?)<\/strong>/s', $htmlContent, $match); /* for the name in a <strong> tag */

preg_match('/<a href="(.*?)">(.*?)<\/a>/s', $htmlContent, $match); /* for the website */

Using this code I can get the website address and the text inside the <b>/<strong> tag, but how do I get the phone number, address, and other details?

Thanks

martin On BEST ANSWER

You can use a single regular expression with named groups to capture all the data, like this:

preg_match('#<p><b>(?<name>.*)</b>(?<address>.*),(?<phone>.*);.*<a.*href="(?<website>.*)".*>.*</a>(?<description>.*)</p>#', $htmlContent, $match);

Then you can retrieve the matches like this:

$name = $match['name'];
$address = $match['address'];
$phone = $match['phone'];
...
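
The captured values are raw HTML fragments, so they can still contain entities such as &amp; and &nbsp; as well as stray whitespace. A minimal sketch of a cleanup step, assuming UTF-8 input; the helper name cleanField is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): normalise one captured value.
function cleanField(string $value): string
{
    // Decode entities such as &amp; and &nbsp; into real characters.
    $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // &nbsp; decodes to a non-breaking space (U+00A0), which trim() ignores,
    // so convert it to an ordinary space first.
    $decoded = str_replace("\u{00A0}", ' ', $decoded);
    return trim($decoded);
}

// Example input shaped like the 'name' group captured from the first sample entry.
$name = cleanField('Ado’s Kitchen &amp; Bar&nbsp; ');
echo $name, PHP_EOL; // prints: Ado’s Kitchen & Bar
```

The same helper could be applied to $address, $phone, and the other captured groups before storing them.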

If you want to see in more detail how this regular expression works, see https://regex101.com/r/EYpXwi/1
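
Note that the pattern above only matches entries whose name is in a <b> tag, while the second sample in the question uses <strong>, and preg_match returns only the first entry. The following is one possible variant, not the answerer's pattern: it uses preg_match_all, accepts either tag, and cleans each field. It assumes the phone number consists only of digits, dashes, dots, spaces, and parentheses, and that every entry follows the layout shown in the question; the helper name normalizeField is hypothetical.

```php
<?php
// Sample input copied from the question: one <b> entry and one <strong> entry.
$htmlContent = <<<'HTML'
<p><b>Ado’s Kitchen &amp; Bar&nbsp; </b>1143 13th St., 720-465-9063; <a href="http://www.span-ishatthehill.com">span-ishatthehill.com.</a> Laid back restaurant with global menu. Open for breakfast and lunch daily and dinner Mon.-Sat.</p>

<p><strong>Blackbelly Market</strong> 1606 Conestoga St. #3, 303-247-1000; <a href="http://www.blackbelly.com">blackbelly.com</a>. Locavore dining, butchery and bar. Open daily for happy hour and dinner; see website for market hours.</p>
HTML;

// Hypothetical helper: decode entities, turn non-breaking spaces into spaces, trim.
function normalizeField(string $value): string
{
    $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim(str_replace("\u{00A0}", ' ', $decoded));
}

// (b|strong) accepts either tag and \1 requires the matching closing tag.
// Lazy quantifiers (.*?) keep each group from running into the next entry.
$pattern = '~<p>\s*<(b|strong)>(?<name>.*?)</\1>'
         . '(?<address>.*?),\s*(?<phone>[\d\-. ()]+);\s*'
         . '<a[^>]*href="(?<website>[^"]*)"[^>]*>.*?</a>'
         . '(?<description>.*?)</p>~su';

// PREG_SET_ORDER yields one array of groups per matched entry.
preg_match_all($pattern, $htmlContent, $matches, PREG_SET_ORDER);

$places = [];
foreach ($matches as $m) {
    $places[] = [
        'name'        => normalizeField($m['name']),
        'address'     => normalizeField($m['address']),
        'phone'       => normalizeField($m['phone']),
        'website'     => normalizeField($m['website']),
        // Drop the period that sometimes sits just outside the closing </a>.
        'description' => ltrim(normalizeField($m['description']), '. '),
    ];
}

print_r($places);
```

Entries that deviate from the assumed layout (for example, no phone number or no link) will not match this pattern, so it is worth comparing count($places) with the number of <p> elements in the page. For less regular markup, a DOM parser is usually more robust than a regular expression.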