I am trying to scrape this webpage. In this webpage I have to get the job title and its location. Which I am able to get from my code. But the problem is coming that when I am sending it in XML, then only one detail is going from the array list.
I am using goutte CSS selector library and also please tell me how to scrap pagination in goutte CSS selector library.
here is my code:
$httpClient = new \Goutte\Client();
$response = $httpClient->request('GET', 'https://www.simplyhired.com/search?q=pharmacy+technician&l=American+Canyon%2C+CA&job=X5clbvspTaqzIHlgOPNXJARu8o4ejpaOtgTprLm2CpPuoeOFjioGdQ');
$job_posting_location = [];
$response->filter('.LeftPane article .SerpJob-jobCard.card .jobposting-subtitle span.JobPosting-labelWithIcon.jobposting-location span.jobposting-location')
->each(function ($node) use (&$job_posting_location) {
$job_posting_location[] = $node->text() . PHP_EOL;
});
$joblocation = 0;
$response->filter('.LeftPane article .SerpJob-jobCard.card .jobposting-title-container h3 a')
->each( function ($node) use ($job_posting_location, &$joblocation, $httpClient) {
$job_title = $node->text() . PHP_EOL; //job title
$job_posting_location = $job_posting_location[$joblocation]; //job posting location
// display the result
$items = "{$job_title} @ {$job_posting_location}\n\n";
global $results;
$result = explode('@', $items);
$results['job_title'] = $result[0];
$results['job_posting_location'] = $result[1];
$joblocation++;
});
function convertToXML($results, &$xml_user_info){
foreach($results as $key => $value){
if(is_array($value)){
$subnode = $xml_user_info->addChild($key);
foreach ($value as $k=>$v) {
$xml_user_info->addChild("$k",htmlspecialchars("$v"));
}
}else{
$xml_user_info->addChild("$key",htmlspecialchars("$value"));
}
}
return $xml_user_info->asXML();
}
$xml_user_info = new SimpleXMLElement('<root/>');
$xml_content = convertToXML($results,$xml_user_info);
$xmlFile = 'details.xml';
$handle = fopen($xmlFile, 'w') or die('Unable to open the file: '.$xmlFile);
if(fwrite($handle, $xml_content)) {
echo 'Successfully written to an XML file.';
}
else{
echo 'Error in file generating';
}
what i got in xml file --
<?xml version="1.0"?>
<root><job_title>Pharmacy Technician
</job_title><job_posting_location> Vallejo, CA
</job_posting_location></root>
what i want in xml file --
<?xml version="1.0"?>
<root>
<job_title>Pharmacy Technician</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
<job_title>Pharmacy Technician 1</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
<job_title>Pharmacy Technician New</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
and so on...
</root>
You overwrite the values in the
$resultsvariable. You're would need to do something like this to append:However here is no need to put the data into an array at all, just create the XML directly with DOM.
Both your selectors share the same start. Iterate the card and then fetch related data.