I'm building a web crawler, but I'm getting an error because apparently I can't attach DOM nodes from more than one document to the same crawler. I think I need to reset or otherwise manipulate the DOM between pages, but I have no idea how to do it.
I'm using Symfony DomCrawler and Sunra's PhpSimple HtmlDomParser.
Code:
$crawler = $this->crawler;
$crawler->addHtmlContent(HtmlDomParser::file_get_html($url, false, null, 0));

// Getting the URL data
$crawler
    ->filter('a')
    ->each(function (Crawler $node) use ($url): void {
        $url_fr_hrf = $node->attr('href');
        // Turn relative links ("/..." or "#...") into absolute ones
        if (str_starts_with($url_fr_hrf, '/') || str_starts_with($url_fr_hrf, '#')) {
            $url_fr_hrf = $url . $node->attr('href');
        }
        $this->datas = [
            'url' => $url_fr_hrf,
        ];
        // Checking the URLs
        if (substr_count($this->datas['url'], '/') > 4 && parse_url($this->datas['url'], PHP_URL_HOST) === parse_url($url, PHP_URL_HOST)) {
            // Not searching the deeper links
        } else {
            $check = $this->db->db->prepare("SELECT * FROM crawler WHERE url = ?");
            $check->execute([$this->datas['url']]);
            $check_f = $check->fetch(PDO::FETCH_ASSOC);
            // fetch() returns false when there is no matching row
            if ($check_f && $check_f['url'] === $this->datas['url']) {
                // URL already exists
            } else {
                $insert = $this->db->db->prepare("INSERT INTO crawler SET url = ?");
                $insert->execute([$this->datas['url']]);
            }
        }
        $this->url = $this->datas['url'];
        usleep(500000); // sleep() only accepts whole seconds, so wait half a second with usleep()
    });

//echo $url . PHP_EOL;
$ins = $this->db->db->prepare("SELECT * FROM crawler");
$ins->execute();
while ($links = $ins->fetch(PDO::FETCH_ASSOC)) {
    $this->request($links['url']);
}
Error: Uncaught InvalidArgumentException: Attaching DOM nodes from multiple documents in the same crawler is forbidden. in...
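From what I can tell, the exception is thrown the second time addHtmlContent() runs on the same crawler instance (the while loop at the end calls $this->request() again for every stored URL). A minimal sketch of what I think is happening, separate from my real code:

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent('<html><body><a href="/a">A</a></body></html>');
// The second call parses a new DOMDocument, so its nodes belong to a
// different document than the one the crawler already holds:
$crawler->addHtmlContent('<html><body><a href="/b">B</a></body></html>');
// => InvalidArgumentException: Attaching DOM nodes from multiple documents
//    in the same crawler is forbidden.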
How can I solve this error?
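My only guess so far is that I have to drop the previous page's nodes before loading the next page, either with $crawler->clear() or by creating a fresh Crawler for every request. A rough sketch of what I mean (assuming the code above lives in my request() method; I haven't verified this is the right fix):

public function request(string $url): void
{
    $crawler = $this->crawler;
    $crawler->clear(); // remove the nodes of the previously loaded document?
    $crawler->addHtmlContent(HtmlDomParser::file_get_html($url, false, null, 0));
    // ... the rest of the crawling code from above ...
}

Would that be the right way to do it, or should I structure the crawler differently?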