On my website I have a page for the cart, http://www.example.com/cart, and another for the cartoons, http://www.example.com/cartoons. What should I write in my robots.txt file to block only the cart page?
The cart page does not accept a trailing slash in the URL, so if I write `Disallow: /cart`, it will block /cartoons too.
I don't know whether something like `/cart$` is possible and would be parsed correctly by crawlers. I don't want to add `Allow: /cartoon` explicitly, because there may be other pages with the same prefix.
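You could explicitly allow and disallow both paths. When several rules match a URL, the most specific one takes precedence, where specificity is measured by the length of the rule's path, so a longer Allow rule wins over a shorter Disallow rule.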
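A minimal sketch of that approach, assuming the crawler applies longest-match precedence as Google's does. The `User-agent: *` line is an assumption added only to make the fragment complete:

```
User-agent: *
# /cartoons matches both rules; the longer Allow path wins
Allow: /cartoons
# /cart matches only this rule, so it is blocked
Disallow: /cart
```

Note that any other URL starting with /cart still falls under the Disallow rule unless it gets its own Allow line. If that is a concern, the `/cart$` form from the question may be the better fit: Google documents `$` as an end-of-URL anchor, and it is part of the Robots Exclusion Protocol standard (RFC 9309), but older or simpler crawlers may not honor it. A sketch under the same assumptions:

```
User-agent: *
# Blocks exactly /cart; /cartoons and other longer paths are unaffected.
# A URL with a query string, such as /cart?item=1, would likely not match.
Disallow: /cart$
```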
More info is available at: https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt