Anchor tags in URL being crawled by Nutch

For some reason Nutch is crawling URLs that contain an anchor fragment (#). I even updated the regex-filter to exclude URLs containing # during the crawl, but that doesn't seem to be working.
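
To make the intent concrete, here is a minimal Java sketch of the check such a filter rule is meant to perform. It is not Nutch's own code: the class and method names are hypothetical, the example URLs are made up, and it only assumes the rule boils down to a Java regex that is searched for anywhere in the URL.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: illustrates the intended behaviour
// of a "reject URLs containing #" rule and of stripping the fragment instead.
public class FragmentFilterSketch {
    // Assumed pattern: a literal '#' anywhere in the URL.
    private static final Pattern FRAGMENT = Pattern.compile("#");

    // True if the URL carries a fragment and should be rejected by the rule.
    static boolean hasFragment(String url) {
        return FRAGMENT.matcher(url).find();
    }

    // Alternative to rejecting: drop everything from the first '#' onward.
    static String stripFragment(String url) {
        int i = url.indexOf('#');
        return i < 0 ? url : url.substring(0, i);
    }

    public static void main(String[] args) {
        String url = "http://example.com/page#section";
        System.out.println(hasFragment(url));
        System.out.println(stripFragment(url));
    }
}
```

If a standalone check like this behaves as expected on the offending URLs, the pattern itself is probably fine and the question becomes whether the rule is actually being applied during the crawl.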

Any ideas what may be going on?
