Anchor tags in URL being crawled by Nutch

Asked by dstach on 04 December 2021 at 00:20 (48 views)

For some reason Nutch is crawling URLs that contain an anchor fragment (#). I even updated the regex-filter to exclude # from URLs during the crawl, but that doesn't seem to be working.

Any ideas what may be going on?
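
To make the filter behaviour in question concrete, here is a minimal sketch of a first-match URL filter, assuming the regex filter works the way the comments in Nutch's default filter file describe: each rule is a `+` or `-` followed by a regex, and the first rule that matches decides the outcome. The class `FragmentFilterSketch` and the rules `-#` and `+.` are hypothetical illustrations, not Nutch code and not my actual configuration. Under that assumption, a `-#` rule placed after a catch-all `+.` would never be reached, which may be one thing worth checking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a first-match regex URL filter (not Nutch's own class).
public class FragmentFilterSketch {

    // A rule is a sign ('+' accept, '-' reject) plus a regex searched anywhere in the URL.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, String regex) {
            this.accept = accept;
            this.pattern = Pattern.compile(regex);
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    public FragmentFilterSketch(List<String> lines) {
        for (String line : lines) {
            // Skip blank lines and comment lines (a line that STARTS with '#').
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', line.substring(1)));
        }
    }

    // The first matching rule decides; a URL that matches no rule is rejected.
    public boolean accept(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed ordering 1: the '#' exclusion comes before the catch-all.
        FragmentFilterSketch excludeFirst =
                new FragmentFilterSketch(Arrays.asList("-#", "+."));
        // Assumed ordering 2: the catch-all comes first, so "-#" is never consulted.
        FragmentFilterSketch catchAllFirst =
                new FragmentFilterSketch(Arrays.asList("+.", "-#"));

        String withFragment = "https://example.com/page#section";
        System.out.println(excludeFirst.accept(withFragment));
        System.out.println(catchAllFirst.accept(withFragment));
    }
}
```

If the real filter behaves like this sketch, only the first ordering rejects the URL with the fragment; the second accepts it because `+.` matches before `-#` is tried.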
