Nutch 1.19 / Solr 9.4.0: How to point Nutch to the Solr instance?


I've been trying to set up Solr and Nutch, following the tutorial here.

However, I'm stuck at the end of the tutorial, where it says: After that you need to point Nutch to the Solr instance:

(Nutch 1.15 and later) edit the file conf/index-writers.xml, see IndexWriters

How exactly should I edit the file to point Nutch to the Solr instance?

I looked at the linked IndexWriters page but couldn't find any answers.

I used the default core name "nutch".

I was able to get Nutch to crawl, but the crawled data does not show up in the Solr core.

There are 1 best solutions below

Jakob Berlin On

According to the IndexWriters documentation mentioned above, a minimal conf/index-writers.xml could look like this:

<writers xmlns="http://lucene.apache.org/nutch"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://lucene.apache.org/nutch index-writers.xsd">
  <writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      <!-- Full URL of the Solr core, including the /solr path. For a local Solr
           with the core "nutch" this would typically be
           http://localhost:8983/solr/nutch (host and port are assumptions). -->
      <param name="url" value="YOUR_SOLR_URL_HERE_INCL_SLASH_SOLR"/>
    </parameters>
    <mapping>
      <copy/>
      <rename/>
      <remove/>
    </mapping>
  </writer>
</writers>
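
To check whether documents actually reached the core after an indexing run, you can ask Solr how many documents it holds. Below is a rough sketch in Python using only the standard library. It assumes Solr is reachable at http://localhost:8983 and that the core is called "nutch", as in the question; the helper names (select_url, num_found, count_documents) are made up for this example and are not part of Nutch or Solr.

```python
import json
import urllib.parse
import urllib.request


def select_url(solr_core_url, query="*:*", rows=0):
    """Build a Solr /select URL that matches all documents but returns none of them."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return solr_core_url.rstrip("/") + "/select?" + params


def num_found(response_text):
    """Extract the total hit count from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def count_documents(solr_core_url):
    """Query the core over HTTP and return the number of indexed documents."""
    with urllib.request.urlopen(select_url(solr_core_url), timeout=10) as resp:
        return num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed URL: local Solr on the default port with a core named "nutch".
    print(count_documents("http://localhost:8983/solr/nutch"))
```

If this prints 0 although the crawl finished, the url parameter in conf/index-writers.xml is a likely place to look first; it should be the same core URL that is passed to count_documents here.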