Solr highlighting words in synonyms.txt along with query terms


I am new to Solr and am trying to find out what the usual solution is to the multi-word synonym problem with highlighting.
When we search for hair brush, the word toothbrush is also highlighted, apparently because toothbrush is in the synonyms.txt file:

HAIR BRUSH,HAIRBRUSH,HAIR-BRUSH,HAIRBRUSHES,HAIR BRUSHES 
TOOTH BRUSH,TOOTHBRUSH,TOOTH-BRUSH,TOOTHBRUSHES,TOOTH BRUSHES 
  1. Is this because SynonymGraphFilterFactory is used at both index time and query time?
  2. If not, what needs to be done so that terms that do not match the query term are not highlighted?

The schema.xml configuration for the fieldType is as follows:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[({.,\[\]/})]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" preserveOriginal="1" catenateAll="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[({.,\[\]/})]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

We are using Solr 6.5.1.
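
To make the symptom easier to reason about, here is a rough Python model of one possible explanation. It is not Solr code and is only a sketch under assumptions: with expand="false" every variant on a synonyms.txt line is rewritten to the first entry on that line, a multi-word first entry is split on whitespace, and the replacement tokens inherit the character offsets of the text they replaced. The function names (build_rules, analyze, highlight) are made up for illustration, and the stop word, word delimiter, folding, and stemming filters are ignored.

```python
# Simplified model of expand="false" synonym rewriting plus offset-based
# highlighting. Hypothetical helper names; this is NOT how Solr is implemented.

SYNONYM_LINES = [
    "HAIR BRUSH,HAIRBRUSH,HAIR-BRUSH,HAIRBRUSHES,HAIR BRUSHES",
    "TOOTH BRUSH,TOOTHBRUSH,TOOTH-BRUSH,TOOTHBRUSHES,TOOTH BRUSHES",
]


def build_rules(lines):
    """Map every variant (lowercased, whitespace-split) to the first entry's tokens."""
    rules = {}
    for line in lines:
        entries = [entry.strip().lower() for entry in line.split(",")]
        canonical = tuple(entries[0].split())
        for entry in entries:
            rules[tuple(entry.split())] = canonical
    return rules


def analyze(text, rules):
    """Return (term, start, end) triples after whitespace tokenizing and synonym rewriting."""
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((word.lower(), start, end))
        pos = end

    max_len = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):  # try the longest match first
            key = tuple(term for term, _, _ in tokens[i:i + n])
            if len(key) == n and key in rules:
                # Assumption: replacement tokens keep the offsets of the matched span.
                start, end = tokens[i][1], tokens[i + n - 1][2]
                out.extend((term, start, end) for term in rules[key])
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out


def highlight(doc, query, rules):
    """Return the document substrings whose rewritten terms match any query term."""
    query_terms = {term for term, _, _ in analyze(query, rules)}
    spans = sorted({(s, e) for term, s, e in analyze(doc, rules) if term in query_terms})
    return [doc[s:e] for s, e in spans]


RULES = build_rules(SYNONYM_LINES)
print(highlight("a pink toothbrush", "hair brush", RULES))  # ['toothbrush']
```

In this model the indexed word toothbrush becomes the two tokens tooth and brush, and the query hair brush contributes the term brush, so the whole word toothbrush is highlighted. If that is what happens in Solr too, the cause would be the multi-word first entries combined with expand="false" rather than the filter being present in both analyzers, but I have not confirmed this.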
