We are indexing our objects into Solr and letting users sort them by name. The sort field type is defined in schema.xml as follows:
<fieldType name="sortabletext" class="solr.TextField" sortMissingLast="true" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
Suppose I have the following names in my data:
- Test
- West
- itest
- end
When I sort by name in Solr, the names starting with an uppercase letter come first, followed by the lowercase ones:
- Test
- West
- end
- itest
I think this happens because the ASCII codes for uppercase letters are smaller than those for lowercase letters, but from the user's side this is not a good experience. Is there a way to customize this behavior so that the ordering matches what I would get from running a similar query on the database?
Standard TextFields don't sort intuitively because they are analyzed into tokens, and the pre-analyzed (raw) field value isn't stored because it is generally very long. Luckily, Solr offers SortableTextField, which stores the first 1024 characters (this limit is configurable) of the pre-analyzed field value as a docValues field and uses that value when sorting on the SortableTextField.
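
As a rough sketch of what that could look like in schema.xml: the field type from the question is switched to solr.SortableTextField, and the 1024-character limit is written out explicitly through the maxCharsForDocValues attribute. The field name name_sort is hypothetical, and the analyzer chain is simply copied from the question, so adjust both to your schema.

```xml
<!-- Sketch only: "name_sort" is a made-up field name; the analyzer is copied from the question. -->
<fieldType name="sortabletext" class="solr.SortableTextField" sortMissingLast="true"
           positionIncrementGap="100" maxCharsForDocValues="1024">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Field used only for sorting, e.g. sort=name_sort asc -->
<field name="name_sort" type="sortabletext" indexed="true" stored="false"/>
```

After changing the field type you will need to reindex. Because the docValues hold the pre-analyzed value rather than the lowercased token, it is worth checking the resulting order against the sample names above before relying on it.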