I'm looking for a way to create a static index to serve filtering/sorting queries for a combination of property values in the nested structure (a collection of objects) along with the structure-container. It seems to be not trivial due to the following reasons:
- If properties of the nested structure/collection are separated out into individual fields of the index (individual collections), then it makes impossible using the
ANDcondition when filtering on 2+ properties of the nested structure/collection. - The fanout index complexity (see an example), which makes any solution to run too slow.
Given the following persistent model:
public class Document
{
public string Title { get; set; }
public List<UserChange> RecentModifications { get; set; }
}
where
public class UserChange
{
public string UserId { get; set; }
public DateTime Timestamp { get; set; }
}
Question: How to build the index for Document to filter/sort by a combination of all the fields: Title, UserId and Timestamp?
Possible use cases:
- get all documents containing the word 'contract' for a certain user and a date range
- sort documents containing the word 'contract' by last modification made by the user.
P.S. I understand that the indexing restrictions can by bypassed by restructuring the persistence model - storing a structure for recently modified documents in the User document, but it'd impose some other limitations, which I'd like to avoid.
The problem can be solved by using Indexes with Dynamic Fields. It allows to keep logical data structure and avoids creating a fanout index.
Solution
Create the following index for the
Documentcollection from the above:Query on
MyIndexlikeExplanation
The specified index with dynamic fields will generate extra fields and terms for each record in the following format:
The format is important, because when you use a dictionary in your high-level query like
myDic[key], it gets transformed tomyDic_keyin the generated RQL. Hence, it will allow us to use these fields in queries.If you query with using the usual
Queryrather thanDocumentQuery(see docs), then you need a proper data type for LINQ to work. For that purpose I createdDocumentIndDtoclass, where myRecentModificationshas become a dictionary, so I could use it in a high-level query and get correct RQL likeFor more details, see my discussion on the subject with Oren Eini (aka Ayende Rahien).