AEM Oak Index Wonky Cost Estimation

Question

AEM Oak Index Wonky Cost Estimation

286 Views Asked by EsteBusta At 11 January 2021 at 19:12

i have a query like this:

SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE("/content/some/specific/path") 
AND LOWER([jcr:content/depth1/depth2/depth3/depth4/prop1]) = "someValue" 
AND LOWER([jcr:content/depth1/depth2/depth3/depth4/prop2]) = "someOtherValue"

And i've implemented an oak lucene-property index as follow:

 <myIndex
    jcr:primaryType="oak:QueryIndexDefinition"
    async="async"
    compatVersion="{Long}2"
    fulltextEnabled="{Boolean}false"
    includedPaths="[/content/some/specific/path]"
    type="lucene">
    <indexRules jcr:primaryType="nt:unstructured">
        <cq:Page jcr:primaryType="nt:unstructured">
            <properties jcr:primaryType="nt:unstructured">
                <prop1
                    jcr:primaryType="nt:unstructured"
                    name="jcr:content/depth1/depth2/depth3/depth4/prop1"
                    propertyIndex="{Boolean}true"/>
                <prop2
                    jcr:primaryType="nt:unstructured"
                    name="jcr:content/depth1/depth2/depth3/depth4/prop2"
                    propertyIndex="{Boolean}true"/>
            </properties>
        </cq:Page>
    </indexRules>
</myIndex>

Although "myIndex" is defined correctly it is more expensive (somehow) than the ootb index "ntBaseLucene", i've checked this ootb index and it is bigger in size and does not have this prop1 and prop2 which i am looking for as part of my query.

Here are some description of both as far as i could check:

MyIndex: (not being pick up)

Small in size
nodeType specific (cq:Page)
type Lucene
Cost is More expensive (while checking with queryDescription tool)

OOTB ntBaseLucene index (being pick up)

Big in size (at least 4 times bigger than mine)
No nodeType specific (nt:Base)
type Lucene
Cost is cheaper than mine (while checking with queryDescription tool)

Is there any reasonable reason why my index is more expensive than the ootb one? Any input is really appreciated.

Original Q&A

There are 2 best solutions below

**Paul Chibulcuteanu** · Answer 1 · 2021-03-03T12:39:33.807000

Have you tried generating the index via OAK index tool generator? It's free: http://oakutils.appspot.com/generate/index

Generated with the tool, the index looks something like:

  - compatVersion = 2
  - async = "async"
  - jcr:primaryType = oak:QueryIndexDefinition
  - evaluatePathRestrictions = true
  - type = "lucene"
  + indexRules 
   + cq:Page 
    + properties 
     + prop2 
      - name = "function*lower*@jcr:content/depth1/depth2/depth3/depth4/prop2"
      - propertyIndex = true
     + prop2_2 
      - name = "jcr:content/depth1/depth2/depth3/depth4/prop2"
      - propertyIndex = true
      - notNullCheckEnabled = true
     + prop1_3 
      - name = "jcr:content/depth1/depth2/depth3/depth4/prop1"
      - propertyIndex = true
      - notNullCheckEnabled = true
     + prop1 
      - name = "function*lower*@jcr:content/depth1/depth2/depth3/depth4/prop1"
      - propertyIndex = true

Besides the definition above make sure to add includedPaths but also excludedPaths to point to at least generic excluded paths that other lucene indexes have:

"/var","/etc/replication","/etc/workflow/instances","/jcr:system"

**guvlnce** · Answer 2 · 2022-09-13T05:33:59.387000

it's a bit old but for the other people I just want to mention that if we want to target our custom index then we can use option index syntax. Please follow this URL By default, queries will use the index with the lowest expected cost (as in relational databases). But some cases, it is needed to specify which index(es) should be considered for a query: https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Query_Option_Index_Tag "If you want to ensure a specific type of index is used for specific type of queries, for example custom indexes are used for custom queries. If there are multiple Lucene fulltext indexes with different aggregation rules, and the index data overlaps. In the query, you want to specify which aggregation rule to use. To temporarily work around limitations of the index implementation, for example incorrect cost estimation. To use index tags, add tags (a multi-valued String property) to the index(es) of choice, for example: /oak:index/lucene/tags = [x, y] Note each index can have multiple tags, and the same tag can be used in multiple indexes. The syntax to limit a query to a certain tag is: option(index tag ):

/jcr:root/content//element(*, nt:file)[jcr:contains(., 'test')] option(index tag x)

select * from [nt:file] where ischildnode('/content') and contains(*, 'test') option(index tag [x])

AEM Oak Index Wonky Cost Estimation

There are 2 best solutions below

Related Questions in INDEXING

Related Questions in LUCENE

Related Questions in AEM

Related Questions in JACKRABBIT-OAK

Trending Questions

Popular # Hahtags

Popular Questions