How to optimize a QueryBuilder query


I have this query -

group.p.or=true
type=cq:Page
p.limit=10
group.1_group.path=/content/path/path1
group.1_group.1_group.p.or=true
group.1_group.1_group.1_property.value=false
group.1_group.1_group.2_property=jcr:content/pageTemplateType
group.1_group.1_group.1_property=jcr:content/pageTemplateType
group.1_group.1_group.2_property.operation=unequals
group.1_group.1_group.1_property.operation=exists
group.1_group.1_group.2_property.value=template
group.1_group.path.self=true

group.2_group.path=/content/path/path-2
group.2_group.1_group.p.or=true
group.2_group.1_group.1_property.value=false
group.2_group.1_group.2_property=jcr:content/pageTemplateType
group.2_group.1_group.1_property=jcr:content/pageTemplateType
group.2_group.1_group.2_property.operation=unequals
group.2_group.1_group.1_property.operation=exists
group.2_group.1_group.2_property.value=template
group.2_group.path.self=true

What I am trying to do is query multiple paths and return the pages whose pageTemplateType property is not equal to 'template' or that do not have the pageTemplateType property at all.

This query works fine, but it takes more than 1 second. If I just remove self, i.e. group.2_group.path.self=true or group.1_group.path.self=true, it takes only around 0.02 seconds. So I do not understand how to optimize it or how to use self efficiently.

Answer by Alexander Berndt (best answer)

Unfortunately, a query with path.self=true will never be fast in AEM. It is better to use a different query, without path.self=true.

Internally, the QueryBuilder uses the PredicateEvaluators to construct an XPath query via PredicateEvaluator.getXPathExpression(...). This translation is done on a best-effort basis. The results of the XPath query are then filtered by all remaining predicates (those that could not be fully converted to XPath) via PredicateEvaluator.includes(...). See the Predicate API.

Now the PathPredicateEvaluator has a problem. JCR XPath (at least in Jackrabbit) does not support the descendant-or-self axis, so the XPath query /content/path/path1/descendant-or-self::node() is not supported. As a result, the XPath query searches the entire repository, and the path predicate is applied as a filter afterwards. It would probably have been better to search from the parent node instead of from the repository root, but that is the way it is implemented.

You can check that in the query debugger. http://localhost:4502/libs/cq/search/content/querydebug.html


You can test this behaviour with two simplified queries in the query debugger:

type=cq:Page
path=/content/path/path1

The above query is translated directly into the following XPath:

/jcr:root/content/path/path1//element(*, cq:Page)

Now with self=true:

type=cq:Page
path=/content/path/path1
path.self=true

The above query is translated into the following XPath:

//element(*, cq:Page)

After iterating over all pages, the XPath result set is filtered with the following Java predicate:

{path=path: path=/content/path/path1, self=true}
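The effect of that post-filter can be pictured with a small Python model. This is only a sketch of the semantics described above, not the actual PathPredicateEvaluator code; the function name `path_self_matches` and the sample page list are made up for illustration. The point is that the check itself is cheap, but it runs against every cq:Page the unrestricted XPath returns.

```python
def path_self_matches(root: str, candidate: str) -> bool:
    """Model of path=<root> with self=true: the root itself or any descendant."""
    root = root.rstrip("/")
    return candidate == root or candidate.startswith(root + "/")


# Hypothetical rows standing in for every cq:Page found by //element(*, cq:Page).
all_pages = [
    "/content/path/path1",
    "/content/path/path1/child",
    "/content/path/path10",
    "/content/other/page",
]

# Every row has to be visited, even though only a few survive the filter.
hits = [p for p in all_pages if path_self_matches("/content/path/path1", p)]
print(hits)
```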

Suggestions for your issue:

  1. Maybe you can use JCR-SQL2 queries or a custom XPath query. For queries this complicated, the QueryBuilder is not the best choice anyway. But if you need to run the query from the front end, it is better to stick with the QueryBuilder, as only the QueryBuilder has a REST API. The other options require Java code.

  2. You are searching for page attributes. Instead of searching for cq:Page, you can search for cq:PageContent. cq:PageContent nodes are always child nodes (even for the root page of the search path), so you don't need to include self. You only need to remove /jcr:content from the result path (the page is the parent of the result node).

Here is the query for approach 2:

group.p.or=true
type=cq:PageContent
p.limit=10
group.1_group.path=/content/path/path1
group.1_group.1_group.p.or=true
  group.1_group.1_group.1_property=pageTemplateType
  group.1_group.1_group.1_property.operation=exists
  group.1_group.1_group.1_property.value=false

  group.1_group.1_group.2_property=pageTemplateType
  group.1_group.1_group.2_property.operation=unequals
  group.1_group.1_group.2_property.value=template


group.2_group.path=/content/path/path-2
group.2_group.1_group.p.or=true
  group.2_group.1_group.1_property=pageTemplateType
  group.2_group.1_group.1_property.operation=exists
  group.2_group.1_group.1_property.value=false

  group.2_group.1_group.2_property=pageTemplateType
  group.2_group.1_group.2_property.operation=unequals
  group.2_group.1_group.2_property.value=template
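If the query is sent through the QueryBuilder REST API, the predicates above and the "remove /jcr:content" step could be scripted roughly as follows. This is a minimal sketch: the helper names `build_predicates` and `page_path` are hypothetical, and it assumes the hits come back with their node path and that the servlet is reachable at the usual /bin/querybuilder.json on your instance.

```python
from urllib.parse import urlencode


def build_predicates(paths, limit=10):
    """Build the cq:PageContent predicates of approach 2 for any number of paths."""
    params = {"type": "cq:PageContent", "p.limit": str(limit), "group.p.or": "true"}
    for i, path in enumerate(paths, start=1):
        outer = f"group.{i}_group"
        inner = f"{outer}.1_group"
        params[f"{outer}.path"] = path
        params[f"{inner}.p.or"] = "true"
        # Either pageTemplateType is missing ...
        params[f"{inner}.1_property"] = "pageTemplateType"
        params[f"{inner}.1_property.operation"] = "exists"
        params[f"{inner}.1_property.value"] = "false"
        # ... or it is present with a value other than 'template'.
        params[f"{inner}.2_property"] = "pageTemplateType"
        params[f"{inner}.2_property.operation"] = "unequals"
        params[f"{inner}.2_property.value"] = "template"
    return params


def page_path(hit_path):
    """Map a jcr:content hit back to the path of its parent page."""
    suffix = "/jcr:content"
    return hit_path[: -len(suffix)] if hit_path.endswith(suffix) else hit_path


params = build_predicates(["/content/path/path1", "/content/path/path-2"])
query_string = urlencode(params)  # to be appended to /bin/querybuilder.json?
print(page_path("/content/path/path1/jcr:content"))
```

Generating the numbered groups in a loop keeps the predicate names consistent when more search paths are added later.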

A starter XPath for approach 1, which searches /content/path/path1 and /content/path/path-2, would be the following (first an XPath union, then the property condition):

(
    /jcr:root/content/path/element(path1, cq:Page)
  | /jcr:root/content/path/path1//element(*, cq:Page)
  | /jcr:root/content/path/element(path-2, cq:Page)
  | /jcr:root/content/path/path-2//element(*, cq:Page)
)
[
  jcr:content/@pageTemplateType != 'template'
  or not(jcr:content/@pageTemplateType)
]

PS: Whether the above XPath works depends on your indexes and your content. It probably needs some improvement.
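For more than two search roots, the union can be generated instead of written by hand. The following Python sketch only assembles the query string in the same shape as the starter XPath; the function name `build_union_xpath` is hypothetical, and it assumes that every path segment is already a valid XPath name (no escaping or encoding of special characters is attempted).

```python
import posixpath


def build_union_xpath(paths, prop="pageTemplateType", excluded="template"):
    """Build a union XPath covering each path itself plus all of its descendants."""
    branches = []
    for path in paths:
        path = path.rstrip("/")
        parent, name = posixpath.split(path)
        parent = parent.rstrip("/")  # avoids a double slash for top-level paths
        # The page at the search root itself (what self=true would add) ...
        branches.append(f"/jcr:root{parent}/element({name}, cq:Page)")
        # ... and every page below it.
        branches.append(f"/jcr:root{path}//element(*, cq:Page)")
    union = "\n  | ".join(branches)
    return (
        f"(\n    {union}\n)\n"
        f"[\n  jcr:content/@{prop} != '{excluded}'\n"
        f"  or not(jcr:content/@{prop})\n]"
    )


xpath = build_union_xpath(["/content/path/path1", "/content/path/path-2"])
print(xpath)
```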