How to scale up window function based query on enterprise couchbase?

31 Views Asked by At

Background

This question is too broad probably, but the context here is, i have a couchbase 7.1 enterprise server, and i am running with 3 index node and 3 data node setup.

I am having secondary index on my collection and not having any primary index.

I have written a query with window function row_number to partition and order the documents inside collection

  1. all the where clause entries (columns) in my query is part of my secondary index.
  2. my partition by clause (column on which i am partitioning) is also part of my secondary index.

still the query execution is taking more time around 4-5 seconds. I am having around 200k documents on the collection on which i am querying.

Query : (Please note this is anonymised query to give perspective and not actual query)

WITH data_from_index AS (
SELECT *, ROW_NUMBER(PARTITION BY col1 ORDER BY col2) as num
FROM regions
WHERE country = "usa" AND state="ny" AND language="spanish"
),
ranking AS (
SELECT *
FROM data_from_index
WHERE num=1
),
sorted AS (
SELECT * FROM ranking 
ORDER BY col3
OFFSET 0 LIMIT 500
)
SELECT * FROM sorted;

In above query analysis what i found is the first clause which reads data from collection itself is taking 95% of the time, and it is able to use the index which we have, but it is selecting around 100k records out of my total 200k records, and then in subsequent clauses, it is doing sorting and paging on those 100k.

Question

What are the other things which I should keep in mind to optimize query performance in couchbase, apart from things mentioned in background section, and if I am missing something fundamental here, please point out.

1

There are 1 best solutions below

0
vsr On

Window functions will not reduce the any final result. PARTITION BY, ORDER BY required sort. Which depends on the result set size. Best way diagnosis is:

  1. Run the query in Query workbench
  2. Go to Plan Text Tab
  3. Take look each operator ItemsIn, ItemsOut, various times and see https://www.couchbase.com/blog/optimize-n1ql-performance-using-request-profiling/