Performance or index usage of the queries on an optional key in MongoDB

Question

Performance or index usage of the queries on an optional key in MongoDB

72 Views Asked by Bear Bile Farming is Torture At 22 March 2024 at 23:57

When an optional key in a collection is queried , performance varied with respect to the following two queries.

Good response : When it queried with a specific value - a value other than null, it resulted good response time.
Poor response : When it queried with the operator $exists, it resulted poor response time.

This difference in performance should be due to difference in the execution plans chose by query optimiser. However I am not able to understand the reasoning behind it.

Therefore I request if you can explain the reason for this difference in performance, it would help me to increase and confirm my understanding.

Collection Statistics

Number of documents : 100,000
Name of the optional key : field_A
Present status of field_A : It does not exists in any of the documents.
Index : { field_A : 1}

The two queries:

Query 1 - with Good response time

Query document : { field_A : }
Execution time : 0.5 seconds
Number of documents returned : 0

Query 2 - with poor response time

Query document : { field_A : { $exists : true } }
Execution time : 5 seconds
Number of documents returned : 0

Original Q&A

There are 2 best solutions below

**Wernfried Domscheit** · Answer 1 · 2024-03-23T07:24:55.927000

Most likely it uses the index.

An index (more precisely a B*Tree index) hss a sorted list of all your values. So, in your case the index hold 100k times the value undefined or similar. You have to read all these values to determine whether the value is undefined or not.

On the other scenario, when you look for some_value, then you know the result already after reading the first entry from the index, because it would appear before the first undefined value in case the value would exists.

Don't take above explanation too literally, it just explains the principle.

Just a note, many database do not store any value in index if the value does not exist. In your case the index size would be 0 Bytes. In such database your index would be useless. However, MongoDB stores also undefined values in the index.

**We do the best for You** · Answer 2 · 2024-03-23T08:07:51.583000

The case is basically about “indexing an optional key”.

An index on an optional key is complete: It is complete in the sense that the number of documents in the index and collection will be the same. It is also to note that for the documents wherein this key is either “not existing” or “null” will also be included in the index. The reason behind this kind of indexing is that by default indexes are non-unique or in other words the duplicated key values are also included in the index.

Now on to this case:

In this case the optional key is “field_A”. Let us take a sample collection as below. It has 4 documents and the index on the key “field_A” will also have 4 documents. The 3 documents be indexed under the same value - null and 1 document with a value 1. This behaviour has been explained under the section “Sparse index” in the manual:

Sample collection :
[
  { field_A: 1, field_B: ‘a’ },
  { field_B: ‘b’ },
  { field_B: ‘c’ },
  { field_B: ‘d’ } ]

Sample index - field_A :
[ { field_A: null } pointed to the document { field_B: ‘a’ }  ,
  { field_A: null } pointed to the document { field_B: ‘b’ }  ,
  { field_A: null } pointed to the document { field_B: ‘c’ }  ,
  { field_A: 1 } pointed to the document    { field_A: 1, field_B: ‘d’ } ]

First query - { field_A: "some_value" }

Now the query under this context, { field_A: "some_value" } will be supported by this index. In a sample case, while querying for a value 1, as the indexes are ordered, it will find the value 1 quickly and will return the document. Therefore this query should be seen performant.

Second query - { field_A: { $exists: true } }

This query will not use this index therefore it will perform a collection scan. Thus it will take time. The reason for not using the index has been explained in the manual under section “$exists” as below :

“Queries that use { $exists: true } on fields that use a non-sparse index or that use { $exists: true } on fields that are not indexed examine all documents in a collection.”

In this case the index will not be used since the index is a non-sparse index. This query calls for a sparse-index.

Sparse index https://www.mongodb.com/docs/manual/core/index-sparse/#std-label-index-type-sparse:~:text=%E2%86%92%20MongoDB%20Manual-,Sparse%20Indexes,-Sparse%20indexes%20only

$exists https://www.mongodb.com/docs/manual/reference/operator/query/exists/#:~:text=a%20FETCH.-,Queries%20that%20use,-%7B%20%24exists%3A%20true%20%7D

Thanks WeDoTheBest4You

Performance or index usage of the queries on an optional key in MongoDB

There are 2 best solutions below

Related Questions in MONGODB

Related Questions in MONGODB-QUERY

Related Questions in AGGREGATION-FRAMEWORK

Related Questions in MONGODB-INDEXES

Trending Questions

Popular # Hahtags

Popular Questions