MongoDB aggregation of multiple collections with $lookup with CosmosDB (Mongo API Enabled)

81 Views Asked by At

Firstly let me note, I am aware the data modelling is questionable here in a non-relational DB. But it is inherited and we are attempting a tactical solution at this point.

We have 3 collections; collectionA, collectionB and collectionC. These have relationships of CollectionA is one-to-many to CollectionB and collectionB is one-to-many to collectionC. The child document holds a ref to the parent in all cases.

Now we have a single use case requiring us to retrieve an aggregate of all 3 in a performant manner. But we are getting differing results with our solution against a CosmosDB instance and a MongoDB test container (closest likeness).

Our attempt to do this as MongoDB API novices is through an aggregation pipeline consisting of 2 stages of $lookup. Importantly, we are using CosmosDB with Mongo API enabled, which has limited support for the actual Mongo API.

Attempting the following aggregation pipeline against CosmosDB actually gives us the result we'd hoped:

[
  {
    $lookup: {
      from: "collectionB",
      localField: "_id",
      foreignField: "CollectionBId",
      as: "collectionBList",
    },
  },
  {
    $lookup: {
      from: "collectionC",
      localField: "collectionBList._id",
      foreignField: "collectionCId",
      as: "collectionBList.collectionCList",
    },
  },
]

Result sample doc:

{
  _id: ...,
  collectionBList: [
    {
      ...
      collectionCList: [...]
    }
  ]
}

However, if we were to attempt the same with a mongo:4.2.0 test container we get a different result:

{
  _id: ...,
  collectionBList: {
      collectionCList: [...]
  }
}

Note collectionBList is an object.

Looking at MongoDB documentation for $lookup aggregation stage, it states "If the specified name already exists in the input document, the existing field is overwritten.", which I guess explains why the input document attribute collectionBList is overwritten for the second stage. But it appears CosmosDB behaves differently.

Part of my question is, have we interpreted this correctly and it is a known difference?

But my main question is, how would you achieve this aggregation with MongoDB aggregation stages?

Many thanks for your input!

0

There are 0 best solutions below