MongoDB query by datetime field 1h interval

395 Views Asked by At

I've been trying to extract one record each 1 hour from a collection with 1 record each 10 seconds on the datetime (Date). The document is like this:

datetime: 2021-11-26T15:08:20.000+00:00
owner_id: "INTERNAL"
motor_id: 24

The datetime document field is Date() type.

On this collection I've 1 record each 10 seconds on datetime field (several days), but I would like to extract just one record each 1 hour datetime interval. How could I archive this?

Thanks in advance for any tip, help or direction to the right documentation.

I was trying to follow this: https://mongoing.com/docs/reference/operator/aggregation/filter.html with no success. Not sure how to increment the hour.

EDIT: Some more information about the collection:

collection1 with 26.303 records like this:

{
  "_id": {
    "$oid": "644b34a3756b9c1dc169e20d"
  },
  "datetime": {
    "$date": {
      "$numberLong": "1637939300000"
    }
  },
  "customer_id": "INTERNAL",
  "owner_id": "INTERNAL",
  "GMHP": -0.54,
  "AMHP": 198.19,
  "OMHP": 428.883544921875,
  "AROP": 148.57066345214844,
  "OROP": 102.5451889038086
}

Each document has a 10 seconds interval stored in datetime field. I've plans to change those documents to time-series in the future. My goal is to select the documents on each hour, e.g:

1st document datetime: 2021-11-26T15:08:20.000+00:00 2nd document datetime: 2021-11-26T15:09:20.000+00:00 and so on...

I could safely ignore all fields between T15:08:30 to T15:09:10.

2

There are 2 best solutions below

1
nimrod serok On BEST ANSWER

One option is to simply use $dateToParts. If you have a document for each 10 seconds, the $match step will return the first document for each hour:

db.collection.aggregate([
  {$set: {
      datetimeParts: {
        $dateToParts: {
          date: "$datetime"
        }
      }
  }},
  {$match: {
      "datetimeParts.minute": 0,
      "datetimeParts.second": {$lt: 10}
  }},
  {$unset: "datetimeParts"}
])

See how it works on the playground example

0
Aislan Diego On

After some struggle I was trying to make the query sent by nimrod serok to work both on Mongo and DocumentDB. Then I realized how limited problematic is DocumentDB when using PyMongo or vanilla MongoDB driver.

The $set is not supported on DocumentDB, so I had to replace it by $addFields. After that I realized dateToParts is also not supported because this function expects a string format and I had a Date() type.

If by any chance anyone wants to achieve similar results on DocumentDB here is a workaround. Probably there are better solutions for that, however it worked for me with the same results as Mongo query.

(Im using PyMongo)

the entry_date determines the first datetime from the collection, it's optional. You can replace the minute and second variables to fit your range needs.

I hope it can help the DocumentDB users as well.

entry_date = db.collection.find_one(filter=filtering, sort=[('datetime', 1)])
    entry_date = entry_date['datetime']
    minute = entry_date.minute
    second = entry_date.second
    report = db.collection.aggregate([
        {
            "$addFields": {
                "hour": {"$hour": "$datetime"},
                "minute": {"$minute": "$datetime"},
                "second": {"$second": "$datetime"},
                "year": {"$year": "$datetime"},
                "month": {"$month": "$datetime"},
                "day": {"$dayOfMonth": "$datetime"}
            }
        },
        {
            "$match": {
                "minute": minute,
                "second": second
            }
        },
        {
            "$project": {
                "_id": 0,
                "datetime": 1,
                "hour": 1,
                "minute": 1,
                "second": 1,
                "year": 1,
                "month": 1,
                "day": 1
            }
        }
    ])