I have a MongoDB database where are storage different files, those files can be png, jpg, pdf, here an example:
{
"_id" : "id of the document",
"metadata" : {
"s" : "documents",
"u" : "id of the owner",
"d" : ""
},
"filename" : "El-jugador-Fedor-Dostoyevski.pdf",
"contentType" : "application/pdf",
}
I deleted some fields not relevant, the type of the document is given by the contentType field, what I want is to get the count of each type of file using Scala and ReactiveMongo
I already did it but doing three consultations, in this way:
def contentTypeStats(implicit ec: ExecutionContext): Future[ContentTypesDTO] = {
collectionFactory.collection().flatMap(collection => {
val filterDocuments = BSONDocument("metadata.s" -> BSONDocument("$ne" -> "thumbnail"))//don't want count thumbnails, only documents
val filterPNG = BSONDocument(filterDocuments, "contentType" -> "image/png")
val filterJPG = BSONDocument(filterDocuments, "contentType" -> "image/jpeg")
val filterPDF = BSONDocument(filterDocuments, "contentType" -> "application/pdf")
val countPNGFuture: Future[Long] = collection.count(Some(filterPNG))
val countJPGFuture: Future[Long] = collection.count(Some(filterJPG))
val countPDFFuture: Future[Long] = collection.count(Some(filterPDF))
for {
countPNG <- countPNGFuture
countJPG <- countJPGFuture
countPDF <- countPDFFuture
} yield {
ContentTypesDTO(
pngCount = countPNG,
jpgCount = countJPG,
pdfCount = countPDF
)
}
})
}
I would like to do this with only one consult, in MongoDB I do it this way:
db.getCollection('myCollection').aggregate([
{$match: {'metadata.s': {$ne: 'thumbnail'}}},
{$group: {_id: "$contentType", count: {$sum: 1}} }
])
That return me this:
/* 1 */
{
"_id" : "image/png",
"count" : 5.0
}
/* 2 */
{
"_id" : "application/pdf",
"count" : 9.0
}
/* 3 */
{
"_id" : "image/jpeg",
"count" : 8.0
}
I try this way:
def contentTypeStats(implicit ec: ExecutionContext): Future[ContentTypesDTO] = {
collectionFactory.collection().flatMap(collection => {
import collection.AggregationFramework._
val result: Future[Option[BSONDocument]] = collection.aggregatorContext[BSONDocument](
pipeline = List(
Match(BSONDocument("metadata.s" -> BSONDocument("$ne" -> "thumbnail"))),
Group(BSONDocument("_id" -> "$contentType"))("count" -> SumAll)
)
).prepared.cursor.headOption
result.map {
case Some(doc) =>
println(doc.getAsOpt[String]("_id"))//here always return None
ContentTypesDTO(
pngCount = doc.getAsOpt[Long]("count").getOrElse(0L),
jpgCount = doc.getAsOpt[Long]("count").getOrElse(0L),
pdfCount = doc.getAsOpt[Long]("count").getOrElse(0L)
)//all have the same number
}
})
}
That method return None when ask for the _id and the count field give randomly some of the previous results (5, 8, 9), it should be a way to access the specific count field of each _id that should be either image/png or image/jpeg or application/pdf but how if I can get the _id
There were a couple of problems in my solution
here I was mapping the
_idas aBSONDocument, that's why a could not get that_id, so the solution was to mapped as aBSONStringin this wayThe
Groupmethod always create the_idfield with the first parameter. The second problem was returning the result, here.prepared.cursor.headOptiononly return me the first BSONDocument from the created group. To fixed using the Cursor class from reactivemongo.prepared.cursor.collect[List](-1, Cursor.FailOnError[List[BSONDocument]]())that return me a List of BSONDocument After that also change theContentTypesDTOto just beAnd using map on the
resultto get aSeq[ContentTypesDTO]Here the final solution:That method returns this: