Let's Understand MongoDB Aggregations

Posted by: Hemant Samriya | 31st March 2022

Definition: The aggregation feature in MongoDB lets you process multiple documents and compute results from them. For example, you can group values from multiple documents together, perform operations on the grouped data to return a meaningful result, and then analyze that result.

There are three ways to perform aggregations:
    1. Aggregation pipeline
    2. Single-purpose aggregation methods
    3. Map-reduce operations [deprecated as of MongoDB 5.0]

  • Aggregation Pipeline:
    • We use the aggregate() method to implement this technique and pass it an array of stages. Processing starts at the first stage, and the output of each stage becomes the input of the next. This continues through the last stage, so the stages work as a pipeline.
              syntax: db.collectionName.aggregate([
                                { $match : { … } },
                                { $group : { … } },
                                { $sort : { … } }
                              ], options)

              What is options here? It is an optional document that adjusts how the aggregation runs. For example, each pipeline stage can use up to 100 MB of memory; a stage that exceeds this limit throws an error. To resolve this, we can set the allowDiskUse option, which lets stages write temporary data to disk.
                  ex: db.collectionName.aggregate(pipeline, { allowDiskUse : true })
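To make the "output of one stage feeds the next" idea concrete, here is a minimal sketch in plain JavaScript that imitates a three-stage pipeline on an in-memory array. It is not how MongoDB runs aggregations internally; the rides data, the city and fare fields, and the helper names are all hypothetical.

```javascript
// Hypothetical sample documents; in MongoDB these would live in a collection.
const rides = [
  { id: 1, city: "Delhi", fare: 120 },
  { id: 2, city: "Jaipur", fare: 80 },
  { id: 3, city: "Delhi", fare: 200 },
  { id: 4, city: "Jaipur", fare: 40 },
];

// Each "stage" is modeled as a function from an array of documents to a new array.
// Imitates { $match: { fare: { $gte: 50 } } }
const matchFareAtLeast50 = (docs) => docs.filter((d) => d.fare >= 50);

// Imitates { $group: { _id: "$city", total: { $sum: "$fare" } } }
const groupTotalByCity = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.city, (totals.get(d.city) || 0) + d.fare);
  }
  return [...totals].map(([city, total]) => ({ _id: city, total }));
};

// Imitates { $sort: { total: -1 } }
const sortByTotalDesc = (docs) => [...docs].sort((a, b) => b.total - a.total);

// Run the stages in order: the output of one stage is the input of the next.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const result = runPipeline(rides, [
  matchFareAtLeast50,
  groupTotalByCity,
  sortByTotalDesc,
]);
console.log(result);
// [ { _id: 'Delhi', total: 320 }, { _id: 'Jaipur', total: 80 } ]
```

Against a real collection (assumed here to be named rides), the same stages would be passed as an array to db.rides.aggregate(), optionally with { allowDiskUse: true } as the second argument.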
    • An aggregate() pipeline is built from these three components:
      • Stage:
                        i. $match:
        It filters the documents by a condition, which reduces the number of documents passed to the next stage.
                        syntax: { $match: { <condition> } }
                        ii. $project:
        It selects fields from the documents and lets you reshape them (include, exclude, rename, or compute fields) according to your requirements.
                        syntax: { $project: { <requirement(s)> } }
                        iii. $group:
        It groups the documents by the value of an expression and outputs one document per group.
                        syntax: { $group: {
                                  _id: <expression>,
                                  <field1>: { <accumulator> : <expression> },
                                  <fieldN>: { <accumulator> : <expression> }
                                } }
                        iv. $sort:
        It sorts the documents. The sorting order is 1 for ascending and -1 for descending.
                        syntax: { $sort: { <field1>: <sorting order>, <fieldN>: <sorting order> ... } }
                        v. $skip:
        It skips the first N documents and passes the remaining documents to the next stage.
                        syntax: { $skip: <integer value> }
                        vi. $limit:
        It passes only the first N documents to the next stage.
                        syntax: { $limit: <integer value> }
                        vii. $unwind:
        It splits an array field in each document and outputs one document per array element. The field path is the array field's name prefixed with $.
                        syntax: { $unwind: <field path> }
                        viii. $out:
        It writes the results to a collection and must be the last stage of the pipeline.
                        syntax: { $out: { db: "<db-name>", coll: "<new-collection-name>" } }
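As a rough illustration of what $unwind, $skip, and $limit do to the documents flowing through a pipeline, the sketch below imitates them on an in-memory array and then shows the matching stage documents. The orders data and its items field are hypothetical, and the helper functions only approximate the behavior for documents that actually contain the array.

```javascript
// Hypothetical order documents with an array field named "items".
const orders = [
  { _id: 1, customer: "A", items: ["pen", "book"] },
  { _id: 2, customer: "B", items: ["bag"] },
];

// Imitates { $unwind: "$items" }: one output document per array element.
const unwindItems = (docs) =>
  docs.flatMap((d) => d.items.map((item) => ({ ...d, items: item })));

// Imitates { $skip: n } followed by { $limit: m }.
const skipThenLimit = (docs, n, m) => docs.slice(n, n + m);

const unwound = unwindItems(orders);
const page = skipThenLimit(unwound, 1, 2);

// The same steps written as stage documents for aggregate().
const pipeline = [
  { $unwind: "$items" },
  { $skip: 1 },
  { $limit: 2 },
];

console.log(unwound.length); // 3
console.log(page);
// [ { _id: 1, customer: 'A', items: 'book' }, { _id: 2, customer: 'B', items: 'bag' } ]
```

Placing $skip before $limit, as here, is a common way to page through results, usually after a $sort so that the order is well defined.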
      • Expression: An expression typically refers to a field of the incoming documents, written as the field name prefixed with $ (for example, "$fare").
      • Accumulator: Accumulators are operators used mainly in the $group stage to compute a value for each group.
                        i. $sum: returns the sum of the numeric values.
                        ii. $count: returns the number of documents in the group (available from MongoDB 5.0).
                        iii. $avg: returns the average of the given values.
                        iv. $min: returns the minimum value in the group.
                        v. $max: returns the maximum value in the group.
                        vi. $first: returns the value from the first document in the group.
                        vii. $last: returns the value from the last document in the group.
                        ex: { $group: { _id: "$id", total: { $sum: "$fare" } } }
                        Here, $group is the stage, "$id" and "$fare" are expressions that refer to fields of the document, and $sum is the accumulator.
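The accumulators listed above can be pictured with a small plain JavaScript sketch that groups documents by id and computes the same kinds of values by hand. The trips data and the output field names are hypothetical, and this only imitates what $group would return; it is not MongoDB's implementation.

```javascript
// Hypothetical trip documents with the "id" and "fare" fields from the example above.
const trips = [
  { id: "cab-1", fare: 100 },
  { id: "cab-2", fare: 50 },
  { id: "cab-1", fare: 300 },
  { id: "cab-1", fare: 200 },
];

// Imitates a $group stage with _id: "$id" and one output field per accumulator.
function groupById(docs) {
  const groups = new Map();
  for (const d of docs) {
    if (!groups.has(d.id)) groups.set(d.id, []);
    groups.get(d.id).push(d.fare);
  }
  return [...groups].map(([id, fares]) => {
    const total = fares.reduce((a, b) => a + b, 0);
    return {
      _id: id,
      total,                         // like { $sum: "$fare" }
      count: fares.length,           // like { $sum: 1 }
      average: total / fares.length, // like { $avg: "$fare" }
      lowest: Math.min(...fares),    // like { $min: "$fare" }
      highest: Math.max(...fares),   // like { $max: "$fare" }
      first: fares[0],               // like { $first: "$fare" }
      last: fares[fares.length - 1], // like { $last: "$fare" }
    };
  });
}

const summary = groupById(trips);
console.log(summary[0]);
// { _id: 'cab-1', total: 600, count: 3, average: 200, lowest: 100, highest: 300, first: 100, last: 200 }
```

Note that $first and $last depend on the order in which documents reach the $group stage, so they are usually paired with an earlier $sort.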
  • Single Purpose aggregation methods:
    • These are methods called directly on a collection, each performing one specific operation and returning its result. They are quite simple to use but lack the capabilities of the aggregation pipeline.
      • ex: count(), distinct() etc.
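As a rough picture of what these two methods report, the sketch below computes a count and a list of distinct values over an in-memory array. The documents and the city field are hypothetical; on a real collection (assumed to be named rides) the corresponding call for the second value would be db.rides.distinct("city").

```javascript
// Hypothetical documents standing in for a collection.
const docs = [
  { city: "Delhi" },
  { city: "Jaipur" },
  { city: "Delhi" },
];

// Roughly what a count over the whole collection reports.
const count = docs.length;

// Roughly what distinct("city") reports: each value once.
const distinctCities = [...new Set(docs.map((d) => d.city))];

console.log(count);          // 3
console.log(distinctCities); // [ 'Delhi', 'Jaipur' ]
```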


  • Map-Reduce Operations:
    • Map-reduce is deprecated as of MongoDB 5.0. It is used on large data sets and returns a computed aggregate result. The mapReduce() function performs this operation. It takes a map function, a reduce function, and an options document; the commonly used parts are:
          a. Map function: maps each document to key-value pairs by calling emit(key, value).
          b. Reduce function: combines all the values emitted for a key into a single result.
          c. Query: an option that filters the documents passed to the map function.
          d. Out: an option that specifies where the result is written, such as a new collection.
          The map and reduce phases run as separate steps, which is effective on a large data set.
    • syntax: db.collectionName.mapReduce(
                  function() { emit(this.key, this.value); },
                  function(key, values) { return <calculated_result>; },
                  {
                    query : { <condition> },
                    out: "<coll_name>"
                  }
              )
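The map, reduce, and query parts can be sketched in plain JavaScript as follows. This is only an in-memory imitation under stated assumptions: the trips data is hypothetical, emit is passed to the map function as an argument here (in MongoDB it is available inside the map function without being passed), and the query is modeled as a predicate function rather than a query document.

```javascript
// Hypothetical trip documents.
const trips = [
  { id: "cab-1", fare: 100 },
  { id: "cab-2", fare: 50 },
  { id: "cab-1", fare: 300 },
  { id: "cab-3", fare: 20 },
  { id: "cab-1", fare: 200 },
];

// Imitation of mapReduce(): filter, map to key-value pairs, then reduce per key.
function mapReduce(docs, mapFn, reduceFn, query = () => true) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // "query" step: only matching documents reach the map function.
  for (const doc of docs.filter(query)) {
    mapFn.call(doc, emit); // the document is available as "this", as in MongoDB
  }
  // MongoDB does not call reduce for a key that has only one value.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

const output = mapReduce(
  trips,
  function (emit) { emit(this.id, this.fare); },           // map
  (key, values) => values.reduce((sum, v) => sum + v, 0),  // reduce
  (doc) => doc.fare >= 50                                  // query
);
console.log(output);
// [ { _id: 'cab-1', value: 600 }, { _id: 'cab-2', value: 50 } ]
```

Because map-reduce is deprecated, the same total per id is normally computed today with a $match stage followed by a $group stage using $sum.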


