Aggregation in MongoDB

Posted By :Razat Vir Singh |31st January 2023

MongoDB Aggregation

 

Using lookup, group, project and unwind to make important queries, 

 

What's an aggregation channel?

 

An Aggregation Pipeline is a series of blocks of calculation that you apply one by one to a set of documents.

 

Each channel stage performs some new calculation or manipulation on the documents to which it's passed and also passes them on to the coming stage.

 

The stages can find, sludge, join or manipulate the documents and there's a channel driver for just about everything that you want to do.

 

That being said, I would estimate that 90 of the channeling that we do consists of $match,$project, $lookup, $unwind, and $group.

 

So, let’s get into it. Match finding and filtering documents

 

The match does what it says. It passes only the documents that match the query that's used onto the coming stage in the channel. This works the same as the sludge query that you pass to MongoDB’s discovery() system.

 

There's a whole range of aggregation expressions that can be used to make your$ match more flexible, for exp.

 

{

  $match: {

    product-type: {

      $in: ['shoe', 'shirt']

    }

  }

}

 

Project re-shaping documents

 

project is a way of re-shaping the documents that you have at a particular stage in the channel. protuberance doesn’t sludge or find any documents, so there will be the same number of documents in the channel ahead and after this stage, but the documents will look different.

 

You can brand/ remove fields, or produce new advised fields. This is great for simplifying the documents and making sure that you have only the data that you need.

 

You're also suitable to elect the fields that you do want or the field that you don’t want by using 0 or 1 as the protuberance value. Note, when using 1 to only keep fields, it will always be kept unless you specify else!

 

{

  $project: {

    created: 1,

    products: 1

  }

}

 

Keeps everything except the created and products fields of the input croakers.

{

  $project: {

    created: 0,

    products: 0

  }

}

 

Lookup costing from different collections

 

The capability to look up documents in other collections is one of the most important aspects of aggregation.

 

Let’s say that you have a collection of orders, and you want to see the information about the products relating to each order. Using regular query functions for this is extremely hamstrung because you would need to run a query for every order to get its product information. It's far more effective to get all the information using one aggregation query

 

{

  $lookup: {

    ** the collection you want to get docs from

    from: 'Product',

    ** The new field in the local docs

    ** with what the lookup finds (can be anything!)

    as: 'products', 

    ** the field on the current collection

    ** used in the search

    local field: 'productIds',

    ** the field on the collection you're

    ** searching to match the 'localField'

    foreignField: '_id', 

  }

}

 

The$ lookup adds a new field( products) containing an array of documents where the specified localField and foreignField are matching.

 

Unwind breaking out of arrays

 

The$ unwind driver takes an array field and makes a set of identical documents, one for every element in the array.

 

Using$ unwind with$ lookup

 

It's common to see$ lookup used in confluence with the$ unwind channel driver.$ unwind takes an array property on a document and turns it into a new document for every element in the array.

 

Let’s say that you have a collection of products, and you want to find the Suppliers of the products in the collection from the Supplier collection

 

Product Documents

{ _id: 1, supplierId: 1 },

{ _id: 2, supplierId: 4 },

{ _id: 3, supplierId: 2 },

You can use $lookup to find the supplier like this:

{

  $lookup: {

    from: 'Supplier',

    as: 'supplier',

    localField: 'supplierId',

    foreignField: '_id'

  }

},

 

The problem then's that a lookup returns an array of documents

Product documents with$ lookup'd suppliers

{

  _id: 1,

  supplierId: 1,

  supplier: [{

    _id: 1, name: 'Rahul'

  }]

},

{

  _id: 2,

  supplierId: 4,

  supplier: [{

    _id: 4, name: 'Sumit'

  }]

},

{

  _id: 3,

  supplierId: 2,

  supplier: [{

    _id: 2, name: 'Manish'

  }]

},

Because we know that, it is a unique field we also know that the array that's created by the lookup will only ever have one element, so we can unwind the documents.

{ $unwind: '$supplier' },

 

Note:  You need to prefix the field that you want to unwind with a ‘$ ’

This rolls out the arrays and leaves us with what we wanted

Added up product documents with suppliers

{

  _id: 1,

  supplierId: 1,

  supplier: {

    _id: 1, name: 'Rahul'

  }

},

{

  _id: 2,

  supplierId: 4,

  supplier: {

    _id: 4, name: 'Sumit'

  }

},

{

  _id: 3,

  supplierId: 2,

  supplier: {

    _id: 2, name: 'Manish'

  }

},

Group collecting documents into groups

You can use $group to bunch documents together grounded on a field value that's common to all of the documents. It can also be used for useful effects like casting all of the values of a specific field.

 

{

  $group: {

    _id: 'productType',

    count: { $sum : 1 }

  }

}

 

Sum the retail prices for all documents in the channel

{

  $group: {

    _id: null,

    totalRetailPrice: { $sum: '$retailPrice' }

  }

}

NOTE:  If you want to group all documents into the channel at the point the group is executed into one document, also you can set the group, id to null

The group only passes along the values that you specify within the stage. In the former exemplifications, the only particulars on the documents after the group stage would be, id and count in the first, or, id and maxReference in the alternate.

If you want to keep the other field on the object you need to decide how the group should deal with them. There are several ways to accumulate the fields together

 

{

  $group: {

    _id: 'productType',

    ** Keep the highest value

    maxReference: { $max: '$reference' },

    ** Make an array of distinct values for material

    materials: { $addToSet: '$material' },

    ** Keep the last value (useful if documents are sorted)

    mostRecentlyCreated: { $last: '$created' },

  }

}

 

Conclusion

This is just the launch, with the colorful drivers, expressions, accumulators, and all the other tools that MongoDB provides you can recoup your data on your terms.


 


About Author

Razat Vir Singh

Razatvir Singh is an experienced MEAN stack developer with a diverse skill set and practical experience in using the latest tools and frameworks. His expertise includes Angular, Node.js, HTML, CSS, JavaScript, jQuery, and databases such as MongoDB and Opensearch. He has worked on various projects, such as E-Cart, a fully functional e-commerce website, MS-Excel Clone Project, a project where data is fetched from a popular sports website called "CricInfo.com" using Node.js packages and JavaScript, Konfer Project and he has also worked on other projects like Automation and Prediction of Stock Prices using Machine Learning. With his diverse skill set and expertise in the latest technologies, Razatvir is well-equipped to take on any MEAN stack development project and deliver high-quality solutions that meet the client's requirements.

Request For Proposal

[contact-form-7 404 "Not Found"]

Ready to innovate ? Let's get in touch

Chat With Us