The aggregation pipeline in MongoDB is a powerful framework for data processing and transformation. It is modeled on the concept of data processing pipelines, where documents pass through a sequence of stages, each performing a specific operation. Understanding each stage is essential for effectively leveraging the full potential of the aggregation framework.
| Stage | Description |
|---|---|
| $match | Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage. |
| $group | Groups input documents by the specified _id expression and for each distinct grouping, outputs a document. |
| $project | Passes along the documents with only the specified fields to the next stage. |
| $sort | Sorts all input documents and returns them to the pipeline in sorted order. |
| $limit | Limits the number of documents passed to the next stage in the pipeline. |
| $skip | Skips the first N documents passed to the next stage in the pipeline. |
| $unwind | Deconstructs an array field from the input documents to output a document for each element. |
| $lookup | Performs a left outer join to a collection in the same database to filter in documents from the “joined” collection for processing. |
| $out | Writes the resulting documents of the aggregation pipeline to a collection. |
| $count | Returns a count of the number of documents at this stage of the pipeline. |
First, let's insert some documents into the inventory collection to work with:
db.inventory.insertMany([ { item: "apple", qty: 10, category: "fruit" }, { item: "carrot", qty: 20, category: "vegetable" }, { item: "banana", qty: 5, category: "fruit" }, { item: "broccoli", qty: 15, category: "vegetable" }, { item: "grape", qty: 8, category: "fruit" } ])
The $match stage filters documents based on a specified condition. This stage limits the number of documents passed to the next stage of the pipeline by passing only those documents that match the condition.
{ $match: { <field>: <value> } }
db.inventory.aggregate([ { $match: { category: "fruit" } } ])
This command passes along only the documents whose category is "fruit". With the sample data, those are the apple, banana, and grape documents.
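A condition can also combine fields and use query operators instead of plain equality. The sketch below is a plain JavaScript model, not MongoDB itself: it assumes an in-memory copy of the sample documents so it runs without a server, and the names `sampleDocs`, `matchPipeline`, and `matched` are illustrative only.

```javascript
// In-memory copy of the sample documents inserted above (assumption: no live database).
const sampleDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh:
// fruit items with a quantity of at least 8.
const matchPipeline = [{ $match: { category: "fruit", qty: { $gte: 8 } } }];

// Plain-JS model of the same filter, for illustration only.
const matched = sampleDocs.filter((d) => d.category === "fruit" && d.qty >= 8);
console.log(matched.map((d) => d.item)); // [ 'apple', 'grape' ]
```

Placing `$match` as early as possible in a pipeline reduces the number of documents that later stages have to process.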
The $group stage groups documents by a specified expression and outputs one document per group. The output documents can contain computed fields that hold the result of an aggregation operation on the documents in each group.
{ $group: { _id: <expression>, <field>: { <accumulator>: <expression> } } }
<accumulator> is an accumulator operator such as $sum, $avg, or $max.
db.inventory.aggregate([ { $group: { _id: "$category", totalQty: { $sum: "$qty" } } } ])
This command groups documents by the category field and calculates the total quantity (totalQty) for each category. With the sample data above, the fruit group totals 23 and the vegetable group totals 35.
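A single `$group` stage can hold several accumulators at once. As a rough sketch, the pipeline below adds an average and a document count next to the total, followed by a plain JavaScript model of the same computation. The in-memory array and the names `groupDocs`, `groupPipeline`, and `grouped` are assumptions made so the snippet runs without a database.

```javascript
// In-memory copy of the sample documents (assumption: no live database).
const groupDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh.
const groupPipeline = [
  {
    $group: {
      _id: "$category",
      totalQty: { $sum: "$qty" }, // sum of qty per category
      avgQty: { $avg: "$qty" },   // mean qty per category
      count: { $sum: 1 },         // number of documents per category
    },
  },
];

// Plain-JS model: accumulate one entry per distinct category.
const byCategory = new Map();
for (const { category, qty } of groupDocs) {
  const g = byCategory.get(category) ?? { _id: category, totalQty: 0, count: 0 };
  g.totalQty += qty;
  g.count += 1;
  byCategory.set(category, g);
}
const grouped = [...byCategory.values()].map((g) => ({
  ...g,
  avgQty: g.totalQty / g.count,
}));
console.log(grouped);
```

Note that `$group` does not guarantee the order of its output documents, so add a `$sort` stage afterwards if the order matters.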
The $project stage reshapes each document in the stream, such as by adding new fields or removing existing fields.
{ $project: { <field1>: <include|exclude>, <field2>: <include|exclude>, ... } }
db.inventory.aggregate([ { $project: { item: 1, qty: 1, _id: 0 } } ])
This command includes only the item and qty fields in the output documents and excludes the _id field. This reshapes the documents to show only the item names and their quantities.
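Besides including and excluding fields, `$project` can add a field computed from an expression. The following sketch flags items with fewer than 10 units; `lowStock` is a hypothetical field name chosen for this example, and the JavaScript below the pipeline only models the result on an in-memory copy of the sample data.

```javascript
// In-memory copy of the sample documents (assumption: no live database).
const projectDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh.
// lowStock is a new, computed field: true when qty is below 10.
const projectPipeline = [
  { $project: { _id: 0, item: 1, lowStock: { $lt: ["$qty", 10] } } },
];

// Plain-JS model of the reshaped documents.
const projected = projectDocs.map(({ item, qty }) => ({
  item,
  lowStock: qty < 10,
}));
console.log(projected.filter((d) => d.lowStock).map((d) => d.item)); // [ 'banana', 'grape' ]
```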
The $sort stage sorts documents by a specified field in ascending or descending order.
{ $sort: { <field>: <1|-1> } }
db.inventory.aggregate([ { $sort: { qty: -1 } } ])
This command sorts the documents by the qty field in descending order. Items with the highest quantity will appear first.
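`$sort` also accepts more than one field, in which case later fields break ties left by earlier ones. The sketch below orders by category ascending and then by qty descending, with a plain JavaScript comparator modeling the same ordering on an in-memory copy of the sample data (the variable names are illustrative).

```javascript
// In-memory copy of the sample documents (assumption: no live database).
const sortDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh.
const sortPipeline = [{ $sort: { category: 1, qty: -1 } }];

// Plain-JS model: compare category first, then fall back to qty (descending).
const sorted = [...sortDocs].sort((a, b) => {
  if (a.category !== b.category) return a.category < b.category ? -1 : 1;
  return b.qty - a.qty;
});
console.log(sorted.map((d) => d.item));
// [ 'apple', 'grape', 'banana', 'carrot', 'broccoli' ]
```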
The $limit stage limits the number of documents passed to the next stage.
{ $limit: <number> }
db.inventory.aggregate([ { $limit: 3 } ])
This command limits the output to the first 3 documents. This is useful for scenarios where you need only a subset of the results.
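On its own, `$limit` returns whichever documents arrive first, so it is commonly placed after `$sort` to get a "top N" result. A minimal sketch of that pattern, assuming an in-memory copy of the sample data and illustrative variable names:

```javascript
// In-memory copy of the sample documents (assumption: no live database).
const limitDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh:
// the three items with the highest quantity.
const topThreePipeline = [{ $sort: { qty: -1 } }, { $limit: 3 }];

// Plain-JS model: sort a copy descending by qty, then keep the first three.
const topThree = [...limitDocs].sort((a, b) => b.qty - a.qty).slice(0, 3);
console.log(topThree.map((d) => d.item)); // [ 'carrot', 'broccoli', 'apple' ]
```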
The $skip stage skips a specified number of documents before passing the remaining documents to the next stage.
{ $skip: <number> }
db.inventory.aggregate([ { $skip: 2 } ])
This command skips the first 2 documents and passes the remaining documents to the next stage. This is useful for pagination, usually after a $sort so that the skipped documents are predictable.
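One way to sketch pagination is a small helper that builds the `$sort`, `$skip`, and `$limit` stages for a given page. `pagePipeline` is a hypothetical helper, not part of MongoDB, and the slice below it only models the result on an in-memory copy of the sample data.

```javascript
// In-memory copy of the sample documents (assumption: no live database).
const pageDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "carrot", qty: 20, category: "vegetable" },
  { item: "banana", qty: 5, category: "fruit" },
  { item: "broccoli", qty: 15, category: "vegetable" },
  { item: "grape", qty: 8, category: "fruit" },
];

// Hypothetical helper: builds the stages for a 1-based page number.
function pagePipeline(page, pageSize) {
  return [
    { $sort: { item: 1 } },               // stable, predictable order first
    { $skip: (page - 1) * pageSize },     // drop the earlier pages
    { $limit: pageSize },                 // keep one page of results
  ];
}

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh.
const pageTwoPipeline = pagePipeline(2, 2);

// Plain-JS model of page 2 with two items per page.
const skip = pageTwoPipeline[1].$skip;
const limit = pageTwoPipeline[2].$limit;
const pageTwo = [...pageDocs]
  .sort((a, b) => (a.item < b.item ? -1 : a.item > b.item ? 1 : 0))
  .slice(skip, skip + limit);
console.log(pageTwo.map((d) => d.item)); // [ 'broccoli', 'carrot' ]
```

Because `$skip` still has to walk past the skipped documents, large offsets get slower; this sketch does not address that.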
The $unwind stage deconstructs an array field from the input documents to output a document for each element.
{ $unwind: "$<arrayField>" }
db.inventory.insertMany([ { item: "fruit basket", contents: ["apple", "banana", "grape"], qty: 3 }, { item: "vegetable basket", contents: ["carrot", "broccoli"], qty: 2 } ])
db.inventory.aggregate([ { $unwind: "$contents" } ])
This command deconstructs the contents array field from the inventory documents, outputting a document for each element in the array. For example, the document { item: "fruit basket", contents: ["apple", "banana", "grape"], qty: 3 } would be transformed into three documents, one for each fruit in the basket.
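By default, `$unwind` produces no output for documents where the array field is missing, null, or an empty array, so the earlier sample items (which have no contents field) would not appear in the result. The plain JavaScript model below illustrates this on a small in-memory array; it is a simplified sketch with illustrative names and does not cover non-array values, which MongoDB treats as a single-element array.

```javascript
// In-memory documents: one item without a contents field and the two baskets
// (assumption: no live database).
const basketDocs = [
  { item: "apple", qty: 10, category: "fruit" },
  { item: "fruit basket", contents: ["apple", "banana", "grape"], qty: 3 },
  { item: "vegetable basket", contents: ["carrot", "broccoli"], qty: 2 },
];

// Pipeline you could pass to db.inventory.aggregate(...) in mongosh.
const unwindPipeline = [{ $unwind: "$contents" }];

// Plain-JS model: one output document per array element;
// documents without an array in contents yield nothing.
const unwound = basketDocs.flatMap((doc) =>
  Array.isArray(doc.contents)
    ? doc.contents.map((element) => ({ ...doc, contents: element }))
    : []
);
console.log(unwound.length); // 5
console.log(unwound[0]); // { item: 'fruit basket', contents: 'apple', qty: 3 }
```

To keep documents that lack the array, the stage also has a document form: `{ $unwind: { path: "$contents", preserveNullAndEmptyArrays: true } }`.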