0% completed
The aggregation pipeline in MongoDB is a powerful framework for data processing and transformation. It is modeled on the concept of data processing pipelines, where documents pass through a sequence of stages, each performing a specific operation. Understanding each stage is essential for effectively leveraging the full potential of the aggregation framework.
Stage | Description |
---|---|
$match | Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage. |
$group | Groups input documents by the specified _id expression and for each distinct grouping, outputs a document. |
$project | Passes along the documents with only the specified fields to the next stage. |
$sort | Sorts all input documents and returns them to the pipeline in sorted order. |
$limit | Limits the number of documents passed to the next stage in the pipeline. |
$skip | Skips the first N documents passed to the next stage in the pipeline. |
$unwind | Deconstructs an array field from the input documents to output a document for each element. |
$lookup | Performs a left outer join to a collection in the same database to filter in documents from the “joined” collection for processing. |
$out | Writes the resulting documents of the aggregation pipeline to a collection. |
$count | Returns a count of the number of documents at this stage of the pipeline. |
First, let's insert some documents into the inventory
collection to work with:
db.inventory.insertMany([ { item: "apple", qty: 10, category: "fruit" }, { item: "carrot", qty: 20, category: "vegetable" }, { item: "banana", qty: 5, category: "fruit" }, { item: "broccoli", qty: 15, category: "vegetable" }, { item: "grape", qty: 8, category: "fruit" } ])
The $match
stage filters documents based on a specified condition. This stage limits the number of documents passed to the next stage of the pipeline by passing only those documents that match the condition.
{ $match: { <field>: <value> } }
db.inventory.aggregate([ { $match: { category: "fruit" } } ])
This command filters the documents to include only those where the category
is "fruit". This limits the results to only the fruit items.
The $group
stage groups documents by a specified expression and outputs one document per group. The output documents can contain computed fields that hold the result of an aggregation operation on the documents in each group.
{ $group: { _id: <expression>, <field>: { <accumulator>: <expression> } } }
$sum
, $avg
, $max
).db.inventory.aggregate([ { $group: { _id: "$category", totalQty: { $sum: "$qty" } } } ])
This command groups documents by the category
field and calculates the total quantity (totalQty
) for each category. For example, it sums the quantity of all fruit and vegetable items separately.
The $project
stage reshapes each document in the stream, such as by adding new fields or removing existing fields.
{ $project: { <field1>: <include|exclude>, <field2>: <include|exclude>, ... } }
db.inventory.aggregate([ { $project: { item: 1, qty: 1, _id: 0 } } ])
This command includes only the item
and qty
fields in the output documents and excludes the _id
field. This reshapes the documents to show only the item names and their quantities.
The $sort
stage sorts documents by a specified field in ascending or descending order.
{ $sort: { <field>: <1|-1> } }
db.inventory.aggregate([ { $sort: { qty: -1 } } ])
This command sorts the documents by the qty
field in descending order. Items with the highest quantity will appear first.
The $limit
stage limits the number of documents passed to the next stage.
{ $limit: <number> }
db.inventory.aggregate([ { $limit: 3 } ])
This command limits the output to the first 3 documents. This is useful for scenarios where you need only a subset of the results.
The $skip
stage skips a specified number of documents before passing the remaining documents to the next stage.
{ $skip: <number> }
db.inventory.aggregate([ { $skip: 2 } ])
This command skips the first 2 documents and passes the remaining documents to the next stage. This is useful for pagination purposes.
The $unwind
stage deconstructs an array field from the input documents to output a document for each element.
{ $unwind: "$<arrayField>" }
db.inventory.insertMany([ { item: "fruit basket", contents: ["apple", "banana", "grape"], qty: 3 }, { item: "vegetable basket", contents: ["carrot", "broccoli"], qty: 2 } ]) db.inventory.aggregate([ { $unwind: "$contents" } ])
This command deconstructs the contents
array field from the inventory
documents, outputting a document for each element in the array. For example, the document { item: "fruit basket", contents: ["apple", "banana", "grape"], qty: 3 }
would be transformed into three documents, one for each fruit in the basket.
.....
.....
.....