Learning MongoDB

MongoDB Map-Reduce

Map-Reduce is a data processing paradigm used for condensing large volumes of data into aggregated results. In MongoDB, Map-Reduce allows you to perform aggregation operations by defining a map function to process each document and a reduce function to combine the output of the map operation.

However, Map-Reduce is generally not recommended for new projects: it is more complex to write and usually slower than the aggregation pipeline. MongoDB has also deprecated Map-Reduce as of version 5.0.

When to Use Map-Reduce

Map-Reduce can be used in the following scenarios:

  • Complex Aggregations: When you need to perform complex aggregations that are difficult to achieve with the aggregation framework.
  • Custom Data Processing: When you require custom data processing logic that goes beyond the built-in operators of the aggregation pipeline.
  • Large Data Sets: When working with large data sets that require distributed processing.

Despite these use cases, the aggregation pipeline is typically preferred for new applications due to its performance advantages.

Example Setup

First, let's insert some documents into the orders collection to work with:

db.orders.insertMany([
  { order_id: 1, product: "apple", quantity: 10, price: 1.0, date: new Date("2023-01-01") },
  { order_id: 2, product: "banana", quantity: 5, price: 0.5, date: new Date("2023-01-02") },
  { order_id: 3, product: "orange", quantity: 8, price: 1.5, date: new Date("2023-01-03") },
  { order_id: 4, product: "apple", quantity: 15, price: 1.0, date: new Date("2023-01-04") },
  { order_id: 5, product: "banana", quantity: 7, price: 0.5, date: new Date("2023-01-05") }
])

Example 1: Basic Map-Reduce

Calculate the total quantity of each product sold.

Map Function

var mapFunction = function() {
  emit(this.product, this.quantity);
};

Explanation

  • var mapFunction = function() { ... };: Defines the map function.
  • emit(this.product, this.quantity);: Emits a key-value pair where the key is the product and the value is the quantity for each document.

Reduce Function

var reduceFunction = function(key, values) {
  return Array.sum(values);
};

Explanation

  • var reduceFunction = function(key, values) { ... };: Defines the reduce function.
  • return Array.sum(values);: Sums the values (quantities) for each key (product).
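To make the two phases concrete, here is a minimal sketch that imitates them in plain JavaScript, outside MongoDB. It is an illustration under assumptions, not how the server implements Map-Reduce: simulateMapReduce is a hypothetical helper, emit is a stand-in for the function MongoDB provides inside map functions, and Array.sum (a MongoDB shell helper) is replaced with the standard reduce() method.

```javascript
// Sample data mirroring the orders collection (only the fields used here).
const orders = [
  { product: "apple", quantity: 10 },
  { product: "banana", quantity: 5 },
  { product: "orange", quantity: 8 },
  { product: "apple", quantity: 15 },
  { product: "banana", quantity: 7 },
];

// Stand-in for MongoDB's emit(); simulateMapReduce rebinds it on each run.
let emit;

const mapFunction = function () {
  emit(this.product, this.quantity);
};

// Array.sum is a MongoDB shell helper, so plain JavaScript uses reduce().
const reduceFunction = function (key, values) {
  return values.reduce((total, value) => total + value, 0);
};

// Hypothetical helper: a simplified, in-memory imitation of mapReduce.
function simulateMapReduce(docs, map, reduce) {
  const valuesByKey = new Map();
  emit = (key, value) => {
    if (!valuesByKey.has(key)) valuesByKey.set(key, []);
    valuesByKey.get(key).push(value);
  };
  // Map phase: run map once per document, with `this` bound to the document.
  docs.forEach((doc) => map.call(doc));
  // Reduce phase: one output document per key. As in MongoDB, reduce is
  // skipped for keys that received only a single value.
  return Array.from(valuesByKey, ([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

const totals = simulateMapReduce(orders, mapFunction, reduceFunction);
console.log(totals);
```

After the map phase, the grouped values are apple: [10, 15], banana: [5, 7], and orange: [8]; the reduce phase then collapses each list into a single total.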

Map-Reduce Command

db.orders.mapReduce(
  mapFunction,
  reduceFunction,
  { out: "total_quantity_per_product" }
)

Explanation

  • db.orders.mapReduce(mapFunction, reduceFunction, { out: "total_quantity_per_product" });: Executes the Map-Reduce operation on the orders collection, using the defined map and reduce functions, and outputs the results to the total_quantity_per_product collection.

This command results in documents like:

{ "_id": "apple", "value": 25 }
{ "_id": "banana", "value": 12 }
{ "_id": "orange", "value": 8 }
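For comparison, the same totals can be produced with the aggregation pipeline. The sketch below only builds the pipeline as a plain array so it can be inspected without a database connection; the variable name totalQuantityPipeline is an assumption made for illustration.

```javascript
// Pipeline stages are plain objects, so the pipeline can be defined up front.
const totalQuantityPipeline = [
  // Group by product and sum the quantity field into "value".
  { $group: { _id: "$product", value: { $sum: "$quantity" } } },
  // Write the results to a collection, replacing it if it already exists.
  { $out: "total_quantity_per_product" },
];

// In the MongoDB shell, run it with:
//   db.orders.aggregate(totalQuantityPipeline)
console.log(JSON.stringify(totalQuantityPipeline, null, 2));
```

The $group stage does the work of both the map and reduce functions: _id plays the role of the emitted key, and $sum replaces the reduce step.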

Example 2: Map-Reduce with Finalize Function

Calculate the average quantity sold per order for each product.

Map Function

var mapFunction = function() {
  emit(this.product, { count: 1, quantity: this.quantity });
};

Explanation

  • var mapFunction = function() { ... };: Defines the map function.
  • emit(this.product, { count: 1, quantity: this.quantity });: Emits a key-value pair where the key is the product and the value is an object containing the count of orders (initialized to 1) and the quantity for each document.

Reduce Function

var reduceFunction = function(key, values) {
  var result = { count: 0, quantity: 0 };
  values.forEach(function(value) {
    result.count += value.count;
    result.quantity += value.quantity;
  });
  return result;
};

Explanation

  • var reduceFunction = function(key, values) { ... };: Defines the reduce function.
  • var result = { count: 0, quantity: 0 };: Initializes a result object to store the cumulative count and quantity.
  • values.forEach(function(value) { ... });: Iterates through each value associated with the key and sums the count and quantity.
  • result.count += value.count;: Adds the count from the current value to the result count.
  • result.quantity += value.quantity;: Adds the quantity from the current value to the result quantity.
  • return result;: Returns the accumulated result object.
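The reduce function returns an object with the same shape as the values emitted by the map function. This matters because MongoDB may call reduce more than once for the same key, feeding earlier results back in as input. The following sketch checks that property in plain JavaScript; the third value, c, is a hypothetical extra order that is not part of the sample data.

```javascript
const reduceFunction = function (key, values) {
  var result = { count: 0, quantity: 0 };
  values.forEach(function (value) {
    result.count += value.count;
    result.quantity += value.quantity;
  });
  return result;
};

// Values as the map function would emit them for three "apple" orders.
const a = { count: 1, quantity: 10 };
const b = { count: 1, quantity: 15 };
const c = { count: 1, quantity: 5 }; // hypothetical extra order

// Reducing everything at once...
const allAtOnce = reduceFunction("apple", [a, b, c]);
// ...must match reducing a partial result together with the remaining value.
const partial = reduceFunction("apple", [a, b]);
const inTwoSteps = reduceFunction("apple", [partial, c]);

console.log(allAtOnce);  // { count: 3, quantity: 30 }
console.log(inTwoSteps); // { count: 3, quantity: 30 }
```

This is also why the average is not computed inside reduce: an average of partial averages would not generally equal the overall average, so the division is deferred to a later step.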

Finalize Function

var finalizeFunction = function(key, reducedValue) {
  reducedValue.avgQuantity = reducedValue.quantity / reducedValue.count;
  return reducedValue;
};

Explanation

  • var finalizeFunction = function(key, reducedValue) { ... };: Defines the finalize function.
  • reducedValue.avgQuantity = reducedValue.quantity / reducedValue.count;: Calculates the average quantity per order by dividing the total quantity by the count.
  • return reducedValue;: Returns the final result object with the calculated average quantity.

Map-Reduce Command

db.orders.mapReduce(
  mapFunction,
  reduceFunction,
  {
    out: "average_quantity_per_order",
    finalize: finalizeFunction
  }
)

Explanation

  • Executes the Map-Reduce operation on the orders collection, using the defined map, reduce, and finalize functions, and outputs the results to the average_quantity_per_order collection.

This command results in documents like:

{ "_id": "apple", "value": { "count": 2, "quantity": 25, "avgQuantity": 12.5 } }
{ "_id": "banana", "value": { "count": 2, "quantity": 12, "avgQuantity": 6.0 } }
{ "_id": "orange", "value": { "count": 1, "quantity": 8, "avgQuantity": 8.0 } }
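As a point of comparison, a similar result can be sketched with the aggregation pipeline, where a single $group stage covers the map, reduce, and finalize steps. The pipeline is built as a plain array here so it can be inspected without a database; the variable name averageQuantityPipeline is an assumption made for illustration.

```javascript
const averageQuantityPipeline = [
  {
    $group: {
      _id: "$product",                     // same key the map function emits
      count: { $sum: 1 },                  // number of orders per product
      quantity: { $sum: "$quantity" },     // total quantity per product
      avgQuantity: { $avg: "$quantity" },  // replaces the finalize step
    },
  },
  // Write the results to a collection, replacing it if it already exists.
  { $out: "average_quantity_per_order" },
];

// In the MongoDB shell, run it with:
//   db.orders.aggregate(averageQuantityPipeline)
console.log(JSON.stringify(averageQuantityPipeline, null, 2));
```

One difference to keep in mind: the pipeline writes count, quantity, and avgQuantity as top-level fields, whereas Map-Reduce nests them under value.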

Map-Reduce in MongoDB provides a flexible and powerful way to perform complex data transformations and aggregations. However, it is generally recommended to use the aggregation pipeline for new applications due to its performance and simplicity.
