Hadoop - Reducer in Map-Reduce

Last Updated : 31 Jul, 2025

MapReduce is a core programming model in the Hadoop ecosystem, designed to process large datasets in parallel across distributed machines (nodes). The execution flow is divided into two major phases: Map Phase and Reduce Phase.

Hadoop programs typically consist of three main components:

  • Mapper Class: Processes input data and generates intermediate key-value pairs.
  • Reducer Class: Aggregates and processes the intermediate results.
  • Driver Class: Configures and manages the job execution.

The Reducer is the second stage of MapReduce. It takes the intermediate key-value pairs generated by the Mapper and produces the final consolidated output, which is then written to HDFS (Hadoop Distributed File System).

Workflow of Reducer in MapReduce


1. Intermediate Data (Mapper Output): The Mapper produces output in the form of (key, value) pairs.

2. Shuffle & Sort: Before passing data to the Reducer, Hadoop automatically performs two operations:

  • Shuffling: Transfers the relevant data from all Mappers to the appropriate Reducer.
  • Sorting: Groups the values based on their keys. Sorting ensures all values belonging to the same key are processed together.

Shuffling and sorting overlap for efficiency: Hadoop begins merging and sorting Mapper outputs while they are still being transferred.

3. Reduce Phase:

The Reducer receives (key, list of values) and applies user-defined computation logic such as aggregation, filtering, or summation. The output is then written back to HDFS.
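The shuffle, sort, and reduce steps above can be sketched in plain Java (a toy simulation of what the Hadoop framework does for you, with no Hadoop dependencies; the word-count data is made up for illustration):

```java
import java.util.*;

// Toy simulation of the Reduce phase in plain Java (no Hadoop):
// intermediate (key, value) pairs from hypothetical Mappers are grouped
// by key ("shuffle & sort"), then each group is summed ("reduce").
public class ReducePhaseSketch {

    // Group intermediate pairs by key, as Hadoop's shuffle & sort would;
    // the TreeMap keeps keys in sorted order, like the sort step.
    static SortedMap<String, List<Integer>> shuffleAndSort(
            List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // The user-defined reduce logic: here, sum the values of each key.
    static Map<String, Integer> reduce(SortedMap<String, List<Integer>> grouped) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            out.put(e.getKey(), sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapperOutput = List.of(
                Map.entry("hadoop", 1), Map.entry("reduce", 1),
                Map.entry("hadoop", 1), Map.entry("map", 1));
        System.out.println(reduce(shuffleAndSort(mapperOutput)));
        // {hadoop=2, map=1, reduce=1}
    }
}
```

In a real job, the grouping and sorting happen inside the framework; only the summation loop corresponds to code you write in your Reducer class.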

Example – Faculty Salary Summation

Suppose we have faculty salary data stored in a CSV file. If we want to compute the total salary per department, we can:

  • Use the department name as the key.
  • Use the salary as the value.

The Reducer will aggregate all salary values for each department and produce the final result in the format:

Dept_Name Total_Salary
CSE 750000
ECE 620000
MECH 450000
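The per-department summation the Reducer performs can be sketched in plain Java (not a real Hadoop job; the names and salary figures are made up to reproduce the table above):

```java
import java.util.*;

// Plain-Java sketch of the Reducer's aggregation: sum salaries per
// department from CSV rows of the form name,department,salary.
public class SalaryReducerSketch {

    static SortedMap<String, Long> totalSalaryByDept(List<String> csvLines) {
        SortedMap<String, Long> totals = new TreeMap<>();
        for (String line : csvLines) {
            String[] fields = line.split(",");              // name, dept, salary
            String dept = fields[1].trim();                 // department = key
            long salary = Long.parseLong(fields[2].trim()); // salary = value
            totals.merge(dept, salary, Long::sum);          // aggregate per key
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> rows = List.of(
                "asha,CSE,400000", "ravi,CSE,350000",
                "meena,ECE,620000", "arun,MECH,450000");
        totalSalaryByDept(rows)
                .forEach((dept, total) -> System.out.println(dept + " " + total));
        // CSE 750000
        // ECE 620000
        // MECH 450000
    }
}
```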

Characteristics of Reducer in MapReduce

  • Default Reducer Count: By default, Hadoop assigns one Reducer to a job. This can be configured as per requirements.
  • Key-to-Reducer Mapping: Each unique key is sent to exactly one Reducer, so all values for that key are processed together; a single Reducer may handle many keys.
  • The final output files are stored in HDFS under the job’s output directory, named part-r-00000, part-r-00001, etc. (one per Reducer), along with a _SUCCESS file that marks job completion.
  • Custom Output Filename: By default, output files follow the pattern part-r-xxxxx. You can change the prefix in the driver code:

job.getConfiguration().set("mapreduce.output.basename", "GeeksForGeeks");

Phases of Reducer

  • Shuffle: Moves Mapper output to the appropriate Reducer via HTTP.
  • Sort: Groups values belonging to the same key.
  • Reduce: Performs the actual computation (sum, average, filter, etc.).

Note: Each Reducer's output file is sorted by key (keys arrive at the Reducer in sorted order), but the output is not globally sorted across multiple Reducers by default.

Setting Number of Reducers in MapReduce

Hadoop allows users to configure the number of Reducers:

  • Using the Command Line (the older property name mapred.reduce.tasks is deprecated):

-D mapreduce.job.reduces=<number_of_reducers>

  • Using the Job object in the Driver Code:

job.setNumReduceTasks(2);

If set to 0, only the Map phase is executed (useful for Map-only jobs).
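Putting the two settings shown above in context, the relevant portion of a driver class might look like this (a configuration sketch, not standalone-runnable code; the job name is illustrative and the usual Mapper/Reducer/IO setup is assumed elsewhere):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Inside the driver's main():
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "salary-sum");  // hypothetical job name
job.setNumReduceTasks(2);                       // part-r-00000 and part-r-00001
job.getConfiguration().set("mapreduce.output.basename", "GeeksForGeeks");
```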

Best Practices for Setting Reducer Count

The number of Reducers significantly affects performance and resource utilization. Ideally, it should be tuned based on cluster size and workload:

Recommended formula:

NumReducers ≈ (0.95 or 1.75) × (Number of Nodes × Max Containers per Node)

  • 0.95 factor: slightly fewer Reducers than available slots, so all Reducers can launch immediately and complete in a single wave.
  • 1.75 factor: more Reducers than slots, so faster nodes finish their first wave of Reducers and immediately start a second, which improves load balancing at the cost of some scheduling overhead.
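A quick arithmetic check of the formula for a hypothetical cluster of 10 nodes with 4 reduce containers per node (numbers chosen purely for illustration):

```java
// Quick arithmetic check of the reducer-count rule of thumb for a
// hypothetical cluster: 10 nodes with 4 reduce containers per node.
public class ReducerCountSketch {

    static int recommendedReducers(double factor, int nodes, int containersPerNode) {
        return (int) Math.round(factor * nodes * containersPerNode);
    }

    public static void main(String[] args) {
        System.out.println(recommendedReducers(0.95, 10, 4)); // 38 (single wave)
        System.out.println(recommendedReducers(1.75, 10, 4)); // 70 (two waves)
    }
}
```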

Related Articles

  • Hadoop – Mapper in MapReduce
  • MapReduce Architecture in Hadoop
  • Combiners in MapReduce

Author: dikshantmalidev