Skip to content
geeksforgeeks
  • Tutorials
    • Python
    • Java
    • DSA
    • ML & Data Science
    • Interview Corner
    • Programming Languages
    • Web Development
    • CS Subjects
    • DevOps
    • Software and Tools
    • School Learning
    • Practice Coding Problems
  • Courses
    • DSA to Development
    • Get IBM Certification
    • Newly Launched!
      • Master Django Framework
      • Become AWS Certified
    • For Working Professionals
      • Interview 101: DSA & System Design
      • JAVA Backend Development (Live)
      • DevOps Engineering (LIVE)
      • Data Structures & Algorithms in Python
    • For Students
      • Placement Preparation Course
      • Data Science (Live)
      • Data Structure & Algorithm-Self Paced (C++/JAVA)
      • Master Competitive Programming (Live)
      • Full Stack Development with React & Node JS (Live)
    • Full Stack Development
    • Data Science Program
    • All Courses
  • Go Premium
  • DevOps Lifecycle
  • DevOps Roadmap
  • Docker Tutorial
  • Kubernetes Tutorials
  • Amazon Web Services [AWS] Tutorial
  • AZURE Tutorials
  • GCP Tutorials
  • Docker Cheat sheet
  • Kubernetes cheat sheet
  • AWS interview questions
  • Docker Interview Questions
  • Ansible Interview Questions
  • Jenkins Interview Questions
Open In App

Google File System

Last Updated : 04 Jan, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

Google Inc. developed the Google File System (GFS), a scalable distributed file system (DFS), to meet the company's growing data processing needs. GFS offers fault tolerance, dependability, scalability, availability, and performance to big networks and connected nodes. GFS is made up of a number of storage systems constructed from inexpensive commodity hardware parts. The search engine, which creates enormous volumes of data that must be kept, is only one example of how it is customized to meet Google's various data use and storage requirements.

The Google File System reduced hardware flaws while gains of commercially available servers.

GoogleFS is another name for GFS. It manages two types of data namely File metadata and File Data.

The GFS node cluster consists of a single master and several chunk servers that various client systems regularly access. On local discs, chunk servers keep data in the form of Linux files. Large (64 MB) pieces of the stored data are split up and replicated at least three times around the network. Reduced network overhead results from the greater chunk size.

Without hindering applications, GFS is made to meet Google's huge cluster requirements. Hierarchical directories with path names are used to store files. The master is in charge of managing metadata, including namespace, access control, and mapping data. The master communicates with each chunk server by timed heartbeat messages and keeps track of its status updates.

More than 1,000 nodes with 300 TB of disc storage capacity make up the largest GFS clusters. This is available for constant access by hundreds of clients.

gfs master

Components of GFS

A group of computers makes up GFS. A cluster is just a group of connected computers. There could be hundreds or even thousands of computers in each cluster. There are three basic entities included in any GFS cluster as follows:

  • GFS Clients: They can be computer programs or applications which may be used to request files. Requests may be made to access and modify already-existing files or add new files to the system.
  • GFS Master Server: It serves as the cluster's coordinator. It preserves a record of the cluster's actions in an operation log. Additionally, it keeps track of the data that describes chunks, or metadata. The chunks' place in the overall file and which files they belong to are indicated by the metadata to the master server.
  • GFS Chunk Servers: They are the GFS's workhorses. They keep 64 MB-sized file chunks. The master server does not receive any chunks from the chunk servers. Instead, they directly deliver the client the desired chunks. The GFS makes numerous copies of each chunk and stores them on various chunk servers in order to assure stability; the default is three copies. Every replica is referred to as one.

Features of GFS

  • Namespace management and locking.
  • Fault tolerance.
  • Reduced client and master interaction because of large chunk server size.
  • High availability.
  • Critical data replication.
  • Automatic and efficient data recovery.
  • High aggregate throughput.

Advantages of GFS

  1. High accessibility Data is still accessible even if a few nodes fail. (replication) Component failures are more common than not, as the saying goes.
  2. Excessive throughput. many nodes operating concurrently.
  3. Dependable storing. Data that has been corrupted can be found and duplicated.

Disadvantages of GFS

  1. Not the best fit for small files.
  2. Master may act as a bottleneck.
  3. unable to type at random.
  4. Suitable for procedures or data that are written once and only read (appended) later.

I

ifrahshaxil8
Improve
Article Tags :
  • Misc
  • Google Cloud Platform
  • DevOps
  • Google
  • Distributed System
Practice Tags :
  • Google
  • Misc

Similar Reads

    DevOps Tutorial
    DevOps is a combination of two words: "Development" and "Operations." It’s a modern approach where software developers and software operations teams work together throughout the entire software life cycle.The goals of DevOps are:Faster and continuous software releases.Reduces manual errors through a
    7 min read

    Introduction

    What is DevOps ?
    DevOps is a modern way of working in software development in which the development team (who writes the code and builds the software) and the operations team (which sets up, runs, and manages the software) work together as a single team.Before DevOps, the development and operations teams worked sepa
    10 min read
    DevOps Lifecycle
    The DevOps lifecycle is a structured approach that integrates development (Dev) and operations (Ops) teams to streamline software delivery. It focuses on collaboration, automation, and continuous feedback across key phases planning, coding, building, testing, releasing, deploying, operating, and mon
    10 min read
    The Evolution of DevOps - 3 Major Trends for Future
    DevOps is a software engineering culture and practice that aims to unify software development and operations. It is an approach to software development that emphasizes collaboration, communication, and integration between software developers and IT operations. DevOps has come a long way since its in
    7 min read

    Version Control

    Version Control Systems
    A Version Control System (VCS) is a tool used in software development and collaborative projects to track and manage changes to source code, documents, and other files. Whether you are working alone or in a team, version control helps ensure your work is safe, organized, and easy to collaborate on.
    5 min read
    Merge Strategies in Git
    In Git, merging is the process of taking the changes from one branch and combining them into another. The merge command in Git will compare the two branches and merge them if there are no conflicts. If conflicts arise, Git will ask the user to resolve them before completing the merge.Merge keeps all
    4 min read
    Which Version Control System Should I Choose?
    While building a project, you need a system wherein you can track the modifications made. That's where Version Control System comes into the picture. It came into existence in 1972 at Bell Labs. The very first VCS made was SCCS (Source Code Control System) and was available only for UNIX. When any p
    5 min read

    Continuous Integration (CI) & Continuous Deployment (CD)

    What is CI/CD?
    CI/CD stands for Continuous Integration and Continuous Delivery/Deployment. It is the practice of automating the integration of code changes from multiple developers into a single codebase. It is a software development practice where the developers commit their work frequently to the central code re
    6 min read
    Understanding Deployment Automation
    In this article we will discuss deployment automation, categories in Automated Deployment, how automation can be implemented in deployment, how it is assisting DevOps and finally the benefits and drawbacks of Deployment Automation. So, let's start exploring the topic in detail. Deployment Automation
    4 min read

    Containerization

    What is Docker?
    Have you ever wondered about the reason for creating Docker Containers in the market? Before Docker, there was a big issue faced by most developers whenever they created any code that code was working on that developer computer, but when they try to run that particular code on the server, that code
    12 min read
    What is Dockerfile Syntax?
    Pre-requsites: Docker,DockerfileA Dockerfile is a script that uses the Docker platform to generate containers automatically. It is essentially a text document that contains all the instructions that a user may use to create an image from the command line. The Docker platform is a Linux-based platfor
    5 min read
    Kubernetes - Introduction to Container Orchestration
    In this article, we will look into Container Orchestration in Kubernetes. But first, let's explore the trends that gave rise to containers, the need for container orchestration, and how that it has created the space for Kubernetes to rise to dominance and growth. The growth of technology into every
    4 min read

    Orchestration

    Kubernetes - Introduction to Container Orchestration
    In this article, we will look into Container Orchestration in Kubernetes. But first, let's explore the trends that gave rise to containers, the need for container orchestration, and how that it has created the space for Kubernetes to rise to dominance and growth. The growth of technology into every
    4 min read
    Fundamental Kubernetes Components and their role in Container Orchestration
    Kubernetes or K8s is an open-sourced container orchestration technology that is used for automating the manual processes of deploying, managing and scaling applications by the help of containers. Kubernetes was originally developed by engineers at Google and In 2015, it was donated to CNCF (Cloud Na
    12 min read
    How to Use AWS ECS to Deploy and Manage Containerized Applications?
    Containers can be deployed for applications on the AWS cloud platform. AWS has a special application for managing containerized applications. Elastic Container Service (ECS) serves this purpose. ECS is AWS's container orchestration tool which simplifies the management of containers. All the containe
    4 min read

    Infrastructure as Code (IaC)

    Infrastructure as Code (IaC)
    Infrastructure as Code (IaC) is a method of managing and provisioning IT infrastructure using code rather than manual configuration. It allows teams to automate the setup and management of their infrastructure, making it more efficient and consistent. This is particularly useful in the DevOps enviro
    6 min read
    Introduction to Terraform
    Many people wonder why we use Terraform when there are already so many Infrastructure as Code (IaC) tools out there. So, before learning Terraform, let’s understand why it was created.Terraform was made to solve some common problems with existing IaC tools. Some tools, like AWS CloudFormation, only
    15 min read
    What is AWS Cloudformation?
    Amazon Web Services(AWS) offers cloud formation as a service by which you can provision and manage complicated services offered by AWS by using the code. CloudFormation will help you to manage the infrastructure and the services in the form of a declarative way. Table of ContentIntroduction to AWS C
    14 min read

    Monitoring and Logging

    Working with Prometheus and Grafana Using Helm
    Pre-requisite: HELM Package Manager Helm is a package manager for Kubernetes that allows you to install, upgrade, and manage applications on your Kubernetes cluster. With Helm, you can define, install, and upgrade your application using a single configuration file, called a Chart. Charts are easy to
    5 min read
    Working with Monitoring and Logging Services
    Pre-requisite: Google Cloud Platform Monitoring and Logging services are essential tools for any organization that wants to ensure the reliability, performance, and security of its systems. These services allow organizations to collect and analyze data about the health and behavior of their systems,
    5 min read
    Microsoft Teams vs Slack
    Both Microsoft Teams and Slack are the communication channels used by organizations to communicate with their employees. Microsoft Teams was developed in 2017 whereas Slack was created in 2013. Microsoft Teams is mainly used in large organizations and is integrated with Office 365 enhancing the feat
    4 min read

    Security in DevOps

    What is DevSecOps: Overview and Tools
    DevSecOps methodology is an extension of the DevOps model that helps development teams to integrate security objectives very early into the lifecycle of the software development process, giving developers the team confidence to carry out several security tasks independently to protect code from adva
    10 min read
    DevOps Best Practices for Kubernetes
    DevOps is the hot topic in the market these days. DevOps is a vague term used for wide number of operations, most agreeable defination of DevOps would be that DevOps is an intersection of development and operations. Certain practices need to be followed during the application release process in DevO
    11 min read
`; $(commentSectionTemplate).insertBefore(".article--recommended"); } loadComments(); }); }); function loadComments() { if ($("iframe[id*='discuss-iframe']").length top_of_element && top_of_screen articleRecommendedTop && top_of_screen articleRecommendedBottom)) { if (!isfollowingApiCall) { isfollowingApiCall = true; setTimeout(function(){ if (loginData && loginData.isLoggedIn) { if (loginData.userName !== $('#followAuthor').val()) { is_following(); } else { $('.profileCard-profile-picture').css('background-color', '#E7E7E7'); } } else { $('.follow-btn').removeClass('hideIt'); } }, 3000); } } }); } $(".accordion-header").click(function() { var arrowIcon = $(this).find('.bottom-arrow-icon'); arrowIcon.toggleClass('rotate180'); }); }); window.isReportArticle = false; function report_article(){ if (!loginData || !loginData.isLoggedIn) { const loginModalButton = $('.login-modal-btn') if (loginModalButton.length) { loginModalButton.click(); } return; } if(!window.isReportArticle){ //to add loader $('.report-loader').addClass('spinner'); jQuery('#report_modal_content').load(gfgSiteUrl+'wp-content/themes/iconic-one/report-modal.php', { PRACTICE_API_URL: practiceAPIURL, PRACTICE_URL:practiceURL },function(responseTxt, statusTxt, xhr){ if(statusTxt == "error"){ alert("Error: " + xhr.status + ": " + xhr.statusText); } }); }else{ window.scrollTo({ top: 0, behavior: 'smooth' }); $("#report_modal_content").show(); } } function closeShareModal() { const shareOption = document.querySelector('[data-gfg-action="share-article"]'); shareOption.classList.remove("hover_share_menu"); let shareModal = document.querySelector(".hover__share-modal-container"); shareModal && shareModal.remove(); } function openShareModal() { closeShareModal(); // Remove existing modal if any let shareModal = document.querySelector(".three_dot_dropdown_share"); shareModal.appendChild(Object.assign(document.createElement("div"), { className: "hover__share-modal-container" })); document.querySelector(".hover__share-modal-container").append( Object.assign(document.createElement('div'), { className: "share__modal" }), ); document.querySelector(".share__modal").append(Object.assign(document.createElement('h1'), { className: "share__modal-heading" }, { textContent: "Share to" })); const socialOptions = ["LinkedIn", "WhatsApp","Twitter", "Copy Link"]; socialOptions.forEach((socialOption) => { const socialContainer = Object.assign(document.createElement('div'), { className: "social__container" }); const icon = Object.assign(document.createElement("div"), { className: `share__icon share__${socialOption.split(" ").join("")}-icon` }); const socialText = Object.assign(document.createElement("span"), { className: "share__option-text" }, { textContent: `${socialOption}` }); const shareLink = (socialOption === "Copy Link") ? Object.assign(document.createElement('div'), { role: "button", className: "link-container CopyLink" }) : Object.assign(document.createElement('a'), { className: "link-container" }); if (socialOption === "LinkedIn") { shareLink.setAttribute('href', `https://www.linkedin.com/sharing/share-offsite/?url=${window.location.href}`); shareLink.setAttribute('target', '_blank'); } if (socialOption === "WhatsApp") { shareLink.setAttribute('href', `https://api.whatsapp.com/send?text=${window.location.href}`); shareLink.setAttribute('target', "_blank"); } if (socialOption === "Twitter") { shareLink.setAttribute('href', `https://twitter.com/intent/tweet?url=${window.location.href}`); shareLink.setAttribute('target', "_blank"); } shareLink.append(icon, socialText); socialContainer.append(shareLink); document.querySelector(".share__modal").appendChild(socialContainer); //adding copy url functionality if(socialOption === "Copy Link") { shareLink.addEventListener("click", function() { var tempInput = document.createElement("input"); tempInput.value = window.location.href; document.body.appendChild(tempInput); tempInput.select(); tempInput.setSelectionRange(0, 99999); // For mobile devices document.execCommand('copy'); document.body.removeChild(tempInput); this.querySelector(".share__option-text").textContent = "Copied" }) } }); // document.querySelector(".hover__share-modal-container").addEventListener("mouseover", () => document.querySelector('[data-gfg-action="share-article"]').classList.add("hover_share_menu")); } function toggleLikeElementVisibility(selector, show) { document.querySelector(`.${selector}`).style.display = show ? "block" : "none"; } function closeKebabMenu(){ document.getElementById("myDropdown").classList.toggle("show"); }
geeksforgeeks-footer-logo
Corporate & Communications Address:
A-143, 7th Floor, Sovereign Corporate Tower, Sector- 136, Noida, Uttar Pradesh (201305)
Registered Address:
K 061, Tower K, Gulshan Vivante Apartment, Sector 137, Noida, Gautam Buddh Nagar, Uttar Pradesh, 201305
GFG App on Play Store GFG App on App Store
Advertise with us
  • Company
  • About Us
  • Legal
  • Privacy Policy
  • In Media
  • Contact Us
  • Advertise with us
  • GFG Corporate Solution
  • Placement Training Program
  • Languages
  • Python
  • Java
  • C++
  • PHP
  • GoLang
  • SQL
  • R Language
  • Android Tutorial
  • Tutorials Archive
  • DSA
  • DSA Tutorial
  • Basic DSA Problems
  • DSA Roadmap
  • Top 100 DSA Interview Problems
  • DSA Roadmap by Sandeep Jain
  • All Cheat Sheets
  • Data Science & ML
  • Data Science With Python
  • Data Science For Beginner
  • Machine Learning
  • ML Maths
  • Data Visualisation
  • Pandas
  • NumPy
  • NLP
  • Deep Learning
  • Web Technologies
  • HTML
  • CSS
  • JavaScript
  • TypeScript
  • ReactJS
  • NextJS
  • Bootstrap
  • Web Design
  • Python Tutorial
  • Python Programming Examples
  • Python Projects
  • Python Tkinter
  • Python Web Scraping
  • OpenCV Tutorial
  • Python Interview Question
  • Django
  • Computer Science
  • Operating Systems
  • Computer Network
  • Database Management System
  • Software Engineering
  • Digital Logic Design
  • Engineering Maths
  • Software Development
  • Software Testing
  • DevOps
  • Git
  • Linux
  • AWS
  • Docker
  • Kubernetes
  • Azure
  • GCP
  • DevOps Roadmap
  • System Design
  • High Level Design
  • Low Level Design
  • UML Diagrams
  • Interview Guide
  • Design Patterns
  • OOAD
  • System Design Bootcamp
  • Interview Questions
  • Inteview Preparation
  • Competitive Programming
  • Top DS or Algo for CP
  • Company-Wise Recruitment Process
  • Company-Wise Preparation
  • Aptitude Preparation
  • Puzzles
  • School Subjects
  • Mathematics
  • Physics
  • Chemistry
  • Biology
  • Social Science
  • English Grammar
  • Commerce
  • GeeksforGeeks Videos
  • DSA
  • Python
  • Java
  • C++
  • Web Development
  • Data Science
  • CS Subjects
@GeeksforGeeks, Sanchhaya Education Private Limited, All rights reserved
We use cookies to ensure you have the best browsing experience on our website. By using our site, you acknowledge that you have read and understood our Cookie Policy & Privacy Policy
Lightbox
Improvement
Suggest Changes
Help us improve. Share your suggestions to enhance the article. Contribute your expertise and make a difference in the GeeksforGeeks portal.
geeksforgeeks-suggest-icon
Create Improvement
Enhance the article with your expertise. Contribute to the GeeksforGeeks community and help create better learning resources for all.
geeksforgeeks-improvement-icon
Suggest Changes
min 4 words, max Words Limit:1000

Thank You!

Your suggestions are valuable to us.

What kind of Experience do you want to share?

Interview Experiences
Admission Experiences
Career Journeys
Work Experiences
Campus Experiences
Competitive Exam Experiences