In this presentation, I will introduce the Hadoop Distributed File System (HDFS), an open-source Apache distributed file system designed to run on commodity hardware.
Topics I'll cover include:
- Origins of HDFS and the Google File System (GFS)
- How a file is split into blocks before being distributed across a cluster
- NameNode and DataNode basics
- Technical architecture of HDFS
- Sample HDFS commands
- Rack Awareness
- Synchronous write pipeline
- How a client reads a file