Hadoop Tutorial: Intro to HDFS

In this presentation, I will introduce the Hadoop Distributed File System (HDFS), an open-source Apache distributed file system designed to run on commodity hardware.

Topics I'll cover include:

  • Origins of HDFS and the Google File System (GFS)
  • How a file is broken up into blocks before being distributed across a cluster
  • NameNode and DataNode basics
  • Technical architecture of HDFS
  • Sample HDFS commands
  • Rack Awareness
  • Synchronous write pipeline
  • How a client reads a file

Published October 31, 2012