Apache Hadoop is one of the most popular frameworks for processing Big Data on clusters of servers. In this three-day (optionally four-day) course, attendees will learn about the business benefits and use cases for Hadoop and its ecosystem, how to plan cluster deployment and growth, and how to install, maintain, monitor, troubleshoot, and optimize Hadoop.
Before taking this course, students should have the following skills:
- Comfort with basic Linux system administration
- Basic scripting skills
Prior knowledge of Hadoop and distributed computing is not required; both are introduced and explained in the course.
3-4 Days/Lecture & Lab
This course is designed for Hadoop administrators.
- Planning and Installation
- HDFS Operations
- Data Ingestion
- MapReduce Operations and Administration
- YARN New Architecture and New Capabilities
- Advanced Topics
- Optional Tracks