PT0063
Summary
Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. In this three-day course, attendees will learn about the business benefits and use cases for Hadoop and its ecosystem, how to plan cluster deployment and growth, and how to install, maintain, monitor, troubleshoot, and optimize Hadoop. They will also practice bulk-loading data into a cluster, become familiar with various Hadoop distributions, and practice installing and managing Hadoop ecosystem tools. The course concludes with a discussion of securing a cluster with Kerberos.
Prerequisites
Before taking this course, students should:
- Be comfortable with basic Linux system administration
- Have basic scripting skills
Prior knowledge of Hadoop and distributed computing is not required; both are introduced and explained in the course.
Duration
3 Days (Lecture & Lab)
Audience
This course is designed for developers.
Topics
- Introduction
- Planning and installation
- HDFS operations
- MapReduce operations
- Advanced topics