Introduction to Hadoop Development Instructor-Led Training Course

PT0644

Training Summary

You will learn how to use Apache Hadoop and write MapReduce programs. You will begin with a quick overview of installing Hadoop and setting it up in a cluster, and then proceed to writing data analytics programs. The course presents the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action. The course further examines related technologies such as Hive, Pig, and Apache Accumulo. Apache Accumulo is a highly scalable structured store based on Google's BigTable; it is written in Java and operates over the Hadoop Distributed File System (HDFS). Hive is data warehouse software for querying and managing large datasets. Pig is a platform for analyzing large datasets that takes advantage of parallelization. Finally, you will observe how Hadoop works in and supports cloud computing, and explore examples with Amazon Web Services and case studies.

This class is focused on the Hadoop 2.0 (pre-)release. This course is approximately 40% lecture and 60% hands-on labs.

Prerequisites

Introduction to Java - Experience developing Java with Eclipse
Introduction to Unix - Exposure to bash or tcsh shell use
Data Persistence with JPA 2 - Experience using JPA and data access

Duration

5 Days/Lecture & Lab

Course Topics

What is Hadoop?
Starting Hadoop
Components of Hadoop
Writing basic MapReduce programs
Advanced MapReduce
Programming Practices
Cookbook
Managing Hadoop
Running Hadoop in the cloud
Programming with Pig
Overview of Hadoop-Related Technologies
Case studies

GSA #GS-35F-0486W