Hadoop in Real World-Development & Deployment

PT0050
Summary
A hands-on Hadoop workshop that aims to create Hadoop power users through labs built on real-life usage patterns common in the industry. This workshop showcases:
- Real-life usage of Hadoop with Hive and Pig
- Options for the data lifecycle: capturing, analyzing, storing, and archiving data
- Various patterns of ETL into HDFS

Students do the labs on real Hadoop clusters, one per student. They learn the concepts and build high-level implementations using Hive, Pig, and MapReduce programs.

Click Stream Analysis
In this part of the lab, students run a click stream analysis workload. We simulate an online ad-serving agency tracking the performance of an ad campaign: how many views it received versus how many clicks. This lab involves:
- Exploring options for ingesting clickstream data into HDFS
- Analyzing the logs using Pig and MapReduce programs

Order Processing in a Hadoop Data Warehouse
In this lab, we simulate a scenario in which order files are processed in a Hadoop data warehouse using techniques such as the HBase bulk loader, then migrated to Hive for HQL analysis. This lab involves:
- Loading order data into Hadoop
- Bulk-loading data into HBase
- Creating external tables in Hive from HBase tables
- Running analytics on the Hive order tables
- Running MapReduce jobs to filter the order data and extract subsets such as "Delivered orders" only
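To give a feel for the views-versus-clicks comparison in the click stream lab, here is a minimal sketch of the map and reduce steps in plain Python. It is not the workshop's lab code: the log format (`timestamp,campaign_id,event`), the sample records, and all function names are assumptions made for illustration, and a real job would run on the cluster through Pig or a MapReduce program rather than in a single process.

```python
from collections import defaultdict

# Hypothetical log format (an assumption, not the lab's actual data):
# "timestamp,campaign_id,event", where event is "view" or "click".
SAMPLE_LOG = [
    "2013-01-01T10:00:00,campaign_a,view",
    "2013-01-01T10:00:05,campaign_a,view",
    "2013-01-01T10:00:09,campaign_a,click",
    "2013-01-01T10:01:00,campaign_b,view",
    "2013-01-01T10:01:02,campaign_b,click",
    "2013-01-01T10:02:00,campaign_a,view",
]


def map_event(line):
    """Map step: turn one log line into (campaign_id, (views, clicks))."""
    _timestamp, campaign, event = line.strip().split(",")
    if event == "view":
        return campaign, (1, 0)
    if event == "click":
        return campaign, (0, 1)
    return None  # skip event types we do not recognize


def reduce_counts(pairs):
    """Reduce step: sum the view and click counts for each campaign."""
    totals = defaultdict(lambda: [0, 0])
    for campaign, (views, clicks) in pairs:
        totals[campaign][0] += views
        totals[campaign][1] += clicks
    return {campaign: tuple(counts) for campaign, counts in totals.items()}


def analyze(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for pair in map(map_event, lines) if pair is not None)
    return reduce_counts(pairs)


def click_through_rates(totals):
    """Clicks divided by views per campaign (0.0 when there are no views)."""
    return {
        campaign: (clicks / views if views else 0.0)
        for campaign, (views, clicks) in totals.items()
    }


if __name__ == "__main__":
    totals = analyze(SAMPLE_LOG)
    for campaign, rate in sorted(click_through_rates(totals).items()):
        views, clicks = totals[campaign]
        print(f"{campaign}: {views} views, {clicks} clicks, CTR {rate:.2f}")
```

Keeping the per-line mapping separate from the per-campaign summation mirrors how the work would be split between mappers and reducers on a cluster.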
Prerequisites
Developers with a basic understanding of Hadoop, MapReduce, Hive, Pig, and HBase, as well as of databases and ACID transactions.
Duration
1 Day/Lecture & Lab
Audience
Developers, Database administrators, Data Analytics professionals, Data architects, Managers.
