PT9901
Summary
Data scientists build information platforms to ask and answer previously unimaginable questions. Learn how data science helps companies reduce costs, increase profits, improve products, retain customers, and identify new opportunities. This three-day course helps participants understand what data scientists do and the problems they solve. Through in-class simulations, participants apply data science methods to real-world challenges in different industries and prepare for data scientist roles in the field.
Prerequisites
Students should have proficiency in a scripting language; Python is strongly preferred, but familiarity with Perl or Ruby is sufficient.
Duration
3 Days/Lecture & Labs
Audience
This course is suitable for developers, data analysts, and statisticians with basic knowledge of the Apache Hadoop ecosystem: HDFS, MapReduce, Hadoop Streaming, and Apache Hive.
Topics
- Data Science Overview
- Use Cases
- Project Lifecycle
- Data Acquisition
- Evaluating Input Data
- Data Transformation
- Data Analysis and Statistical Methods
- Fundamentals of Machine Learning
- Recommender Overview
- Introduction to Apache Mahout
- Implementing Recommenders with Apache Mahout
- Experimentation and Evaluation
- Production Deployment and Beyond
- Conclusion