Comprehensive Apache Spark 2.3 for Machine Learning and Data Science Instructor-Led Training Course

PT15261

Training Summary

This three-day training course for data scientists and analysts teaches you how to harness Apache Spark 2.3 for large-scale data analysis, predictive modeling, and machine learning tasks. You will learn to program Spark efficiently and effectively by targeting the latest version of the platform and learning the modern approach needed to take full advantage of it. The entire course is taught hands-on, using real code and interactive examples. In addition, longer labs allow attendees to work together, applying their growing Spark knowledge to common challenges faced by organizations running complex Big Data applications in production. Both lectures and lab activities use real-world datasets, so you can practice getting Apache Spark to work well in spite of real-world challenges. You’ll also gain hands-on experience with performance tuning and troubleshooting.

Apache Spark 2 brings a suite of new features and speed improvements, but it also works differently under the hood and requires a slightly different approach to programming in order to get the most out of it. This course focuses entirely on Spark 2 and teaches you how to program for the latest version of Spark (currently Spark 2.3) in the most performant, effective, and straightforward way.

Prerequisites

There are no prerequisites for this course.

Duration

3 Days/Lecture & Lab

Audience

This course is designed for data scientists and analysts involved in predictive modeling who want to explore machine learning on data that is too large for single-machine tools.

Course Topics

Introduction

DataFrame/Dataset and SQL Analytics

Machine Learning Overview

Streaming Overview

Using Apache Spark with the ML / Predictive Analytics Process

Understanding Apache Spark Job Performance

Additional Spark ML Algorithms and Features

Integrating Apache Spark with Other Machine Learning Systems

Extending Spark ML

Apache Spark Model Deployment Patterns

Apache Spark Cluster Deployment Overview (Optional)

GSA #GS-35F-0486W