PT10273
Training Summary
This class provides a solid foundation in Apache Spark. Spark is a next-generation processing framework that can deliver a 10x to 100x performance increase over traditional MapReduce processing. In this course you will write both traditional batch-processing and streaming applications.
Prerequisites
Scala or Python experience is recommended.
Duration
4 Days/Lecture & Lab
Audience
This course is designed for developers who are tasked with writing Spark applications.
Course Topics
- Spark Basics
- The Hadoop Distributed File System
- Spark and Hadoop
- RDDs
- Running Spark on a Cluster
- Parallel Programming with Spark
- Caching and Persistence
- Writing Spark Applications
- Spark Streaming
- Common Spark Algorithms
- Improving Spark Performance