Big Data Analytics with Apache Spark 3

PT27489
Training Summary
We are living in an era of ‘big data’, and Apache Spark is a popular platform for analyzing it. This course introduces students to Apache Spark. The class is taught in Python, using the Jupyter environment, and covers the latest features in Spark version 3.
Prerequisites
Basic knowledge of Python and Jupyter notebooks is preferred but not mandatory. Python is an easy language to pick up quickly, even if you haven’t programmed in it before.
Duration
3 Days/Lecture & Lab
Audience
The audience for this class includes Developers, Data Analysts, and Business Analysts.
Course Topics
  • Spark Introduction
  • First Look at Spark
  • Spark Data Structures
  • Caching
  • DataFrames/Datasets
  • Spark SQL
  • Spark and Hadoop
  • Spark API
  • Spark ML Overview
  • Graph Processing
  • Spark Streaming
  • Bonus: Spark Performance Tuning
  • Bonus: Delta Lake (Spark 3)
