Introduction to Machine Learning with Apache Spark

This course teaches machine learning (ML) from a practical perspective; in-depth coverage of the underlying math and statistics is beyond its scope. ML is changing the world, and using it effectively requires understanding the algorithms and how to apply them. This course provides an introduction to the most popular machine learning algorithms. We use Apache Spark as our ML platform: Spark provides a scalable ML platform that makes it possible to analyze large amounts of data.
Prerequisites:
  • Good programming background
  • Familiarity with Python is a plus, but not required
  • No machine learning knowledge is assumed
  • No Spark knowledge is assumed
3 Days/Lecture & Lab
This course is designed for data analysts, software engineers, and data scientists.
Course outline:
  • Spark
  • Machine Learning (ML) Overview
  • ML in Python and Spark
  • Feature Engineering and Exploratory Data Analysis (EDA)
  • Machine Learning Concepts
  • Linear Regression
  • Logistic Regression
  • Classification: SVM (Support Vector Machines)
  • Classification: Decision Trees & Random Forests
  • Classification: Naive Bayes
  • Unsupervised Algorithms
  • Unsupervised: Clustering: K-Means
  • Unsupervised: Principal Component Analysis (PCA)
  • Recommendations
  • Final workshop (time permitting)
