Spark is a very popular Big Data processing engine, and Spark MLlib is a de facto standard for machine learning on Big Data. This course is intended for data scientists and software engineers, and it balances theory with practice. For each machine learning concept, we first discuss its foundations, applicability, and limitations. We then explain how it is implemented and used, and walk through specific use cases. The course is about 50% lecture and 50% lab work.
4 Days/Lecture & Lab
This course is designed for data scientists and software engineers.
Before taking this course, students should be familiar with programming in at least one language and be able to navigate the Linux command line. Students should also have a basic knowledge of command-line Linux editors (vi or nano).