Apache Flink is an open-source platform for distributed stream and batch data processing. This course presents the essential concepts, libraries, and techniques, in a hands-on environment, for understanding, creating, and supporting Flink and Flink-ecosystem applications.
Development experience with Linux, Java, and Hadoop is a prerequisite. Knowledge of, or experience with, implementing EAI/EII patterns is assumed. Experience with a distributed data-flow project such as NiFi is helpful, as is hands-on experience with, or comprehensive conceptual knowledge of, Spark and/or Kafka. Students new to Hadoop should first take the course “Advanced Hadoop.” Students unfamiliar with EAI/EII patterns are referred to http://www.enterpriseintegrationpatterns.com
5 Days/Lecture & Lab
The audience for this class comprises two types of software engineers. First, Java or Scala engineers with minimal knowledge of Spark and Kafka who must quickly build rigorous, extensible, enterprise-level applications on a distributed data-flow topology. Second, engineers who have worked with Java, the Spark API, and the Kafka API and who want to understand how Flink's functionality and performance complement or supersede those of Spark and Kafka. Companies such as Alibaba, Capital One, Ericsson, Netflix, and Uber consider Spark and Kafka third-generation technologies and Flink fourth-generation.
- Introduction to Flink concepts, ecosystem, use cases
- Application development with Flink
- Extending Flink into the Flink ecosystem
- DevOps, installation options, deployment and monitoring
- Performance enhancement practices with Flink and Flink ecosystem