Real-Time Data Engineering in the Cloud

PT15153
Training Summary
Real-time big data frameworks are enabling brand-new use cases, while the cloud is letting us do things cheaper and faster than ever. Together, they’re making it easier to create production real-time systems. But to handle real-time big data, you need to solve two difficult problems: how do you ingest that much data, and how will you process it? This course explores the latest real-time frameworks (both open source and managed cloud services), discusses the leading cloud providers, and explains how to choose the right one for your company. Focusing on Apache Kafka and Apache Spark, the course also demonstrates how to ingest data, process it, analyze it, and display it in real time with a dashboard.
Prerequisites
This course is designed for students with strong Java skills.
Duration
2 Days/Lecture & Lab
Audience
This course is aimed at Data Engineers, Software Engineers, Big Data Engineers, DevOps Engineers, and Software Architects.
Course Topics
Real-time Data Engineering
  • Real-time Data Pipelines
      • Real-time Technologies
      • Real-time Pipelines
      • Pros and Cons of Real-time
  • Using the Cloud
      • Cloud Providers
      • Real-time Technologies
      • Choosing a Provider
  • Ingesting Data
      • Real-time Ingestion
      • Real-time ETL
  • Kafka
      • About Kafka
      • Kafka Internals
      • Kafka API
  • Processing Data
      • Real-time Data Processing
      • Real-time Processing Technologies
  • Spark Streaming
      • Spark Streaming
      • Streaming API
      • Advanced Streaming
  • Data Products
      • Analysis of Data
      • Dashboarding
  • Conclusion