As businesses increasingly rely on applications that produce and process data in real time, data streaming has become an in-demand skill for data engineers. In the Data Streaming Nanodegree program, students learn to process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. The program prepares students for the cutting edge of data engineering as more companies look to derive live insights from data at scale.
Learn how to stream data to unlock key insights in real time.
Syllabus
- Foundations of Data Streaming
- Learn the fundamentals of stream processing, including how to work with the Apache Kafka ecosystem, data schemas, Apache Avro, Kafka Connect and REST Proxy, KSQL, and Faust Stream Processing.
- Streaming API Development and Documentation
- This course grows your expertise in the components of streaming data systems and has you build a real-time analytics application. Specifically, you will learn to identify the components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.