Apache Spark Essential Training

Apache Spark Essential Training is a free online course from LinkedIn Learning with 1-2 hours of material. It is taught in English by Ben Sullins. Upon completion of the course, you can receive an e-certificate from LinkedIn Learning.

Overview
  • Get up to speed with Spark, and discover how to leverage this powerful platform to efficiently and effectively work with big data.

Syllabus
  • Introduction
    • Welcome
    • What you should know before watching this course
    • Using the exercise files
  1. Introducing Apache Spark
    • Understanding Spark
    • Origins of Spark
    • Overview of Spark components
    • Where Spark shines
    • Overview of Databricks
    • Introduction to notebooks and PySpark
  2. Analyzing Data in Spark
    • Understanding data interfaces
    • Working with text files
    • Loading CSV data into DataFrames
    • Exploring data in DataFrames
    • Saving your results
  3. Using Spark SQL to Analyze Data
    • Creating tables
    • Querying data with Spark SQL
    • Visualizing data in Databricks notebooks
  4. Running Machine Learning Algorithms Using MLlib
    • Introduction to machine learning with Spark
    • Preparing data for machine learning
    • Building a linear regression model
    • Evaluating a linear regression model
    • Visualizing a linear regression model
  5. Real-Time Data Analysis with Spark Streaming
    • Introduction to streaming analytics
    • Streaming context setup
    • Querying streaming data
  6. Connecting BI Tools to Spark
    • Setting up Spark locally
    • Connecting Jupyter notebooks to Spark
    • Other connection options
  • Conclusion
    • Next steps