Scala Essential Training for Data Science

Scala Essential Training for Data Science is a free, comprehensive online course from LinkedIn Learning with 1-2 hours of material. It is taught in English by Dan Sullivan, and you can receive an e-certificate from LinkedIn Learning upon completion.

Overview
  • Use Scala in your data science work. Explore the Scala features most useful to data scientists, including custom functions, parallel processing, and programming Spark with Scala.

Syllabus
  • Introduction
    • Welcome
    • What you should know
    • Using the exercise files
    1. Introduction to Scala
    • The advantages of Scala for data science
    • Installing Scala
    • Scala data types
    • Scala collections
    • Scala sets
    • Scala arrays, vectors, and ranges
    • Scala maps
    • Scala expressions
    • Scala functions
    • Scala objects
    2. Parallel Processing in Scala
    • Advantages of parallel collections
    • Creating parallel collections
    • Mapping functions over parallel collections
    • Filtering parallel collections
    • When and when not to use parallel collections
    3. Using SQL in Scala
    • Installing PostgreSQL
    • Loading data into PostgreSQL
    • Connecting to PostgreSQL
    • Querying with SQL strings
    • Querying with prepared statements
    • Summary of SQL in Scala
    4. Scala and Spark RDDs
    • Introduction to Spark
    • Installing Spark
    • Getting Started with Spark RDDs
    • Mapping Functions over RDDs
    • Statistics over RDDs
    • Summary of Scala and Spark RDDs
    5. Scala and Spark DataFrames
    • Creating DataFrames
    • Grouping and filtering on DataFrames
    • Joining DataFrames
    • Working with JSON files
    • Summary of Scala and Spark DataFrames
    Conclusion
    • Review of Scala for data science