CS125x: Advanced Distributed Machine Learning with Apache Spark


CS125x: Advanced Distributed Machine Learning with Apache Spark is a free online course provided by edX, with a workload of 5-10 hours a week. The course is taught in English by Ameet Talwalkar and Jon Bates. Upon completion of the course, you can receive an e-certificate from edX.

Overview
  • Building on the core ideas presented in Distributed Machine Learning with Spark, this course covers advanced topics for training and deploying large-scale learning pipelines. You will study state-of-the-art distributed algorithms for collaborative filtering, ensemble methods (e.g., random forests), clustering and topic modeling, with a focus on model parallelism and the crucial tradeoffs between computation and communication.

    After completing this course, you will have a thorough understanding of the statistical and algorithmic principles required to develop and deploy distributed machine learning pipelines. You will also have the expertise to write efficient and scalable code in Spark, in particular using MLlib and the spark.ml package.

Tags