Machine Learning Data Lifecycle in Production

Free Online Course: Machine Learning Data Lifecycle in Production, provided by Coursera, is a 4-week online course with about 22 hours of material. The course is taught in English by Robert Crowe and is free of charge. Upon completion, you can receive an e-certificate from Coursera.

Overview
  • In the second course of the Machine Learning Engineering for Production Specialization, you will build data pipelines by gathering, cleaning, and validating datasets and assessing data quality; implement feature engineering, transformation, and selection with TensorFlow Extended to get the most predictive power out of your data; and establish the data lifecycle by leveraging data lineage and provenance metadata tools and following data evolution with enterprise data schemas.

    Understanding machine learning and deep learning concepts is essential, but if you’re looking to build an effective AI career, you need production engineering capabilities as well. Machine learning engineering for production combines the foundational concepts of machine learning with the functional expertise of modern software development and engineering roles to help you develop production-ready skills.

    Week 1: Collecting, Labeling, and Validating data
    Week 2: Feature Engineering, Transformation, and Selection
    Week 3: Data Journey and Data Storage
    Week 4: Advanced Data Labeling Methods, Data Augmentation, and Preprocessing Different Data Types

Syllabus
    • Week 1: Collecting, Labeling and Validating Data
      • This week gives a quick introduction to machine learning production systems. More concretely, you will learn to use the TensorFlow Extended (TFX) library to collect, label, and validate data so that it is production-ready.
    • Week 2: Feature Engineering, Transformation and Selection
      • Implement feature engineering, transformation, and selection with TensorFlow Extended by encoding structured and unstructured data types and addressing class imbalances.
    • Week 3: Data Journey and Data Storage
      • Understand the data journey over a production system’s lifecycle and leverage ML metadata and enterprise schemas to address quickly evolving data.
    • Week 4 (Optional): Advanced Labeling, Augmentation and Data Preprocessing
      • Combine labeled and unlabeled data to improve ML model accuracy and augment data to diversify your training set.
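
To make the Week 1 idea of validating data against a schema more concrete, here is a minimal sketch in plain Python. It is not the TFX or TensorFlow Data Validation API and is not taken from the course materials; the function names `infer_schema` and `validate`, and the sample records, are hypothetical and chosen only for illustration. It assumes a schema is simply a mapping from feature name to the Python type seen in the training data.

```python
from typing import Any


def infer_schema(records: list[dict[str, Any]]) -> dict[str, type]:
    """Infer a toy schema: feature name -> type of the first value seen."""
    schema: dict[str, type] = {}
    for record in records:
        for name, value in record.items():
            schema.setdefault(name, type(value))
    return schema


def validate(records: list[dict[str, Any]], schema: dict[str, type]) -> list[str]:
    """Return human-readable anomalies for records that deviate from the schema."""
    anomalies: list[str] = []
    for i, record in enumerate(records):
        # Check every feature the schema expects.
        for name, expected in schema.items():
            if name not in record:
                anomalies.append(f"record {i}: missing feature '{name}'")
            elif not isinstance(record[name], expected):
                actual = type(record[name]).__name__
                anomalies.append(
                    f"record {i}: '{name}' has type {actual}, expected {expected.__name__}"
                )
        # Flag features the schema has never seen.
        for name in record:
            if name not in schema:
                anomalies.append(f"record {i}: unexpected feature '{name}'")
    return anomalies


# Hypothetical training and serving data, for illustration only.
train = [{"age": 34, "country": "US"}, {"age": 51, "country": "DE"}]
serving = [{"age": "34", "country": "US"}, {"country": "FR", "device": "mobile"}]

schema = infer_schema(train)
anomalies = validate(serving, schema)
for anomaly in anomalies:
    print(anomaly)
```

Production tools go further than this sketch, for example by tracking value ranges and distribution statistics, but the basic pattern of inferring a schema from one dataset and checking another against it is the same.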