Data Mining Pipeline

Data Mining Pipeline is a free online course provided by Coursera. It runs for 4 weeks and includes 21 hours of material. The course is taught in English by Qin (Christine) Lv, and upon completion you can receive an e-certificate from Coursera.

Overview
  • This course introduces the key steps involved in the data mining pipeline, including data understanding, data preprocessing, data warehousing, data modeling, interpretation and evaluation, and real-world applications.

    Data Mining Pipeline can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.

    Course logo image courtesy of Francesco Ungaro, available on Unsplash: https://unsplash.com/photos/C89G61oKDDA

Syllabus
    • Data Mining Pipeline
      • This module provides an introduction to data mining and the data mining pipeline, including the four views of data mining and the key components of the pipeline.
    • Data Understanding
      • This module covers data understanding by identifying key data properties and applying techniques to characterize different datasets.
    • Data Preprocessing
      • This module explains why data preprocessing is needed and what techniques can be used to preprocess data.
    • Data Warehousing
      • This module covers the key characteristics of data warehousing and the techniques to support data warehousing.