Cleaning Bad Data in R

Cleaning Bad Data in R is a free online course provided by LinkedIn Learning and taught by Mike Chapple. It contains 1-2 hours of material and is taught in English. Upon completion of the course, you can receive an e-certificate from LinkedIn Learning.

Overview
  • Clean up your data in R. Learn how to identify and address data integrity issues such as missing and duplicate data, using R and the tidyverse.

Syllabus
    Introduction
    • Data is messy
    • What you need to know
    1. Missing Data
    • Types of missing data
    • Missing values
    • Missing rows
    • Aggregations and missing values
    2. Duplicated Data
    • Duplicated rows and values
    • Aggregations in the data set
    3. Formatting Data
    • Converting dates
    • Unit conversions
    • Numbers stored as text
    • Text improperly converted to numbers
    • Inconsistent spellings
    4. Outliers
    • Screening for outliers
    • Handling outliers
    • Outliers use case
    • Outliers in subgroups
    • Detecting illogical values
    5. Tidy Data
    • What is tidy data?
    • Variables, observations, and values
    • Common data problems
    • Wide vs. long data sets
    • Making wide data sets long
    • Making long data sets wide
    6. Red Flags
    • Suspicious values
    • Suspicious multiples
    Conclusion
    • What's next?