Combining and Analyzing Complex Data


Combining and Analyzing Complex Data is a free online course provided by Coursera. It runs for 4 weeks and includes 9-10 hours of material. The course is taught in English by Richard Valliant, Ph.D. Upon completion of the course, you can receive an e-certificate from Coursera.

Overview
  • In this course you will learn how to use survey weights to estimate descriptive statistics, like means and totals, and more complicated quantities like model parameters for linear and logistic regressions. Software capabilities will be covered, with R® receiving particular emphasis. The course will also cover the basics of record linkage and statistical matching, both of which are becoming more important as ways of combining data from different sources. Combining datasets raises ethical issues, which the course reviews. Informed consent may have to be obtained from persons to allow their data to be linked. You will learn how the legal requirements differ between countries.
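
As a rough illustration of what weighted estimation of a total and a mean involves, here is a minimal Python sketch. It is not taken from the course, which emphasizes R. The function names and the sample data are hypothetical, and the sketch computes point estimates only, without the standard errors that a survey analysis would also need.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of weight * value."""
    return sum(w * y for y, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimate a population mean as the weighted total over the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical sample: three respondents' incomes and their survey weights,
# where each weight is the number of population units the respondent represents.
incomes = [30000, 50000, 90000]
weights = [100, 200, 50]

total = weighted_total(incomes, weights)
mean = weighted_mean(incomes, weights)
print(total, mean)
```

An unweighted mean of the same three incomes would differ from the weighted one, which is why ignoring the weights can misstate population quantities.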

Syllabus
    • Basic Estimation
      • After completing Modules 1 and 2 of this course, you will understand how to estimate descriptive statistics, overall and for subgroups, from survey data. We will review software for estimation (R, Stata, SAS), with examples of how to estimate quantities such as means, proportions, and totals. You will also learn how to estimate parameters in linear, logistic, and other models, and learn the software options, with emphasis on R. Modules 3 and 4 discuss how you can add data from other sources to your analysis. This requires knowing about record linkage techniques and what it takes to get permission to link data.
    • Models
      • Module 2 covers how to estimate linear and logistic model parameters using survey data. After completing this module, you will understand how the methods used differ from the ones for non-survey data. We also cover the features of survey data sets that need to be accounted for when estimating standard errors of estimated model parameters.
    • Record Linkage
      • This module starts with the current debate on using more (linked) administrative records in the U.S. Federal Statistical System, and a general motivation for linking records. Several examples will be given of why it is useful to link data. Challenges of record linkage will be discussed. A brief overview of key linkage techniques is included as well.
    • Ethics
      • This module will discuss key issues in obtaining consent to record linkage. Failure to consent can lead to biased estimates. Current research examples will be given, as well as practical suggestions on how to obtain linkage consent.
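
To give a feel for what the record linkage module refers to, the sketch below links two small record sets in Python. It is an assumption-laden illustration, not one of the course's techniques, which the listing does not name: it requires an exact match on birth date and then compares names with the standard library's `difflib.SequenceMatcher`. The field names, the 0.85 threshold, and the records themselves are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(survey, admin, threshold=0.85):
    """Pair each survey record with its best-matching administrative record.

    Candidates must share the same birth date; among those, the record with
    the highest name similarity at or above the threshold is chosen.
    """
    links = []
    for s in survey:
        best, best_score = None, threshold
        for a in admin:
            if s["birth_date"] != a["birth_date"]:
                continue  # only compare records that agree on birth date
            score = name_similarity(s["name"], a["name"])
            if score >= best_score:
                best, best_score = a, score
        if best is not None:
            links.append((s["id"], best["id"]))
    return links


# Hypothetical records with a spelling variation in one name.
survey = [
    {"id": "S1", "name": "Jon Smith", "birth_date": "1980-02-14"},
    {"id": "S2", "name": "Maria Lopez", "birth_date": "1975-07-01"},
]
admin = [
    {"id": "A1", "name": "John Smith", "birth_date": "1980-02-14"},
    {"id": "A2", "name": "Mary Jones", "birth_date": "1975-07-01"},
]

links = link_records(survey, admin)
print(links)
```

Survey records that find no partner are simply left unlinked. The same thing happens to respondents who do not consent to linkage, and if those respondents differ systematically from the rest, estimates from the linked data can be biased.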
