Hadoop for Data Science Tips, Tricks, & Techniques

Free Online Course: Hadoop for Data Science Tips, Tricks, & Techniques is an online course from LinkedIn Learning with 1-2 hours of material. The course is taught in English by Ben Sullins and is free of charge. Upon completion, you can receive an e-certificate from LinkedIn Learning.

Overview
  • Get up to speed with Hadoop. Learn tips and tricks for doing data science work on this popular big data platform.

Syllabus
  • Introduction
    • Welcome
    • What you should know
    • Exercise files
    • Environment setup
  • 1. Working with Files
    • Organize files in HDFS
    • Upload files to HDFS
    • Move files in HDFS
    • Remove files in HDFS
  • 2. Connecting to Hadoop
    • Explore Hive through Beeline
    • Access Hive from Python
    • Create aggregates in Hive
    • Select partitions in Hive
  • 3. Complex Data Structures in Hive
    • Map data in Hive
    • Arrays in Hive
    • Structs in Hive
    • Create flat tables for Impala
    • Deconstruct Impala queries
  • Conclusion
    • Next steps