Big Data Analytics with Hadoop and Apache Spark

Free Online Course: Big Data Analytics with Hadoop and Apache Spark, provided by LinkedIn Learning, is an online course with 1-2 hours of material. The course is taught in English by Kumaran Ponnambalam and is free of charge. Upon completion, you can receive an e-certificate from LinkedIn Learning.

via LinkedIn Learning

1-2 hours worth of material

Self Paced

Discover how to build scalable and optimized data analytics pipelines by combining the powers of Apache Hadoop and Spark.

Syllabus

Introduction
- The combined power of Spark and Hadoop Distributed File System (HDFS)
1. Introduction and Setup
- Apache Hadoop overview
- Apache Spark overview
- Integrating Hadoop and Spark
- Setting up the environment
- Using exercise files
2. HDFS Data Modeling for Analytics
- Storage formats
- Compression
- Partitioning
- Bucketing
- Best practices for data storage
3. Data Ingestion with Spark
- Reading external files into Spark
- Writing to HDFS
- Parallel writes with partitioning
- Parallel writes with bucketing
- Best practices for ingestion
4. Data Extraction with Spark
- How Spark works
- Reading HDFS files with schema
- Reading partitioned data
- Reading bucketed data
- Best practices for data extraction
5. Optimizing Spark Processing
- Pushing down projections
- Pushing down filters
- Managing partitions
- Managing shuffling
- Improving joins
- Storing intermediate results
- Best practices for data processing
6. Use Case Project
- Problem definition
- Data loading
- Total score analytics
- Average score analytics
- Top student analytics
Conclusion
- Next steps

Rating: 4.5, based on 169 course reviews at LinkedIn Learning

Language: English
Certificate: Certificate Available
Start Date: 24th February, 2020
Price: Free Trial Available
Provider: LinkedIn Learning
Taught By: Kumaran Ponnambalam