SRE Fundamentals and Security

SRE Fundamentals and Security is a free online course provided by edX. It runs for 5 weeks and requires 2-3 hours a week. The course is taught in English by Michele Jordan and Marissa Moore. Upon completion, you can receive an e-certificate from edX.

Overview
  • Site Reliability Engineers must have the right tools and strategies to perform in a technical, fast-paced environment. IBM Cloud SRE is guided by nine competency areas that lead to the successful practice of the discipline:

    ● Applying Site Reliability Engineering principles

    ● Operations

    ● Monitoring and incident management

    ● Security and compliance

    ● Compute infrastructure

    ● Networking

    ● Storage and data management

    ● Reliability and resiliency

    ● Deployment automation

    In this first course of the three-part Professional Certificate in Site Reliability Engineering (SRE), you will focus on the first four SRE competencies:

    ● Applying Site Reliability Engineering principles

    ● Operations

    ● Monitoring and incident management

    ● Security and compliance

    NOTE: The remaining five SRE competencies are covered in Course 2: SRE Infrastructure, Resiliency and Deployment Automation.

    This course covers approximately 50% of the content you need to prepare for the “IBM Certified Professional SRE - Cloud V2” certification exam.

    If you are interested in pursuing the “IBM Certified Professional SRE - Cloud V2” certification, we recommend that you complete all three offerings of the Professional Certificate in Site Reliability Engineering (SRE) to ensure a successful certification exam experience.

Syllabus
  • Module 1: Welcome and Introduction

    You will cover the following topics:

    ● An introduction to the IBM Professional SRE role

    Module 2: SRE Fundamentals and Terminology

    You will cover the following topics:

    ● A deeper dive into the SRE role

    ● SRE principles

    ● Managing trade-offs between change, velocity, and reliability

    ● Negotiating service level objectives, service level indicators, error budgets, and the user experience

    ● IBM Cloud tools and technology across the Software Development Life Cycle

    ● Applying software engineering principles to drive reliability

    Module 3: Operations

    You will cover the following topics:

    ● Performing operational readiness reviews (ORR) on IBM Cloud

    ● Creating an ORR checklist

    ● Employing cost-optimization strategies

    ● Managing backups and recoveries on IBM Cloud

    Module 4: Monitoring

    You will cover the following topics:

    ● Monitoring overview

    ● Creating and maintaining metrics, traces, and alerts on IBM Cloud

    ● Collecting, analyzing, and managing logs on IBM Cloud

    ● Identifying key metrics for service health on IBM Cloud

    ● Using performance and availability metrics to measure the health of services on IBM Cloud

    Module 5: Incident Management

    You will cover the following topics:

    ● Managing incidents on IBM Cloud

    ● Developing a balanced action plan to mitigate future incidents

    ● Performing the post-incident review

    Module 6: Security and Compliance

    You will cover the following topics:

    ● Monitoring and managing security threats on IBM Cloud

    ● Implementing and managing security policies on IBM Cloud

    ● Implementing encryption models

    ● Managing role-based access control on IBM Cloud