Site Reliability Engineering (SRE) Foundation & Practitioner - eLearning (exam included)

1.250,00 EUR

  • 50 hours
eLearning

Our course thoroughly covers the DevOps Institute's SRE℠ curriculum, providing in-depth education on site reliability engineering and its impact on delivering and scaling high-quality services. Starting with SRE principles and methodologies, the program explores their practical application and how they can optimize your workflow and operations.

Key Features

Language

Course and material are available in English

Level

Beginner - Intermediate level

180 days access

to each Foundation & Practitioner eLearning course

30+ hours total video content

with 50 hours study time recommended

Practice

Quizzes and exam practice

Exam Included

2 official exam vouchers (Foundation & Practitioner)

Certificate

Certificate of course completion

Provided by GEL

Accredited by PeopleCert

Course Timeline

  1. SRE Principles & Practices

    Foundation Module 1

    This module provides an introduction to site reliability engineering (SRE) as a field, highlighting its distinctions from DevOps. Delve into the core principles and methodologies of SRE in this comprehensive overview.


  2. Service Level Objectives & Error Budgets

    Foundation Module 2

    This module explores service level objectives (SLOs), service levels, error budgets, and policies governing error budgets.


  3. Reducing Toil

    Foundation Module 3

    This module introduces the concept of 'toil', discusses its implications as a challenge, and explores effective strategies for its management.


  4. Monitoring & Service Level Indicators

    Foundation Module 4

    This module centers on service level indicators (SLIs), emphasizing observability and monitoring practices.


  5. SRE Tools & Automation

    Foundation Module 5

    This module examines the concept of 'automation' as defined by both SRE and DevOps. It delves into various categories of automation and their organizational structure, in addition to highlighting popular automation tools.


  6. Anti-Fragility & Learning from Failure

    Foundation Module 6

    This module explores the SRE principle of learning from failure and how it relates to anti-fragility and chaos engineering practices.


  7. Organizational Impact of SRE

    Foundation Module 7

    This module investigates the organizational management of SRE. It discusses the initial implementation of SRE, the reasons behind the widespread adoption of SRE by businesses, strategies for integrating SRE, effective incident response practices, and the importance of blameless post-mortems. Additionally, it explores the scalability of SRE implementation.


  8. SRE, Other Frameworks, Trends

    Foundation Module 8

    This module delves into the integration of SRE with prominent frameworks such as IT4IT, Agile, and ITIL 4. It also explores the evolution of SRE and its future trajectory.


  9. An Introduction to SRE Practitioner (SREP)

    Practitioner Module 0

    This module presents students with an overview of the course, highlighting its goals, objectives, study schedule, and layout. Participants will be guided through the course outline and offered supplementary resources such as a glossary, additional reading materials, diagrams, and links to access crucial SRE publications. Common queries about SRE Practitioner are addressed, followed by a quick assessment to evaluate retention of the SRE Foundation syllabus content.


  10. SRE Antipatterns

    Practitioner Module 1

    This module delves into SRE antipatterns and explores how these counterproductive behaviors can have adverse effects on a pipeline.


  11. Service Levels and Error Budgets

    Practitioner Module 2

    This module explores system boundaries and illustrates the process of defining system capabilities, as well as establishing suitable service level indicators (SLIs) and service level objectives (SLOs). Additionally, it covers measuring the baseline and delves into multi-service architecture, including the calculation and utilization of error budgets.


  12. Building Secure and Reliable Systems

    Practitioner Module 3

    This module outlines the responsibilities of a site reliability engineer in system design, emphasizing key factors related to evolving landscapes and security needs. It further explores modern methodologies, technologies, and resources for system design, including design patterns that empower SRE professionals to construct secure, robust, dependable, and scalable systems.


  13. Full-stack Observability

    Practitioner Module 4

    This module centers on the essential components of full-stack observability and the role of instrumentation in making systems more observable.


  14. Platform SRE and AIOps

    Practitioner Module 5

    This module explores the advantages of adopting a platform-centric approach in the development and management of platforms as products. It further delves into the utilization of artificial intelligence for enhancing IT operations and the strategies for AI implementation.


  15. SRE and Incident Management

    Practitioner Module 6

    This module explores the essential components of incident management within the incident command framework. It also discusses the application of the Observe, Orient, Decide, Act (OODA) loop in integrating technology, procedures, and assets for effective incident responses.


  16. Chaos Engineering

    Practitioner Module 7

    This module explores the concept of 'chaos engineering', which involves conducting experiments on a distributed system to build confidence in its resilience and adaptability under challenging conditions. It also provides insights on organizing game day drills to practice chaos engineering and debunks prevalent misconceptions surrounding the topic.

  17. Implementing SRE Practices

    Practitioner Module 8

    This module delves into the significance of Site Reliability Engineering (SRE) in enhancing operational efficiency and embracing DevOps principles to the fullest. It further explores the strategies and frameworks employed to deploy and operationalize SRE practices.

Learning Outcomes

In this course, you will:

  • Gain comprehensive knowledge to excel in the SRE Foundation & Practitioner certification exams.
  • Understand core SRE principles, methodologies, and technologies and their impact on development and operations.
  • Learn how SRE and DevOps work together to improve operational efficiency.
  • Implement SRE best practices to scale services reliably and cost-effectively.
  • Define, set, and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) in distributed environments.
  • Apply error budgets to guide decision-making and improve system reliability.
  • Master observability, monitoring, and incident response using control platforms and frameworks.
  • Explore AIOps and chaos engineering to enhance IT service efficiency and system resilience.
  • Design systems with built-in security, reliability, and performance optimization.
  • Stay updated with evolving SRE trends and continuous learning practices.

Target Audience

Prerequisites: No prerequisites are necessary for taking the exam.

This course has no mandatory enrollment requirements. However, prior familiarity with SRE and DevOps concepts will make the course material easier to follow.



Professionals in software engineering

DevOps practitioners and Site Reliability Engineers

DevOps Engineers

Site Reliability Engineers (SREs)

Exam details

SRE Foundation (SREF) exam

  • This exam comprises 40 multiple-choice questions
  • Candidates have 60 minutes to finish the exam
  • It is an open-book exam, allowing the use of provided materials only
  • To pass, candidates need to achieve a minimum score of 65%: at least 26 out of 40 questions must be answered correctly
  • The exam can be taken either online or in person under invigilation

SRE Practitioner (SREP) exam

  • This exam comprises 40 multiple-choice questions
  • Candidates have 90 minutes to complete the exam
  • It is an open-book exam, allowing the use of provided materials only
  • To pass, candidates need to achieve a minimum score of 65%: at least 26 out of 40 questions must be answered correctly
  • The exam can be taken either online or in person under invigilation

Statements

Licensing and accreditation

The Site Reliability Engineering course is provided by GEL, an ATO of PeopleCert. SRE is a registered trademark of PeopleCert. Used under license from PeopleCert. All rights reserved. AVC promotes this course on behalf of GEL.

Equity Policy

PeopleCert provides a Special Considerations Policy for exam accommodations. Candidates requiring accommodations should refer to the PeopleCert Special Considerations Policy.

Need corporate solutions or LMS integration?

Didn't find a course or program that works for your business? Need LMS integration? Write to us and we will find a solution.