- Learn about site reliability engineering (SRE), an engineering discipline that helps you sustainably achieve the appropriate level of reliability in your systems, services, and products.
In this module, you will:
- Gain a basic understanding of Site Reliability Engineering (SRE).
- Learn how to get started with this valuable operations practice.
- Learn how to manage site reliability.
After completing this module, you'll be able to:
- Describe how site reliability engineering (SRE) empowers software developers to own the ongoing daily operation of their applications in production.
- Describe how Application Insights analyzes the performance of your web application and can warn you about potential problems.
- List the processes that you can implement to monitor site reliability.
- Build a "just culture" that balances safety and accountability.
- Cloud Admin course from Dr. Majd Sakr at Carnegie Mellon University. Discover what cloud elasticity means and different ways to scale your cloud resources.
In this module, you will:
- Describe common load patterns and how they drive the need to scale
- Enumerate the strategies and considerations in scaling cloud applications
- Discuss the advantages of auto-scaling and the mechanisms used to achieve it
- Describe the importance of load balancing in cloud applications and enumerate various methods to achieve it
- List the primary benefits of serverless computing and explain the concept of serverless functions
This content is provided in partnership with Dr. Majd Sakr and Carnegie Mellon University.
- Carnegie Mellon University's Cloud Developer course. Learn how developers write programs that run on the cloud, including how to deploy them, make them fault-tolerant, balance load, scale, and deal with latency.
In this module, you will:
- Evaluate different considerations when programming applications that run on clouds
- Evaluate different considerations when deploying applications on clouds
- Compare and contrast proactive and reactive measures for fault tolerance in cloud applications
- Describe the importance of load balancing in cloud applications and enumerate various methods to achieve it
- Enumerate the strategies and considerations in scaling cloud applications
- Make the case for minimizing tail latency and discuss strategies to reduce it
- Describe the strategies to optimize the total operational cost of using cloud services
This content is provided in partnership with Dr. Majd Sakr and Carnegie Mellon University.
- Learn how to monitor your Azure VMs by using Azure Monitor to collect and analyze VM host and client metrics and logs.
- Understand which monitoring data you need to collect from your VM.
- Enable and view recommended alerts and diagnostics.
- Use Azure Monitor to collect and analyze VM host metrics data.
- Use Azure Monitor Agent to collect VM client performance metrics and event logs.
Overview
Syllabus
- Introduction to Site Reliability Engineering (SRE)
  - Introduction to Site Reliability Engineering
  - What is SRE and why does it matter?
  - SRE in context
  - Key SRE principles and practices: virtuous cycles
  - Key SRE principles and practices: The human side of SRE
  - Getting started with SRE
  - Summary
- Manage site reliability
  - Introduction
  - What is reliability engineering?
  - What is Application Insights?
  - Perform ongoing tuning to reduce meaningless alerts
  - Analyze alerts to establish a baseline
  - Blameless postmortems
  - Module assessment
  - Summary
- Scale your cloud resources with elasticity
  - Introduction
  - Compute load patterns
  - Scaling compute resources
  - Automated scaling on the cloud
  - Load balancing
  - Serverless computing
  - Summary
- Build applications on the cloud
  - Introduction
  - Programming the cloud
  - Deploy applications on the cloud
  - Build fault-tolerant cloud services
  - Load balancing
  - Scale resources
  - How to deal with tail latency
  - Economics for cloud applications
  - Summary
- Monitor your Azure virtual machines with Azure Monitor
  - Introduction
  - Monitoring for Azure VMs
  - Monitor VM host data
  - Use Metrics Explorer to view detailed host metrics
  - Collect client performance counters by using VM insights
  - Collect VM client event logs
  - Summary