ForgeSDLC
Navigate
Home
Discover ForgeSDLC (101)
Practice (201)
Master (301)

DevOps body of knowledge

This document maps the core concerns of DevOps — culture, automation, measurement, continuous delivery, observability, and incident management — to the blueprint ecosystem.

How DevOps relates to PDLC and SDLC: DevOps is a cross-cutting discipline that enables continuous delivery and operational excellence across both lifecycles. See DEVOPS-SDLC-PDLC-BRIDGE.md for the full mapping.

Practices: Deep guides for specific practices (CI/CD, IaC, observability, etc.) are in practices/.

Tooling: Framework and platform guidance is in tooling/.


1. The Three Ways

The Three Ways (from The Phoenix Project by Gene Kim) provide the philosophical foundation for DevOps:

Way Principle Practices
First Way: Flow Accelerate the flow of work from development to operations to the customer CI/CD, small batch sizes, WIP limits, automation of manual steps, trunk-based development
Second Way: Feedback Create fast, frequent feedback loops from right to left Automated testing, monitoring/alerting, A/B testing, fast rollback, telemetry in production
Third Way: Continuous Learning Create a culture of experimentation, learning from failure, and improvement Blameless postmortems, game days/chaos engineering, innovation time, shared knowledge bases

2. CALMS framework

CALMS describes the five pillars of DevOps adoption:

Pillar Description Indicators of maturity
Culture Shared responsibility between dev and ops; trust, transparency, psychological safety Blameless postmortems are standard; devs participate in on-call; cross-functional teams
Automation Eliminate manual, repetitive work through tooling and scripting CI/CD pipelines for all services; infrastructure as code; automated testing; self-service provisioning
Lean Apply lean principles — eliminate waste, small batches, continuous improvement WIP limits enforced; value stream mapping done; bottlenecks identified and addressed
Measurement Use data to drive decisions about delivery performance and system health DORA metrics tracked; SLOs defined; dashboards for pipeline health, system health, business metrics
Sharing Break down silos; share knowledge, tools, and responsibilities Runbooks maintained; cross-team code review; shared tooling; guilds/chapters for practices

3. DORA metrics

The four key metrics from the DORA (DevOps Research and Assessment) program, validated across thousands of organizations:

Metric Definition Elite High Medium Low
Deployment frequency How often code is deployed to production On-demand (multiple/day) Daily to weekly Weekly to monthly Monthly to semi-annually
Lead time for changes Time from commit to production Less than 1 hour 1 day to 1 week 1 week to 1 month 1 to 6 months
Change failure rate % of deployments causing production failure 0–15% 16–30% 16–30% 46–60%
Time to restore Time from failure to restoration Less than 1 hour Less than 1 day 1 day to 1 week More than 6 months

Fifth metric — Reliability: Added in 2021, measuring operational performance against targets (SLOs).

How to use DORA metrics

  • Baseline — measure current state before making changes
  • Trend — track improvement over time, not absolute values
  • System-level — measure per service or value stream, not per team (avoids gaming)
  • Complement with SDLC and PDLC metrics — DORA measures delivery performance; SDLC metrics measure engineering quality; PDLC metrics measure product outcomes

4. DevOps maturity model

Level Characteristics Focus areas
1 — Initial Manual processes, siloed teams, infrequent releases, long stabilization periods Basic CI, version control, automated build
2 — Managed CI pipeline, some automated testing, environments managed but not codified Extend test automation, introduce IaC, define deployment process
3 — Defined CD pipeline, IaC for all environments, monitoring in place, defined incident process Shift-left security, observability (tracing, structured logging), SLO definition
4 — Measured DORA metrics tracked, SLOs enforced, chaos engineering practiced, self-service platforms Error budgets, advanced deployment strategies, platform engineering
5 — Optimized Continuous improvement culture, fast experimentation, proactive capacity planning Innovation time, cross-org knowledge sharing, contributing to internal platforms

5. Site Reliability Engineering (SRE)

SRE (from Google) is an implementation of DevOps principles with specific practices:

SRE concept Description
SLO / SLI / SLA Service Level Objectives (internal targets), Indicators (measurements), Agreements (external contracts)
Error budgets Acceptable failure rate derived from SLO; budget remaining determines whether to prioritize features or reliability
Toil Manual, repetitive, automatable work that scales linearly with service size; SRE aims to keep toil below 50%
Capacity planning Proactive resource provisioning based on demand forecasting
Blameless postmortems Learning from incidents without blame; focus on systemic improvements

6. DevSecOps

Integrating security into the DevOps pipeline rather than treating it as a separate gate:

Practice Where in pipeline Purpose
SAST (Static Application Security Testing) Build Find vulnerabilities in source code
SCA (Software Composition Analysis) Build Identify vulnerable dependencies
DAST (Dynamic Application Security Testing) Verify Find vulnerabilities in running application
Container scanning Build / Deploy Verify container image security
Secret detection Commit / Build Prevent secrets from entering the codebase
Infrastructure compliance Deploy Verify IaC against security policies
Runtime protection Operate WAF, RASP, anomaly detection

7. Competencies

Competency Description
Systems thinking Understanding the full delivery pipeline and its interactions; seeing bottlenecks and feedback loops
Automation mindset Default to automating repetitive tasks; script-first approach to operations
Collaboration Working across team boundaries; shared ownership of delivery and operations
Incident response Structured approach to detecting, responding to, and learning from production incidents
Infrastructure knowledge Understanding cloud platforms, networking, containers, orchestration
Security awareness Integrating security practices into daily delivery work

8. External references

Topic URL Why it is linked
DORA — State of DevOps Report https://dora.dev/ Canonical research on DevOps performance metrics
Google SRE Books https://sre.google/books/ Free online SRE handbook, workbook, and security engineering
The Phoenix Project (Gene Kim) https://itrevolution.com/product/the-phoenix-project/ Narrative introduction to DevOps principles (Three Ways)
The DevOps Handbook (Kim, Humble, Debois, Willis) https://itrevolution.com/product/the-devops-handbook-second-edition/ Comprehensive DevOps practices guide
Continuous Delivery (Humble, Farley) https://continuousdelivery.com/ Foundational text on deployment pipelines and release engineering
DevOps Institute https://www.devopsinstitute.com/ Certification and competency framework

Keep project-specific CI/CD and operational documentation in docs/development/CI-CD.md and docs/operations/, not in this file.