
How DevOps Engineers Write Technical RFCs and ADRs That Get Approved

Writing RFCs and Architecture Decision Records is a senior DevOps skill that gets your proposals implemented. Here's a practical template and real examples.

DevOpsBoys · 6 min read

Being able to write clearly about technical decisions is what separates senior engineers from principal engineers. If you want your infrastructure proposals implemented, you need to write RFCs and ADRs that people actually read and approve.

RFC vs ADR — The Difference

RFC (Request for Comments): A proposal for a significant change. You're asking for feedback and approval. Used when the decision is large, has broad impact, or involves multiple teams.

ADR (Architecture Decision Record): A record of a decision that was made. Documents the context, decision, and consequences. Used for posterity — "why did we choose this?"

Think of it this way:

  • RFC = "I want to do X, here's my reasoning, please review"
  • ADR = "We decided to do X, here's why, here's what it means"

When to Write an RFC

Not every change needs an RFC. Write one when:

  • The change affects multiple teams or services
  • It's hard or expensive to reverse
  • Reasonable engineers might disagree on the approach
  • You need budget or headcount approval

Examples of RFC-worthy changes:

  • Migrating from Jenkins to GitHub Actions (affects all teams)
  • Moving from Prometheus to Datadog (cost + workflow change)
  • Adopting Kubernetes for services currently on EC2
  • Introducing an internal developer portal
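The criteria above can be read as a simple checklist. Here is a minimal sketch in Python, assuming a hypothetical `Change` record whose field names are invented for illustration. Treating a single matching criterion as sufficient is also an assumption, since the list does not say how many must apply.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    """Hypothetical description of a proposed change (field names are illustrative)."""
    title: str
    affects_multiple_teams: bool = False
    hard_to_reverse: bool = False
    contentious_approach: bool = False
    needs_budget_or_headcount: bool = False


def rfc_reasons(change: Change) -> List[str]:
    """Return the checklist criteria that this change meets."""
    checks = [
        (change.affects_multiple_teams, "affects multiple teams or services"),
        (change.hard_to_reverse, "hard or expensive to reverse"),
        (change.contentious_approach, "reasonable engineers might disagree"),
        (change.needs_budget_or_headcount, "needs budget or headcount approval"),
    ]
    return [reason for met, reason in checks if met]


def needs_rfc(change: Change) -> bool:
    """Assumption: one matching criterion is enough to justify an RFC."""
    return bool(rfc_reasons(change))


if __name__ == "__main__":
    migration = Change(
        title="Migrate from Jenkins to GitHub Actions",
        affects_multiple_teams=True,
        hard_to_reverse=True,
    )
    print(needs_rfc(migration), rfc_reasons(migration))
```

Listing the reasons, not just a yes/no answer, gives you the opening lines of the RFC's Background section.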

The RFC Template

markdown
# RFC: [Title]
 
**Author:** [Your name]  
**Date:** [Date]  
**Status:** Draft | Under Review | Approved | Rejected | Implemented  
**Reviewers:** [Names/teams who should review]
 
## Summary
 
One paragraph. What are you proposing and why?
 
## Background / Problem
 
What problem are you solving? Why does this matter now?
What are the business or technical consequences of NOT solving it?
 
## Proposal
 
What exactly are you proposing to do?
Be specific — list the changes, timeline, owners.
 
## Alternatives Considered
 
What other approaches did you evaluate?
For each: what is it, why didn't you choose it?
 
This section is critical. If you don't show alternatives,
reviewers will suggest them in comments and delay your RFC.
 
## Impact
 
- **Who is affected:** [teams, services, users]
- **Migration path:** [how existing users move to the new system]
- **Rollback plan:** [how to undo this if it goes wrong]
- **Timeline:** [rough milestones]
 
## Open Questions
 
List things you haven't decided yet or want input on.
This signals intellectual honesty and invites targeted feedback.
 
## References
 
- Related RFCs
- External documentation
- Benchmarks or data that informed this proposal
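A template only helps if drafts actually follow it. The sketch below is one possible pre-review check, not part of any standard tooling: it looks for the `##` headings and the `**Status:**` values from the template above. The function names are hypothetical, and matching headings by prefix (so that "Background" satisfies "Background / Problem") is an assumption.

```python
import re
from typing import List, Optional

# Section names and status values taken from the template above.
REQUIRED_SECTIONS = [
    "Summary",
    "Background",
    "Proposal",
    "Alternatives Considered",
    "Impact",
    "Open Questions",
    "References",
]
VALID_STATUSES = {"Draft", "Under Review", "Approved", "Rejected", "Implemented"}


def missing_sections(rfc_text: str) -> List[str]:
    """Return required sections with no matching '## ' heading in the RFC."""
    headings = [
        h.strip()
        for h in re.findall(r"^##[ \t]+(.+)$", rfc_text, flags=re.MULTILINE)
    ]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(h.startswith(section) for h in headings)
    ]


def status_of(rfc_text: str) -> Optional[str]:
    """Extract the value of the '**Status:**' line, or None if it is absent."""
    match = re.search(
        r"^\*\*Status:\*\*[ \t]*(.+?)[ \t]*$", rfc_text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def validate(rfc_text: str) -> List[str]:
    """Return human-readable problems; an empty list means the draft passes."""
    problems = [f"missing section: {s}" for s in missing_sections(rfc_text)]
    status = status_of(rfc_text)
    if status not in VALID_STATUSES:
        problems.append(f"invalid or missing status: {status!r}")
    return problems


if __name__ == "__main__":
    draft = "# RFC: Example\n\n**Status:** Draft  \n\n## Summary\n\nText.\n"
    for problem in validate(draft):
        print(problem)
```

An unfilled template fails the status check on purpose, because its status line still lists every option.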

A Real RFC Example: Migrating to GitHub Actions

markdown
# RFC: Migrate from Jenkins to GitHub Actions
 
**Author:** Rahul S.  
**Date:** 2026-06-22  
**Status:** Under Review  
**Reviewers:** Platform Team, Backend Leads, Security Team
 
## Summary
 
Propose migrating all CI/CD pipelines from Jenkins to GitHub Actions
to reduce operational overhead, improve developer experience, and 
cut infrastructure costs by ~₹4L/year.
 
## Background
 
Our Jenkins cluster (3 agents, 1 controller) requires:
- 2-4 hours/week of maintenance (updates, plugin conflicts, agent issues)
- ₹4.2L/year in EC2 costs for the Jenkins infrastructure
- Security patching — Jenkins has had 8 critical CVEs in the past 18 months
- Expertise that's concentrated in 2 engineers (bus factor risk)
 
CI flakiness also slows delivery: 23% of deployments in Q1 had
at least one CI-related delay.
 
## Proposal
 
1. Set up GitHub Actions org-wide for all new repositories (April)
2. Create shared workflow templates for build/test/deploy patterns (April)
3. Migrate 3 pilot services with volunteer teams (May)
4. Migrate all remaining services (June-July)
5. Decommission Jenkins cluster (August)
 
All pipelines will use GitHub-hosted runners for builds. Deployments
to private EKS clusters will use self-hosted runners in our VPC.
 
Secrets will move from Jenkins credentials store to GitHub encrypted secrets
and AWS Secrets Manager (via GitHub OIDC, no long-lived keys).
 
## Alternatives Considered
 
**Keep Jenkins, upgrade to latest LTS**
Pros: No migration work. Cons: Still requires dedicated ops, costs unchanged,
developer experience doesn't improve. Doesn't address the bus factor issue.
 
**Migrate to GitLab CI**
Pros: Self-hosted option, good pipeline visualization. Cons: Requires migrating
source control too (we're on GitHub). Much larger migration scope, higher risk.
 
**Migrate to CircleCI**
Pros: Good ecosystem, fast builds. Cons: External SaaS dependency,
expensive at our scale (~$1,800/month), vendor lock-in risk.
 
GitHub Actions wins on cost (included in the GitHub Team plan we already pay for),
no new vendor relationships, and native integration with our existing SCM.
 
## Impact
 
- **Affected teams:** All 8 product teams (currently using Jenkins)
- **Migration path:** Teams migrate service by service. Jenkins stays up until August,
  so both systems can run in parallel during the transition.
- **Developer impact:** YAML-based pipelines instead of Groovy.
  Initial learning curve ~1-2 days per engineer. Long-term: faster, cleaner pipelines.
- **Rollback:** Jenkins remains running until August. Any team can revert by
  removing the GitHub Actions workflow and re-enabling its Jenkins job.
 
## Open Questions
 
1. Self-hosted runner security model — should runners have access to AWS secrets
   via IAM role or fetch from Secrets Manager at runtime?
2. How do we handle the 3 Jenkins jobs that use Jenkins-specific plugins
   with no GitHub Actions equivalent?
3. Should we adopt a specific GitHub Actions workflow template library,
   or allow each team to write their own?
 
## References
 
- [GitHub Actions pricing for our plan](https://github.com/pricing)
- [GitHub OIDC with AWS](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services)
- [Jenkins to GitHub Actions migration guide](https://docs.github.com/en/actions/migrating-to-github-actions)
- Q1 incident review showing CI-related delays

Architecture Decision Records (ADRs)

ADRs are shorter and document decisions already made:

markdown
# ADR-0042: Use ArgoCD for GitOps Deployments
 
**Date:** 2026-05-15  
**Status:** Accepted  
**Deciders:** Platform Team
 
## Context
 
We needed a GitOps deployment tool that works with our Kubernetes clusters.
Options evaluated: ArgoCD, FluxCD, manual kubectl.
 
## Decision
 
We will use ArgoCD as our GitOps deployment controller across all clusters.
 
## Rationale
 
- ArgoCD has the best UI for application visualization and drift detection
- Broader industry adoption (a larger hiring pool is familiar with it)
- App of Apps pattern works well for our multi-tenant setup
- Active CNCF project with commercial support available
- FluxCD is also excellent, but ArgoCD's UI is significantly better for
  our ops team, which manages multiple clusters
 
## Consequences
 
**Positive:**
- Developers can see deployment status without kubectl access
- GitOps reduces drift between intended and actual cluster state
- Rollbacks are one click in the UI or a single git revert
 
**Negative:**
- ArgoCD is another component to maintain and patch
- SSO integration requires additional setup
- Teams need training on GitOps model (1-day investment)
 
## Alternatives Rejected
 
- **FluxCD:** Less feature-rich UI, harder for non-platform engineers to use
- **Manual kubectl:** No audit trail, no drift detection, doesn't scale
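ADRs are numbered sequentially (the example above is ADR-0042), so it helps to generate the next number instead of guessing it. Here is a minimal sketch, assuming ADR files are named `NNNN-title-slug.md`. That naming scheme and the helper names are assumptions for illustration, not something the example prescribes.

```python
import re
from typing import List


def next_adr_number(existing_filenames: List[str]) -> int:
    """Return one more than the highest numeric prefix found, or 1 if none."""
    numbers = [
        int(m.group(1))
        for name in existing_filenames
        if (m := re.match(r"(\d+)-", name))
    ]
    return max(numbers, default=0) + 1


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def adr_filename(existing_filenames: List[str], title: str) -> str:
    """Build a zero-padded filename from the next number and the title."""
    number = next_adr_number(existing_filenames)
    return f"{number:04d}-{slugify(title)}.md"


if __name__ == "__main__":
    # Hypothetical directory listing; files without a numeric prefix are ignored.
    existing = ["0040-example-decision.md", "0041-another-decision.md", "README.md"]
    print(adr_filename(existing, "Use ArgoCD for GitOps Deployments"))
```

Zero-padding keeps the files sorted in decision order in a plain directory listing.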

Getting Your RFC Approved

Address objections before they're raised. Read your RFC as a skeptic, list every objection you can think of, and answer each one in the document, usually under "Alternatives Considered."

Data > opinions. "Jenkins takes 2 hours/week to maintain" is more compelling than "Jenkins is painful."
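A weekly figure lands harder once it is annualized. This small sketch uses the 2-4 hours/week range from the example RFC. The 8-hour working day is an assumption, and the helper names are made up for illustration.

```python
from typing import Tuple

WEEKS_PER_YEAR = 52
HOURS_PER_WORKING_DAY = 8  # assumption: adjust to your team's norm


def annual_hours(low_per_week: float, high_per_week: float) -> Tuple[float, float]:
    """Scale a weekly hour range up to a yearly range."""
    return low_per_week * WEEKS_PER_YEAR, high_per_week * WEEKS_PER_YEAR


def working_days(hours: float) -> float:
    """Convert hours into working days under the assumed day length."""
    return hours / HOURS_PER_WORKING_DAY


if __name__ == "__main__":
    low, high = annual_hours(2, 4)
    print(f"Maintenance: {low}-{high} engineer-hours/year")
    print(f"That is {working_days(low):.0f}-{working_days(high):.0f} working days/year")
```

With those inputs the range works out to 104-208 engineer-hours, or 13-26 working days a year, which is an easier number to weigh against the cost of the migration.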

Offer a pilot, not a big bang. "Migrate 3 services first, then decide" reduces perceived risk.

Identify allies early. Have 1-2 technical leaders review informally before the official review. Their early buy-in signals that the RFC is credible.

Set a review deadline. "Comments due by June 30" prevents the RFC from dying in perpetual review.
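If you track RFCs in a script or bot, the deadline can be computed and checked automatically. Here is a minimal sketch using the dates from the example RFC. The 8-day review window is an assumption chosen so that an RFC opened on 2026-06-22 is due on June 30, and the function names are hypothetical.

```python
from datetime import date, timedelta


def review_deadline(opened: date, review_days: int = 8) -> date:
    """Return the date by which comments are due."""
    return opened + timedelta(days=review_days)


def is_overdue(deadline: date, today: date) -> bool:
    """True once the deadline has passed and the RFC needs a decision."""
    return today > deadline


if __name__ == "__main__":
    opened = date(2026, 6, 22)
    deadline = review_deadline(opened)
    print(f"Comments due by {deadline.strftime('%B %d')}")
    print("Overdue on July 3:", is_overdue(deadline, date(2026, 7, 3)))
```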

The engineers who move up fastest are usually not the best coders — they're the ones who can make a clear written case for technical decisions.

Resources: ADR GitHub org | RFC template by Rust team
