Terraform


📊 SRE Practice 18 min

Disaster Recovery Planning: RTO, RPO, and Building Resilient Systems

Introduction

Disaster Recovery (DR) is the set of processes, policies, and procedures for recovering and continuing technology infrastructure after a disaster. A disaster can be natural (earthquake, flood), technical (data center failure, ransomware), or human-caused (accidental deletion, security breach).

Core Principle: “Hope is not a strategy. Plan for failure before it happens.”

Key Concepts: RTO vs RPO

[Timeline diagram: a time axis marked with Disaster Occurs, Detection Time, Recovery Begins, and Normal Operations, with spans labeled Recovery Time Objective (RTO) and Data Loss (Recovery Point Objective, RPO).]

Recovery Time Objective (RTO)

Definition: Maximum acceptable time that a system can be down after a disaster. …

October 16, 2025 · 18 min · DevOps Engineer
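
To make the RTO and RPO terms from the disaster recovery excerpt above concrete, here is a minimal Python sketch that checks a single incident against both objectives. The `Incident` class, the `check_objectives` function, and all timestamps are hypothetical and chosen only for illustration. It assumes RPO is measured as the time between the last recoverable backup and the disaster, and RTO as the time from the disaster until normal operations resume.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical record of one outage (illustrative field names)."""
    last_good_backup: datetime  # most recent recoverable copy of the data
    disaster_at: datetime       # when the outage started
    restored_at: datetime       # when normal operations resumed


def check_objectives(incident: Incident, rto: timedelta, rpo: timedelta) -> dict:
    """Compare actual downtime and data-loss window with the targets."""
    downtime = incident.restored_at - incident.disaster_at
    data_loss_window = incident.disaster_at - incident.last_good_backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Example values are made up: 3 hours of downtime, 30 minutes of lost data.
incident = Incident(
    last_good_backup=datetime(2025, 1, 1, 11, 30),
    disaster_at=datetime(2025, 1, 1, 12, 0),
    restored_at=datetime(2025, 1, 1, 15, 0),
)
result = check_objectives(
    incident, rto=timedelta(hours=4), rpo=timedelta(minutes=15)
)
```

With these sample values the 4-hour RTO is met, while the 15-minute RPO is missed because the last backup was 30 minutes old. The two objectives are evaluated independently, which is why a fast recovery can still lose too much data.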

🛠️ Guide 30 min

Infrastructure as Code Best Practices: Terraform, Ansible, Kubernetes

Introduction

Infrastructure as Code (IaC) is how modern teams build reliable systems. Instead of manually clicking through cloud consoles or SSHing into servers, you define infrastructure in code that is testable, version-controlled, and repeatable. This guide shows you practical patterns for Terraform, Ansible, and Kubernetes with real examples, not just theory.

Why Infrastructure as Code?

Consider a production outage scenario:

Without IaC:
- Database server dies
- You manually recreate it through AWS console (30 minutes)
- Forgot to enable backups? Another 15 minutes
- Need to reconfigure custom security groups? More time
- Total recovery: 2-4 hours
- Risk of missing steps = still broken

With IaC: …

October 16, 2025 · 30 min · DevOps Engineer

iac terraform ansible kubernetes automation

🛠️ Guide 10 min

Layer 4 Load Balancing Guide: TCP/UDP Load Balancing for DevOps/SRE

Executive Summary

Layer 4 (Transport Layer) Load Balancing distributes traffic at the TCP/UDP level, before any application-level processing. Unlike Layer 7 (HTTP), L4 LBs don’t inspect request content. They route packets based on IP address, port, and protocol.

When to use L4:
- Raw throughput requirements (millions of requests/sec)
- Non-HTTP protocols (databases, MQTT, game servers); gRPC only as passthrough, since it runs over HTTP/2
- TLS passthrough (encrypted SNI unavailable)
- Extreme latency sensitivity

When NOT to use L4:
- HTTP/HTTPS (use Layer 7 instead)
- Request-based routing (path-based, host-based)
- Simple workloads with <1M req/sec

Fundamentals

L4 vs L7: Quick Comparison

| Aspect | Layer 4 (TCP/UDP) | Layer 7 (HTTP/HTTPS) |
| What it sees | IP/port/protocol | HTTP headers, body, cookies |
| Routing based on | Destination IP, port, protocol | Host, path, query string, cookies |
| Throughput | Very high (millions pps) | Lower (thousands rps) |
| Latency | <1ms typical | 5-50ms typical |
| Protocols | TCP, UDP, QUIC, SCTP | HTTP/1.1, HTTP/2, HTTPS, WebSocket |
| Encryption | Can passthrough TLS | Can terminate/re-encrypt |
| Best for | Databases, non-HTTP, TLS passthrough | Web apps, microservices, APIs |

Core Concepts

Listeners: Defined by (protocol, port). Example: TCP:443, UDP:5353 …

October 16, 2025 · 10 min · DevOps Engineer

load-balancing layer4 tcp udp aws
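
As a rough illustration of the Layer 4 excerpt above, where routing uses only IP, port, and protocol, the following Python sketch picks a backend by hashing a connection's 5-tuple. Flow hashing is one common strategy and is an assumption here, not something the excerpt prescribes. The `FlowKey` type, the `pick_backend` function, and the addresses are hypothetical.

```python
import hashlib
from typing import List, NamedTuple


class FlowKey(NamedTuple):
    """The only fields an L4 balancer sees (hypothetical type name)."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def pick_backend(flow: FlowKey, backends: List[str]) -> str:
    """Map a flow to a backend; the same 5-tuple always maps to the same one."""
    if not backends:
        raise ValueError("no backends available")
    # Hash the 5-tuple so packets of one connection stay on one backend.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Documentation-range addresses, made up for the example.
backends = ["10.0.1.10:5432", "10.0.1.11:5432", "10.0.1.12:5432"]
flow = FlowKey("tcp", "203.0.113.7", 51514, "198.51.100.1", 5432)
chosen = pick_backend(flow, backends)
```

Note that no request content is read at any point, which is the property that distinguishes this from Layer 7 routing. A simple modulo also remaps many flows when the backend list changes; consistent hashing is the usual refinement and is left out of this sketch.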

🛠️ Guide 22 min

Layer 7 Load Balancing Guide: Application-Level Routing for DevOps/SRE

Executive Summary

Layer 7 (Application Layer) Load Balancing routes traffic based on HTTP/HTTPS semantics: hostnames, paths, headers, cookies, and body content. Unlike Layer 4, L7 LBs inspect and understand application protocols.

When to use L7:
- HTTP/HTTPS workloads (99% of web apps)
- Host-based or path-based routing (SaaS multi-tenant)
- Advanced features: canary deployments, content-based routing
- API gateways with authentication/authorization
- WebSockets, gRPC, Server-Sent Events (SSE)

When NOT to use L7:
- Non-HTTP protocols (use L4)
- Ultra-low latency (<5ms) with extreme throughput (use L4)
- Binary protocols (databases, Kafka)

Fundamentals

L7 vs L4: What L7 Adds

| Feature | L4 | L7 |
| Visibility | IP/port/protocol | Full HTTP request/response |
| Routing based on | Destination IP, port | Host, path, headers, cookies, body |
| Request modification | None | Rewrite, redirect, compress |
| TLS | Passthrough only | Terminate + re-encrypt |
| Session affinity | IP hash (crude) | Sticky cookies, affinity headers |
| Compression | No | Gzip/Brotli inline |
| WebSockets | Requires passthrough | Native support |
| gRPC | Via TLS passthrough | Native with trailers, keep-alives |
| Rate limiting | App-level only | LB-level per path/host |
| Auth | App-level only | OIDC, JWT, basic @ edge |
| Throughput | Millions RPS | Thousands-millions RPS |
| Latency | <1ms | 1-10ms |

Core L7 Concepts

Listeners: HTTP port 80, HTTPS port 443 (often combined as single listener with TLS upgrade) …

October 16, 2025 · 22 min · DevOps Engineer

load-balancing layer7 http https aws
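
The host-based and path-based routing mentioned in the Layer 7 excerpt above can be sketched in a few lines of Python. The `Rule` class, the `route` function, the pool names, and the precedence order (host-specific rules first, then longest path prefix) are all assumptions made for illustration. Real load balancers define their own rule priority schemes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical routing rule: optional host match plus a path prefix."""
    host: Optional[str]  # None matches any host
    path_prefix: str
    target: str


def route(rules: List[Rule], host: str, path: str,
          default: str = "default-pool") -> str:
    """Return the target pool for a request, or the default if nothing matches."""
    host = host.lower()  # hostnames are case-insensitive
    matches = [
        r for r in rules
        if (r.host is None or r.host == host) and path.startswith(r.path_prefix)
    ]
    if not matches:
        return default
    # Assumed precedence: host-specific rules win, then the longest prefix.
    best = max(matches, key=lambda r: (r.host is not None, len(r.path_prefix)))
    return best.target


rules = [
    Rule("api.example.com", "/", "api-pool"),
    Rule("api.example.com", "/v2/", "api-v2-pool"),
    Rule(None, "/static/", "static-pool"),
]
v2_target = route(rules, "api.example.com", "/v2/users")
static_target = route(rules, "www.example.com", "/static/app.js")
fallback_target = route(rules, "www.example.com", "/")
```

Every decision here depends on the `Host` header and the request path, which a Layer 4 balancer never sees. That is the practical reason these routing features require TLS termination at the load balancer.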

🛠️ Guide 11 min

Terraform State Management: Remote Backends, Locking, and Workspaces

Introduction

Terraform state is the source of truth for your infrastructure. Proper state management is critical for team collaboration, preventing conflicts, and maintaining infrastructure integrity. This guide covers remote backends, locking mechanisms, and workspace strategies.

Understanding Terraform State

What is State?

State is Terraform’s way of tracking which real-world resources correspond to your configuration. By default, it is stored in a terraform.tfstate file.

State file contains:
- Resource mappings
- Metadata
- Resource dependencies
- Attribute values

Why State Matters

Without proper state management: …

October 15, 2025 · 11 min · DevOps Engineer

terraform iac state backend workspaces
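
To illustrate what "resource mappings" means in the state excerpt above, this Python sketch reads a heavily simplified, hand-written state document and lists which real-world IDs each configured resource maps to. The sample JSON is an assumption that mirrors only a few fields of the version 4 state layout, and the IDs are made up. The `resource_ids` function is hypothetical. For real tooling, the machine-readable output of `terraform show -json` is the safer interface, since the raw state file is an internal format.

```python
import json

# Simplified, hand-written sample; a real state file has many more fields.
SAMPLE_STATE = """
{
  "version": 4,
  "serial": 7,
  "resources": [
    {"mode": "managed", "type": "aws_instance", "name": "web",
     "instances": [{"attributes": {"id": "i-0123456789abcdef0"}}]},
    {"mode": "data", "type": "aws_ami", "name": "ubuntu",
     "instances": [{"attributes": {"id": "ami-0123456789abcdef0"}}]}
  ]
}
"""


def resource_ids(state_json: str) -> dict:
    """Map each resource address to the IDs of its recorded instances."""
    state = json.loads(state_json)
    mapping = {}
    for res in state.get("resources", []):
        # Data sources are addressed with a "data." prefix.
        prefix = "data." if res.get("mode") == "data" else ""
        address = f"{prefix}{res['type']}.{res['name']}"
        mapping[address] = [
            inst.get("attributes", {}).get("id")
            for inst in res.get("instances", [])
        ]
    return mapping


ids = resource_ids(SAMPLE_STATE)
```

This sketch ignores modules, `count`, and `for_each`, all of which change resource addresses in real state files.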

📊 SRE Practice 15 min

Toil Reduction: Strategies and Automation Priorities

Introduction

Toil is manual, repetitive, automatable work that scales linearly with service growth. It’s the operational burden that keeps engineers from doing valuable engineering work. Reducing toil is essential for scaling both systems and teams effectively.

What is Toil?

Google’s SRE Definition

Toil has the following characteristics:
- Manual: Requires human action
- Repetitive: Done over and over
- Automatable: Could be automated
- Tactical: Reactive, interrupt-driven
- No enduring value: Doesn’t improve the system
- Scales linearly: Grows with service growth

Toil vs Engineering Work

Toil (eliminate this): …

October 15, 2025 · 15 min · DevOps Engineer

toil automation efficiency devops