Executive Summary

Layer 4 (Transport Layer) load balancing distributes traffic at the TCP/UDP level, before any application-level processing. Unlike Layer 7 (HTTP) load balancers, L4 LBs do not inspect request content; they forward connections based on IP addresses, ports, and protocol.

When to use L4:

  • Raw throughput requirements (millions of connections/packets per second)
  • Non-HTTP protocols (databases, MQTT, game servers) or traffic you want to pass through untouched (e.g. gRPC over TLS)
  • TLS passthrough (end-to-end encryption; the LB never decrypts)
  • Extreme latency sensitivity

When NOT to use L4:

  • HTTP/HTTPS (use Layer 7 instead)
  • Request-based routing (path-based, host-based)
  • Workloads comfortably below ~1M req/sec, where L7 features add more value than raw throughput

Fundamentals

L4 vs L7: Quick Comparison

| Aspect | Layer 4 (TCP/UDP) | Layer 7 (HTTP/HTTPS) |
| --- | --- | --- |
| What it sees | IP/port/protocol | HTTP headers, body, cookies |
| Routing based on | Destination IP, port, protocol | Host, path, query string, cookies |
| Throughput | Very high (millions of pps) | Lower (thousands of rps) |
| Latency | <1 ms typical | 5-50 ms typical |
| Protocols | TCP, UDP, QUIC, SCTP | HTTP/1.1, HTTP/2, HTTPS, WebSocket |
| Encryption | Can pass TLS through | Can terminate/re-encrypt |
| Best for | Databases, non-HTTP, TLS passthrough | Web apps, microservices, APIs |

Core Concepts

Listeners: Defined by (protocol, port). Example: TCP:443, UDP:5353

Target Groups/Backends: Pool of servers receiving traffic. Health checked individually.

Health Checks:

  • Interval: How often to check (AWS default: 30s)
  • Timeout: Max time to wait for response (AWS default: 6s)
  • Healthy threshold: Consecutive passes before marking “healthy” (AWS default: 3)
  • Unhealthy threshold: Consecutive fails before marking “unhealthy” (AWS default: 3)

Connection Draining: Graceful shutdown window (AWS: 0-3600s). Existing connections allowed to complete; new connections rejected.

Source IP Preservation: Pass the client IP through to the backend (critical for logging, rate limiting, GeoIP). L4 passthrough LBs generally preserve it; it is lost only where SNAT is introduced (e.g. certain IP-target or kube-proxy configurations).

Static IPs: Public/private IPs that don’t change. Critical for allowlists, DNS.

Cross-Zone Load Balancing: Distribute traffic across AZs even if backend pool is uneven per AZ. Costs inter-AZ data transfer.
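
These knobs map directly onto provider settings. A minimal Pulumi Python sketch (AWS provider assumed; the VPC and subnet below are throwaway placeholders, and the values mirror the defaults discussed above):

import pulumi_aws as aws

# Placeholder network (illustration only; reuse your real VPC/subnets)
vpc = aws.ec2.Vpc("demo", cidr_block="10.0.0.0/16")
subnet = aws.ec2.Subnet("demo-a",
    vpc_id=vpc.id,
    cidr_block="10.0.1.0/24",
    availability_zone="us-east-1a")

# Target group: health checks, connection draining, and source IP preservation in one place
tg = aws.lb.TargetGroup("api-tg",
    port=8443,
    protocol="TCP",
    vpc_id=vpc.id,
    deregistration_delay=60,          # connection draining window (seconds)
    preserve_client_ip="true",        # keep the client IP visible to targets
    health_check=aws.lb.TargetGroupHealthCheckArgs(
        protocol="TCP",
        interval=30,                  # probe frequency (seconds)
        healthy_threshold=3,          # consecutive passes before "healthy"
        unhealthy_threshold=3))       # consecutive fails before "unhealthy"

# NLB: cross-zone load balancing is the cost-vs-evenness lever
nlb = aws.lb.LoadBalancer("api-nlb",
    load_balancer_type="network",
    internal=True,
    subnets=[subnet.id],
    enable_cross_zone_load_balancing=True)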


Cloud Implementation Mapping

| Feature | AWS NLB | Azure Std LB | GCP TCP/UDP LB |
| --- | --- | --- | --- |
| Scope | Regional | Regional | Regional (external and internal) |
| Target types | EC2 instances, IPs (incl. on-prem), ALB | VMs, VMSS, NICs, IP addresses | Instances, NEGs, managed instance groups |
| Static IP | Optional (Elastic IPs) | Yes (always) | Optional (reserved regional IP) |
| Private/internal | Yes (NLB in private subnets) | Yes (private frontend IP) | Yes (Internal TCP/UDP LB) |
| Health checks | TCP, HTTP, HTTPS | TCP, HTTP, HTTPS | TCP, HTTP, HTTPS |
| Cross-zone | Yes (configurable, costs inter-AZ) | Yes (default, included) | Yes (by design) |
| Max targets | 1000 per target group | 1000 per backend pool | Very high (NEG-based) |
| Logging | VPC flow logs (5-tuple) | Diagnostic logs | Cloud Logging |
| DDoS protection | AWS Shield Standard/Advanced | Azure DDoS Protection | Cloud Armor (L7-focused); no L4-native equivalent |
| Pricing model | Hourly + NLCU | Per rule + data processed | Per forwarding rule + data |

AWS: Network Load Balancer (NLB)

Use NLB when:

  • Need extreme throughput (100 Gbps+)
  • TLS handled at the edge (terminate on the NLB, or pass through end-to-end)
  • UDP required (DNS, syslog, gaming)
  • Preserve source IP critical
  • Private/internal traffic between AWS services

Key features:

  • Ultra-high performance (millions of req/sec)
  • Preserves source IP by default
  • Supports TLS termination + passthrough
  • Cross-zone load balancing available (off by default; cost lever)

Azure: Standard Load Balancer

Use Azure LB when:

  • You need HA Ports: a single rule load balancing all 65,535 ports (e.g. stateless network appliances)
  • You need Private Link integration
  • You need explicit outbound (SNAT) rules
  • Your backends are VMs, VMSS, or private endpoints

Key features:

  • HA Ports: One listener handles all traffic
  • Always has static public/private IP
  • Outbound rules for NAT
  • Zone-redundant by default

GCP: TCP/UDP Load Balancer

Use GCP LB when:

  • Non-HTTP protocols on Google Cloud
  • External: Public IP, multi-region capable
  • Internal: Private IP for GCP-to-GCP

Key features:

  • Regional passthrough by design (global options exist via GCP's proxy-based LBs)
  • NEGs (Network Endpoint Groups) for flexible targeting
  • Private service connections
  • Cloud Armor for DDoS (Layer 7)

Kubernetes Service Type=LoadBalancer

---
apiVersion: v1
kind: Service
metadata:
  name: app-lb
  namespace: default
  annotations:
    # AWS NLB annotations
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
spec:
  type: LoadBalancer
  # Preserve client source IP (spec field, not an annotation); may distribute unevenly across nodes
  externalTrafficPolicy: Local
  selector:
    app: myapp
  ports:
  - name: main
    port: 443         # External port
    targetPort: 8443  # Container port
    protocol: TCP
  - name: dns
    port: 53
    targetPort: 5353
    protocol: UDP
  sessionAffinity: ClientIP  # Sticky sessions (optional)
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  # For GCP/Azure: loadBalancerIP can be static
  # loadBalancerIP: "203.0.113.1"
  # For AWS: SecurityGroups annotation
  # service.beta.kubernetes.io/aws-load-balancer-security-groups: "sg-12345"

K8s L4 Behavior per Cloud:

| Cloud | Type | Behavior | Source IP |
| --- | --- | --- | --- |
| AWS | NLB | One NLB per Service | Preserved with instance targets; use externalTrafficPolicy: Local to avoid kube-proxy SNAT |
| Azure | Standard LB | One LB per Service | Preserved with externalTrafficPolicy: Local |
| GCP | TCP/UDP LB | One LB per Service | Lost to kube-proxy SNAT unless externalTrafficPolicy: Local |

Security

Network Security Groups / Firewalls

Inbound (to LB frontend):

# AWS security group ingress rule (Terraform; lives inside an aws_security_group resource)
ingress {
  from_port   = 443
  to_port     = 443
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]  # Restrict if possible
  description = "HTTPS from internet"
}

Outbound (from LB to targets):

egress {
  from_port       = 8443
  to_port         = 8443
  protocol        = "tcp"
  security_groups = [aws_security_group.backend.id]
  description     = "To backend servers"
}

Private/Internal LBs

# AWS: Private NLB (internal subnet)
internal = true
subnets  = [aws_subnet.private_a.id, aws_subnet.private_b.id]

# Azure: Private LB
frontend_ip_configuration {
  subnet_id                     = azurerm_subnet.internal.id
  private_ip_address_allocation = "Dynamic"
  # or "Static" + private_ip_address = "10.0.1.100"
}

mTLS for L4 Passthrough

When doing TLS passthrough (the LB never decrypts), the application itself must terminate TLS and enforce mTLS:

# Example: Node.js server with mTLS
import tls from 'tls';
import fs from 'fs';

const options = {
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt'),
  ca: fs.readFileSync('ca.crt'),
  requestCert: true,
  rejectUnauthorized: true  // Require valid client cert
};

tls.createServer(options, (socket) => {
  console.log('Client IP:', socket.remoteAddress);
  console.log('Client cert:', socket.getPeerCertificate());
  socket.write('Hello, authenticated client!');
}).listen(8443);
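
To exercise that path end to end, the client has to present its certificate during the handshake; the L4 LB just relays the encrypted bytes. A minimal Python sketch using the standard library ssl module (host name and certificate file names are illustrative):

import socket
import ssl

# Trust the server's CA and present a client certificate (file names are illustrative)
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile="ca.crt")
ctx.load_cert_chain(certfile="client.crt", keyfile="client.key")

with socket.create_connection(("lb.example.com", 443)) as raw:
    # The hostname check runs against the backend's certificate; the L4 LB is invisible here
    with ctx.wrap_socket(raw, server_hostname="lb.example.com") as tls:
        print("Negotiated:", tls.version(), tls.cipher())
        print(tls.recv(1024).decode())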

Reliability & Performance

Multi-AZ Strategy

Avoid single-AZ NAT/LB:

❌ BAD: All outbound NAT through single-AZ NAT gateway
  Instance A (AZ-1) → NAT (AZ-1) → Internet
  Instance B (AZ-2) → NAT (AZ-1) → Internet  [cross-AZ cost!]

✓ GOOD: One NAT per AZ
  Instance A (AZ-1) → NAT (AZ-1) → Internet
  Instance B (AZ-2) → NAT (AZ-2) → Internet
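
A hedged Pulumi Python sketch of the one-NAT-per-AZ pattern (names, CIDRs, and AZs are illustrative; in practice you would attach your own private subnets and route tables):

import pulumi_aws as aws

azs = ["us-east-1a", "us-east-1b"]
vpc = aws.ec2.Vpc("demo", cidr_block="10.0.0.0/16")
igw = aws.ec2.InternetGateway("demo-igw", vpc_id=vpc.id)

for i, az in enumerate(azs):
    public = aws.ec2.Subnet(f"public-{az}", vpc_id=vpc.id,
                            cidr_block=f"10.0.{i}.0/24", availability_zone=az,
                            map_public_ip_on_launch=True)
    private = aws.ec2.Subnet(f"private-{az}", vpc_id=vpc.id,
                             cidr_block=f"10.0.{100 + i}.0/24", availability_zone=az)

    # Public subnet routes to the internet gateway so the NAT gateway can egress
    pub_rt = aws.ec2.RouteTable(f"public-rt-{az}", vpc_id=vpc.id,
        routes=[aws.ec2.RouteTableRouteArgs(cidr_block="0.0.0.0/0", gateway_id=igw.id)])
    aws.ec2.RouteTableAssociation(f"public-rta-{az}", subnet_id=public.id, route_table_id=pub_rt.id)

    # One NAT gateway per AZ keeps egress local and survives a single-AZ failure
    eip = aws.ec2.Eip(f"nat-eip-{az}", domain="vpc")
    nat = aws.ec2.NatGateway(f"nat-{az}", subnet_id=public.id, allocation_id=eip.id)

    # The private subnet in this AZ routes 0.0.0.0/0 through its own AZ's NAT
    priv_rt = aws.ec2.RouteTable(f"private-rt-{az}", vpc_id=vpc.id,
        routes=[aws.ec2.RouteTableRouteArgs(cidr_block="0.0.0.0/0", nat_gateway_id=nat.id)])
    aws.ec2.RouteTableAssociation(f"private-rta-{az}", subnet_id=private.id, route_table_id=priv_rt.id)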

Connection Reuse

Client → LB (TCP connection A)
LB → Backend1 (TCP connection B, reused)
LB → Backend2 (new TCP connection)

Keep-alive timeout: 60s (allow connection reuse)
Idle timeout: 350s (AWS NLB default for TCP; prevents stale connections)
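
For flows that must outlive the idle timeout, enable TCP keepalive (or application-level heartbeats) on both ends so probes flow before the timer expires; most L4 passthrough LBs treat keepalive packets as activity, but verify this for your platform. A minimal Python sketch using standard socket options (the endpoint is illustrative; the TCP_KEEPIDLE knobs are Linux-specific):

import socket

def open_keepalive_connection(host: str, port: int) -> socket.socket:
    """Open a TCP connection that sends keepalive probes well inside the
    LB idle timeout (350s on AWS NLB), so idle connections are not reaped."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific tuning: start probing after 60s idle, then every 60s, give up after 3 misses
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)
    return sock

conn = open_keepalive_connection("lb.example.com", 443)  # illustrative endpoint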

Timeouts

# AWS NLB / target group settings (Terraform attribute names)
deregistration_delay             = 30    # Target group drain window (seconds)
enable_cross_zone_load_balancing = true  # NLB attribute
# The NLB TCP idle timeout is 350s; keep client/server keep-alives below that.

Health Check Tuning

# Aggressive (fast detection)
health_check {
  interval            = 5
  timeout             = 2
  healthy_threshold   = 2
  unhealthy_threshold = 2
}
# Result: roughly 10-15s to mark a target unhealthy

# Relaxed (stable, fewer false positives)
health_check {
  interval            = 30
  timeout             = 6
  healthy_threshold   = 3
  unhealthy_threshold = 3
}
# Result: roughly 90s to mark a target unhealthy

Observability

Key Metrics

# AWS NLB (CloudWatch namespace AWS/NetworkELB)
AWS/NetworkELB:
  ActiveFlowCount_TCP:             # Current TCP connections
  NewFlowCount_TCP:                # New connections per period
  ClientTLSNegotiationErrorCount:  # Failed client TLS handshakes (TLS listeners)
  UnHealthyHostCount:              # Unhealthy targets
  HealthyHostCount:                # Healthy targets
  ProcessedBytes:                  # Total bytes processed
  TCP_ELB_Reset_Count:             # RSTs generated by the LB
  TargetTLSNegotiationErrorCount:  # Backend TLS errors

# Azure Standard LB (Azure Monitor)
Microsoft.Network/loadBalancers:
  ByteCount / PacketCount:         # Data processed
  SYNCount:                        # SYN packets
  SnatConnectionCount:             # Outbound NAT flows
  AllocatedSnatPorts / UsedSnatPorts:  # SNAT port usage (exhaustion risk)

# GCP internal TCP/UDP LB (Cloud Monitoring; monitored resource: internal_tcp_lb_rule)
#   Per-forwarding-rule connection, byte, and packet counts
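
One way to consume these programmatically: a hedged boto3 sketch that pulls UnHealthyHostCount for an NLB target group from CloudWatch (the LoadBalancer and TargetGroup dimension values are placeholders in the usual net/... and targetgroup/... ARN-suffix format):

from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/NetworkELB",
    MetricName="UnHealthyHostCount",
    Dimensions=[
        # Dimension values are the ARN suffixes of your NLB and target group (placeholders here)
        {"Name": "LoadBalancer", "Value": "net/app-nlb/1234567890abcdef"},
        {"Name": "TargetGroup", "Value": "targetgroup/app-tg/abcdef1234567890"},
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=60,
    Statistics=["Maximum"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Maximum"])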

Logging

# AWS VPC Flow Logs (fields: version, account-id, interface-id, src-addr, dst-addr,
#   src-port, dst-port, protocol, packets, bytes, start, end, action, log-status)
2 123456789012 eni-12345678 10.0.1.1 10.0.2.1 49152 8443 6 1024 512 1234567890 1234567890 ACCEPT OK

# Azure Diagnostic Logs
{
  "time": "2025-01-15T10:30:45Z",
  "resourceId": "/SUBSCRIPTIONS/.../NETWORKINTERFACES/nic-1",
  "bytes_sent": 1024,
  "bytes_received": 512,
  "direction": "outbound"
}

# GCP Cloud Logging (via Cloud Load Balancing)
jsonPayload: {
  client_ip: "203.0.113.1"
  client_port: 49152
  target_ip: "10.0.1.100"
  target_port: 8443
  bytes_sent: 1024
  bytes_received: 512
}

Cost Optimization

Pricing Models

AWS NLB:

  • Hourly LB charge (about $0.0225/hour in us-east-1)
  • NLCU (Network Load Balancer Capacity Unit, ~$0.006/NLCU-hour): driven by new connections, active connections, and bytes processed
  • Cross-AZ data transfer: $0.01/GB each direction (lever!)

Azure Standard LB:

  • Charged per rule-hour (the first five rules are bundled; additional rules billed individually)
  • Data processed billed per GB
  • Outbound (SNAT) rules billed like load-balancing rules

GCP TCP/UDP LB:

  • Hourly charge per forwarding rule (the first five share one charge)
  • Data processing billed per GB; cross-region egress billed separately
  • No separate per-LB charge
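
To make the cross-AZ lever concrete, a back-of-the-envelope calculation (rates are assumptions; check the current price sheets for your region):

# Rough monthly cost of cross-AZ data transfer (assumed $0.01/GB in each direction)
cross_az_gb_per_month = 10_000                 # e.g. 10 TB crossing AZ boundaries
rate_per_gb_each_way = 0.01

monthly_cost = cross_az_gb_per_month * rate_per_gb_each_way * 2   # billed on both sides
print(f"~${monthly_cost:,.0f}/month")          # ~$200/month under these assumptions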

Cost Reduction Tactics

# 1. Minimize cross-AZ traffic
cross_zone_load_balancing: false  # Only if targets are evenly distributed per AZ

# 2. Idle timeout tuning
idle_timeout: 60  # Lower = fewer stale connections, but may break legitimate long-lived flows

# 3. Listener/rule consolidation
# Fewer listeners and forwarding rules = less to manage, and on Azure/GCP fewer billed rules
# (LCU charges are driven by traffic, not listener count)

# 4. Single-AZ for stateless workloads (with auto-recovery)
availability_zones = ["us-east-1a"]  # Avoids cross-AZ fees; trade-off: AZ failure = downtime

# 5. Right-size health checks
interval: 30  # Default; dropping to 5s means 6x more probe traffic per target, confirm you need it

IaC: Pulumi Python Examples

AWS NLB (TCP/TLS Passthrough)

import pulumi
import pulumi_aws as aws

# VPC
vpc = aws.ec2.Vpc("main",
    cidr_block="10.0.0.0/16",
    enable_dns_hostnames=True,
    enable_dns_support=True,
    tags={"Name": "main-vpc"})

# Subnets
subnet_a = aws.ec2.Subnet("subnet-a",
    vpc_id=vpc.id,
    cidr_block="10.0.1.0/24",
    availability_zone="us-east-1a",
    tags={"Name": "subnet-a"})

subnet_b = aws.ec2.Subnet("subnet-b",
    vpc_id=vpc.id,
    cidr_block="10.0.2.0/24",
    availability_zone="us-east-1b",
    tags={"Name": "subnet-b"})

# Security group for NLB
nlb_sg = aws.ec2.SecurityGroup("nlb-sg",
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=443,
            to_port=443,
            cidr_blocks=["0.0.0.0/0"]),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol="-1",
            from_port=0,
            to_port=0,
            cidr_blocks=["0.0.0.0/0"]),
    ],
    tags={"Name": "nlb-sg"})

# Backend security group
backend_sg = aws.ec2.SecurityGroup("backend-sg",
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=8443,
            to_port=8443,
            security_groups=[nlb_sg.id]),
    ],
    tags={"Name": "backend-sg"})

# NLB
nlb = aws.lb.LoadBalancer("app-nlb",
    internal=False,
    load_balancer_type="network",
    security_groups=[nlb_sg.id],
    subnets=[subnet_a.id, subnet_b.id],
    enable_cross_zone_load_balancing=True,
    tags={"Name": "app-nlb"})

# Target Group (TCP: the NLB forwards the encrypted bytes untouched)
target_group = aws.lb.TargetGroup("app-tg",
    port=8443,
    protocol="TCP",
    target_type="instance",
    vpc_id=vpc.id,
    health_check=aws.lb.TargetGroupHealthCheckArgs(
        healthy_threshold=3,
        unhealthy_threshold=3,
        timeout=6,
        interval=30,
        protocol="TCP"),
    tags={"Name": "app-tg"})

# Listener (TCP:443, TLS passthrough: the backends terminate TLS themselves)
listener = aws.lb.Listener("app-listener",
    load_balancer_arn=nlb.arn,
    port=443,
    protocol="TCP",
    default_actions=[
        aws.lb.ListenerDefaultActionArgs(
            type="forward",
            target_group_arn=target_group.arn)])

# Export NLB DNS
pulumi.export("nlb_dns_name", nlb.dns_name)

Azure Standard Load Balancer

import pulumi
import pulumi_azure as azure

# Resource group
rg = azure.core.ResourceGroup("main", location="eastus")

# Virtual network
vnet = azure.network.VirtualNetwork("main",
    resource_group_name=rg.name,
    location=rg.location,
    address_spaces=["10.0.0.0/16"])

# Subnet
subnet = azure.network.Subnet("main",
    resource_group_name=rg.name,
    virtual_network_name=vnet.name,
    address_prefixes=["10.0.1.0/24"])

# Public IP for the LB frontend
public_ip = azure.network.PublicIp("lb-pip",
    resource_group_name=rg.name,
    location=rg.location,
    allocation_method="Static",
    sku="Standard")

# Standard Load Balancer (backend pool, probe, and rule are separate resources in this provider)
lb = azure.lb.LoadBalancer("main",
    resource_group_name=rg.name,
    location=rg.location,
    sku="Standard",
    frontend_ip_configurations=[
        azure.lb.LoadBalancerFrontendIpConfigurationArgs(
            name="PublicIPAddress",
            public_ip_address_id=public_ip.id)],
    tags={"Name": "main-lb"})

# Backend address pool
backend_pool = azure.lb.BackendAddressPool("backend-pool",
    name="BackendPool",
    loadbalancer_id=lb.id)

# TCP health probe on the backend port
probe = azure.lb.Probe("tcp-probe",
    name="tcpProbe",
    loadbalancer_id=lb.id,
    protocol="Tcp",
    port=8443,
    interval_in_seconds=30)

# Load-balancing rule: frontend 443 -> backend 8443
rule = azure.lb.Rule("lb-rule",
    name="LBRule",
    loadbalancer_id=lb.id,
    protocol="Tcp",
    frontend_port=443,
    backend_port=8443,
    frontend_ip_configuration_name="PublicIPAddress",
    backend_address_pool_ids=[backend_pool.id],
    probe_id=probe.id,
    enable_floating_ip=False)

pulumi.export("lb_public_ip", public_ip.ip_address)

GCP TCP Load Balancer (Internal)

import pulumi
import pulumi_gcp as gcp

# Network
network = gcp.compute.Network("main",
    auto_create_subnetworks=False)

# Subnet
subnet = gcp.compute.Subnetwork("main",
    network=network.id,
    ip_cidr_range="10.0.0.0/24",
    region="us-central1")

# Instance group (targets)
ig = gcp.compute.InstanceGroup("backend",
    zone="us-central1-a",
    instances=[],  # Add instances later
    named_ports=[
        gcp.compute.InstanceGroupNamedPortArgs(
            name="tcp8443",
            port=8443)])

# Health check
health_check = gcp.compute.HealthCheck("tcp-health",
    tcp_health_check=gcp.compute.HealthCheckTcpHealthCheckArgs(
        port=8443),
    check_interval_sec=30,
    timeout_sec=6)

# Backend service
backend_service = gcp.compute.RegionBackendService("tcp-backend",
    backends=[
        gcp.compute.RegionBackendServiceBackendArgs(
            group=ig.self_link,
            balancing_mode="CONNECTION")],
    health_checks=[health_check.id],
    load_balancing_scheme="INTERNAL",
    protocol="TCP",
    region="us-central1",
    session_affinity="CLIENT_IP")

# Forwarding rule (internal)
forwarding_rule = gcp.compute.ForwardingRule("tcp-forwarding",
    load_balancing_scheme="INTERNAL",
    backend_service=backend_service.id,
    ports=["443"],
    region="us-central1",
    subnetwork=subnet.id)

pulumi.export("internal_ip", forwarding_rule.ip_address)

Kubernetes Service (AWS NLB)

import pulumi
import pulumi_kubernetes as k8s

# Namespace
ns = k8s.core.v1.Namespace("apps",
    metadata={"name": "apps"})

# Service
svc = k8s.core.v1.Service("app-lb",
    metadata={
        "namespace": ns.metadata["name"],
        "annotations": {
            "service.beta.kubernetes.io/aws-load-balancer-type": "nlb",
            "service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled": "true"}
    },
    spec={
        "type": "LoadBalancer",
        "selector": {"app": "myapp"},
        "ports": [
            {"name": "main", "port": 443, "targetPort": 8443, "protocol": "TCP"},
            {"name": "health", "port": 8080, "targetPort": 8080, "protocol": "TCP"}
        ],
        "externalTrafficPolicy": "Local"  # Preserve client IP
    })

pulumi.export("service_dns", svc.status["load_balancer"]["ingress"][0]["hostname"])

Best Practices Checklist

Before Creating L4 LB

  • Documented traffic patterns (protocol, ports, expected RPS/PPS)
  • Decided: Static IP needed? → Plan EIP/reserved IP allocation
  • Chosen: Private or public LB? → Confirm subnet/security group planning
  • Cross-AZ traffic cost analyzed → Decision made on cross_zone_load_balancing
  • Quotas checked (NLB targets, security group rules, EIPs per account)
  • High-availability: Multi-AZ targets identified

After Creating L4 LB

  • Health checks validated: all targets report healthy (see the sketch after this list)
  • Connection drain tested: existing clients disconnect gracefully
  • Failover tested: kill an AZ → traffic shifts to a healthy AZ within the detection window
  • Flow logs enabled → shipped to CloudWatch/Log Analytics/Cloud Logging
  • Metrics monitored: active connections, bytes/sec, unhealthy host count
  • Security: inbound rules restrict sources; outbound rules target specific SGs
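
A hedged boto3 sketch for the health-check item above: it walks every target group attached to an NLB and prints each target's state (the load balancer name is a placeholder):

import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Placeholder name; substitute your NLB's name
lb = elbv2.describe_load_balancers(Names=["app-nlb"])["LoadBalancers"][0]

tgs = elbv2.describe_target_groups(LoadBalancerArn=lb["LoadBalancerArn"])["TargetGroups"]
for tg in tgs:
    health = elbv2.describe_target_health(TargetGroupArn=tg["TargetGroupArn"])
    for desc in health["TargetHealthDescriptions"]:
        target = desc["Target"]["Id"]
        state = desc["TargetHealth"]["State"]
        print(f"{tg['TargetGroupName']}: {target} -> {state}")
        if state != "healthy":
            # Reason/Description are populated for non-healthy states
            print("   ", desc["TargetHealth"].get("Reason"), desc["TargetHealth"].get("Description"))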

Top Pitfalls to Avoid

| Pitfall | Impact | Fix |
| --- | --- | --- |
| Client IP lost | Rate limiting, GeoIP, logging broken | Enable source IP preservation; know where SNAT happens |
| Aggressive timeouts | Legitimate long-lived connections killed | Idle timeout ≥60s; health check interval ≥30s |
| Single-AZ NAT/LB | Cost shock, single point of failure | One NAT/LB per AZ; enable cross-zone (with cost awareness) |
| Health checks on wrong port | All targets marked unhealthy, total blackout | Match health check port to the app listener port |
| Cross-AZ cost surprise | Bill spike (can be 10%+ of LB cost) | Decide: accept the cost, or accept lower availability |
| No connection draining | Client connections abruptly closed during deploys | Set deregistration_delay = 30-300s |
| Uneven target distribution | Some backends hot, others cold | Rebalance targets across AZs, or enable cross-zone load balancing |
| Idle timeout too low | Long-lived connections (WebSocket, DB) dropped | Increase idle_timeout; test with the real application |

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                    Internet / Clients                           │
└────────────────────────────────┬────────────────────────────────┘
                                 │ TCP:443 (encrypted TLS)
                    ┌────────────▼────────────┐
                    │  Layer 4 Load Balancer  │
                    │  (Public IP, Sticky)    │
                    │  Listeners: TCP:443     │
                    │  Health: TCP:8443 /10s  │
                    └────┬────────────────────┘
                         │ (preserve source IP)
          ┌──────────────┼──────────────┐
          │              │              │
    ┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐
    │  Backend-1 │ │  Backend-2 │ │  Backend-3 │
    │  AZ-1      │ │  AZ-2      │ │  AZ-1      │
    │  :8443     │ │  :8443     │ │  :8443     │
    │  TLS       │ │  TLS       │ │  TLS       │
    │  Health: ✓ │ │  Health: ✓ │ │  Health: ✓ │
    └────────────┘ └────────────┘ └────────────┘

Failover scenario:
  AZ-1 fails → LB detects unhealthy targets (Backend-1, Backend-3)
  Traffic redistributes → AZ-2 (Backend-2) now carries 100% of the load
  Auto-scaling hooks trigger → new Backend-X launched in AZ-1

Conclusion

Layer 4 load balancing is essential for high-throughput, low-latency, non-HTTP workloads. Key takeaways:

  1. Choose the right layer for the use case: L4 for extreme performance and non-HTTP traffic, L7 for HTTP routing
  2. Plan multi-AZ: Avoid single points of failure; accept cross-AZ costs or design for single-AZ resilience
  3. Monitor relentlessly: Health, active connections, latency percentiles
  4. Cost matters: cross-AZ transfer, LCUs, and idle timeouts are all levers you can pull
  5. Test before prod: Drain, failover, TLS passthrough security

Further Reading: