Introduction
GitOps is a paradigm that uses Git as the single source of truth for declarative infrastructure and applications. ArgoCD and Flux are the leading tools for implementing GitOps on Kubernetes. This guide covers deployment patterns, rollback strategies, and choosing between the two.
GitOps Principles
Core Concepts
1. Declarative - Everything defined in Git
2. Versioned - Git history = deployment history
3. Automated - Tools sync Git to cluster
4. Auditable - All changes tracked in Git
Benefits
- Version control - Full deployment history
- Disaster recovery - Restore from Git
- Collaboration - Pull request workflow
- Security - Git as gatekeeper
- Consistency - Same process for all environments
ArgoCD
Installation
Quick install:
# Create namespace
kubectl create namespace argocd
# Install ArgoCD
kubectl apply -n argocd -f \
https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Access UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Get initial password
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath="{.data.password}" | base64 -d
Production install with HA:
kubectl apply -n argocd -f \
https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
Creating Applications
Declarative Application
application.yaml:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  # Source repository
  source:
    repoURL: https://github.com/myorg/myapp
    targetRevision: main
    path: k8s/overlays/production
    # Kustomize
    kustomize:
      version: v5.0.0
    # Or Helm
    # helm:
    #   releaseName: myapp
    #   valueFiles:
    #     - values-production.yaml
  # Destination cluster
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  # Sync policy
  syncPolicy:
    automated:
      prune: true # Delete resources not in Git
      selfHeal: true # Sync on manual changes
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PruneLast=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
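To make the retry settings concrete, the sketch below prints the wait before each retry under the usual exponential-backoff reading of these fields: start at `duration`, multiply by `factor` after each attempt, and cap at `maxDuration`. That reading is an assumption, not a quote from the ArgoCD source, and `backoff_schedule` is a hypothetical helper name.

```shell
# Print the wait (in seconds) before each retry.
# Arguments: limit, initial delay (s), factor, maximum delay (s).
backoff_schedule() {
  local limit=$1 delay=$2 factor=$3 max=$4 attempt=1
  while [ "$attempt" -le "$limit" ]; do
    echo "retry ${attempt}: wait ${delay}s"
    delay=$((delay * factor))
    # Cap the delay at the configured maximum.
    if [ "$delay" -gt "$max" ]; then delay=$max; fi
    attempt=$((attempt + 1))
  done
}

# Values from the Application above: limit 5, 5s, factor 2, 3m (180s).
backoff_schedule 5 5 2 180
```

With these values the waits are 5s, 10s, 20s, 40s, and 80s, so the 3m cap only matters if `limit` or `factor` is raised.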
CLI Application Creation
# Create app
argocd app create myapp \
--repo https://github.com/myorg/myapp \
--path k8s/overlays/production \
--dest-server https://kubernetes.default.svc \
--dest-namespace production \
--sync-policy automated \
--auto-prune \
--self-heal
# List apps
argocd app list
# Get app details
argocd app get myapp
# Sync app
argocd app sync myapp
# Delete app
argocd app delete myapp
Deployment Patterns
1. Multi-Environment Deployment
Repository structure:
myapp/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
└── overlays/
    ├── development/
    │   ├── kustomization.yaml
    │   └── patches.yaml
    ├── staging/
    │   ├── kustomization.yaml
    │   └── patches.yaml
    └── production/
        ├── kustomization.yaml
        └── patches.yaml
ArgoCD applications:
# dev-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myapp
    targetRevision: main
    path: overlays/development
  destination:
    server: https://kubernetes.default.svc
    namespace: development
---
# staging-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-staging
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myapp
    targetRevision: main
    path: overlays/staging
  destination:
    server: https://kubernetes.default.svc
    namespace: staging
---
# prod-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-prod
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myapp
    targetRevision: release # Different branch
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      # Git changes still sync automatically; selfHeal: false only leaves manual
      # cluster edits in place. Omit `automated` entirely to require manual syncs.
      selfHeal: false
2. Progressive Delivery with Argo Rollouts
Install Argo Rollouts:
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f \
https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
Canary deployment:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10 # 10% traffic to new version
        - pause: {duration: 5m}
        - setWeight: 25 # 25% traffic
        - pause: {duration: 5m}
        - setWeight: 50 # 50% traffic
        - pause: {duration: 5m}
        - setWeight: 75 # 75% traffic
        - pause: {duration: 5m}
        # 100% traffic (completes rollout)
      # Automatic analysis
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 2 # Start analysis at 25%
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:v2
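As a rough worked example of the step list above, the sketch below adds up the pauses to show the earliest moment each weight is reached. It assumes weight changes are instantaneous and that no analysis run fails, so real rollouts will take somewhat longer; `canary_timeline` is a hypothetical helper, not part of Argo Rollouts.

```shell
# Print when each canary weight is reached.
# Arguments: one "weight:pause_minutes" pair per setWeight/pause step.
canary_timeline() {
  local elapsed=0 step weight pause
  for step in "$@"; do
    weight=${step%%:*}   # text before the colon
    pause=${step##*:}    # text after the colon
    echo "t+${elapsed}m: ${weight}% canary"
    elapsed=$((elapsed + pause))
  done
  echo "t+${elapsed}m: 100% (rollout complete)"
}

# Steps from the Rollout above: 10%, 25%, 50%, 75%, each followed by a 5m pause.
canary_timeline 10:5 25:5 50:5 75:5
```

Under those assumptions a healthy release takes at least 20 minutes to reach full traffic, which is worth weighing against how quickly a fix needs to ship.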
Analysis template:
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      # The Prometheus provider returns a vector, so index the first sample
      successCondition: result[0] >= 0.95
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{status=~"2.."}[5m]))
            /
            sum(rate(http_requests_total[5m]))
Blue-Green deployment:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 3
  strategy:
    blueGreen:
      activeService: myapp-active
      previewService: myapp-preview
      autoPromotionEnabled: false # Manual promotion
      scaleDownDelaySeconds: 300 # Keep old version 5min
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp # Must match spec.selector
    spec:
      containers:
        - name: myapp
          image: myapp:v2
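The blue-green Rollout refers to two Services by name, and both have to exist before it can progress. The sketch below writes a minimal pair to a file for committing to Git. The file name and the `targetPort` of 8080 are assumptions; Argo Rollouts adds its own pod-template-hash label to each Service selector at runtime to point it at the right ReplicaSet.

```shell
# Write the active and preview Services referenced by the Rollout.
cat > bluegreen-services.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: myapp-active
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080 # assumed container port
---
apiVersion: v1
kind: Service
metadata:
  name: myapp-preview
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080 # assumed container port
EOF

# Show which Services were written.
grep 'name: myapp-' bluegreen-services.yaml
```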
3. App of Apps Pattern
Managing multiple applications:
# apps/app-of-apps.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-of-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-apps
    targetRevision: main
    path: apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
apps/ directory:
apps/
├── app-of-apps.yaml
├── frontend-app.yaml
├── backend-app.yaml
├── database-app.yaml
└── monitoring-app.yaml
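Each child file in `apps/` is an ordinary Application, so the set can be generated from a template. The sketch below writes one manifest per component. It assumes each component's manifests live in a top-level directory named after it in the same repository and deploy to a namespace of the same name; both are illustrative choices, not something the layout above dictates.

```shell
# Generate one child Application manifest per component into ./apps.
mkdir -p apps
for name in frontend backend database monitoring; do
  # Unquoted EOF so that ${name} is expanded inside the manifest.
  cat > "apps/${name}-app.yaml" <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-apps
    targetRevision: main
    path: ${name} # assumed directory for this component
  destination:
    server: https://kubernetes.default.svc
    namespace: ${name} # assumed target namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
done

ls apps
```

Once these files are committed, the parent Application picks them up on its next sync and ArgoCD creates the child Applications.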
Rollback Strategies
1. Git Revert
Rollback via Git:
# Revert last commit
git revert HEAD
git push
# ArgoCD auto-syncs, rolling back deployment
# Or revert to specific commit
git revert <commit-hash>
git push
2. ArgoCD History Rollback
# View deployment history
argocd app history myapp
# Rollback to a specific history ID (rejected while automated sync is enabled)
argocd app rollback myapp <history-id>
# Or via UI: Application -> History and Rollback
Declarative rollback:
# Update targetRevision to previous commit
spec:
  source:
    targetRevision: abc123 # Previous commit hash
3. Automated Rollback with Analysis
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
      # Background analysis: a failed run aborts the rollout and returns to stable
      analysis:
        templates:
          - templateName: error-rate
        startingStep: 1
      # Delay before scaling down the canary after an abort (used with traffic routing)
      abortScaleDownDelaySeconds: 30
Error rate analysis:
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate
spec:
  metrics:
    - name: error-rate
      interval: 1m
      # The Prometheus provider returns a vector, so index the first sample
      failureCondition: result[0] > 0.05 # >5% error rate
      failureLimit: 2 # Tolerate 2 failed measurements; the third fails the run
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{status=~"5.."}[5m]))
            /
            sum(rate(http_requests_total[5m]))
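The interaction between the failure condition and `failureLimit` is easy to misread, so here is a small replay of it. The sketch assumes `failureLimit` is the number of failed measurements that are tolerated, so the metric fails once the count exceeds it. `analysis_outcome` is a hypothetical helper and the sample values are made up.

```shell
# Replay a series of error-rate measurements against the template's rules.
# Arguments: failureLimit, then one measured error rate per interval.
analysis_outcome() {
  local limit=$1 failed=0 value
  shift
  for value in "$@"; do
    # awk does the floating-point comparison (failureCondition: > 0.05).
    if awk -v v="$value" 'BEGIN { exit !(v > 0.05) }'; then
      failed=$((failed + 1))
    fi
    if [ "$failed" -gt "$limit" ]; then
      echo "Failed (failed measurements: ${failed})"
      return 0
    fi
  done
  echo "Successful (failed measurements: ${failed})"
}

# failureLimit 2: the third measurement above 5% fails the run.
analysis_outcome 2 0.01 0.08 0.09 0.02 0.07
```

Note that the failures do not need to be consecutive in this model: the healthy 0.02 sample does not reset the count.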
Flux
Installation
Install Flux CLI:
# macOS
brew install fluxcd/tap/flux
# Linux
curl -s https://fluxcd.io/install.sh | sudo bash
# Verify
flux --version
Bootstrap Flux:
# Set GitHub token
export GITHUB_TOKEN=<your-token>
# Bootstrap (--personal marks the owner as a user account; omit it for an organization)
flux bootstrap github \
--owner=myorg \
--repository=fleet-infra \
--branch=main \
--path=./clusters/production \
--personal
# This creates:
# - fleet-infra repository
# - Flux components in cluster
# - Deploy key for repository
Creating Resources
GitRepository Source
# sources/myapp.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/myorg/myapp
  ref:
    branch: main
  # Or tag
  # ref:
  #   tag: v1.2.3
  # Or semver
  # ref:
  #   semver: ">=1.0.0 <2.0.0"
Kustomization
# kustomizations/myapp.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 5m
  path: ./k8s/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: myapp
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: myapp
      namespace: production
  timeout: 2m
HelmRelease
# helm-releases/myapp.yaml
# v2 is the stable API in recent Flux releases; older clusters serve v2beta1 or v2beta2
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: myapp
  namespace: production
spec:
  interval: 5m
  chart:
    spec:
      chart: myapp
      version: '>=1.0.0 <2.0.0'
      sourceRef:
        kind: HelmRepository
        name: myapp-charts
        namespace: flux-system
  values:
    replicaCount: 3
    image:
      repository: myapp
      tag: v1.2.3
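The HelmRelease above points at a HelmRepository named `myapp-charts`, which must exist in `flux-system` for the chart to resolve. A minimal sketch of that source follows. The file name, the 10m interval, and the chart URL are assumptions for illustration; the `v1` API version assumes a recent Flux release.

```shell
# Write the HelmRepository that the HelmRelease's sourceRef points at.
cat > helm-repository.yaml <<'EOF'
# Older Flux releases serve this kind as source.toolkit.fluxcd.io/v1beta2
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: myapp-charts
  namespace: flux-system
spec:
  interval: 10m
  url: https://charts.example.com # hypothetical chart repository URL
EOF

cat helm-repository.yaml
```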
Deployment Patterns
1. Multi-Tenant Setup
Repository structure:
fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/
│   │   ├── infrastructure.yaml
│   │   └── tenants.yaml
│   └── staging/
│       └── ...
├── infrastructure/
│   ├── cert-manager/
│   ├── ingress-nginx/
│   └── prometheus/
└── tenants/
    ├── team-a/
    │   ├── namespace.yaml
    │   └── app.yaml
    └── team-b/
        ├── namespace.yaml
        └── app.yaml
tenants.yaml:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenants
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./tenants
  prune: true
2. Image Automation
Image repository scanner:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: myapp
  namespace: flux-system
spec:
  image: registry.example.com/myapp
  interval: 1m
Image policy:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: myapp
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: myapp
  policy:
    semver:
      range: '>=1.0.0 <2.0.0'
  # Or numerical
  # policy:
  #   numerical:
  #     order: asc
  # Or alphabetical
  # policy:
  #   alphabetical:
  #     order: asc
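To illustrate what the semver range selects, the sketch below picks the highest stable 1.x.y tag from a list, which is how the range `>=1.0.0 <2.0.0` is usually understood. It is a plain-shell approximation rather than Flux's own resolver: it handles only `MAJOR.MINOR.PATCH` tags with an optional `v` prefix and skips pre-releases. `latest_1x` and the tag list are made up for the example.

```shell
# Read one tag per line on stdin and print the highest stable 1.x.y version.
latest_1x() {
  sed 's/^v//' \
    | grep -E '^1\.[0-9]+\.[0-9]+$' \
    | sort -t . -k 1,1n -k 2,2n -k 3,3n \
    | tail -n 1
}

# 1.10.0 wins over 1.2.3 because each field is compared numerically;
# 2.0.0 is outside the range and the release candidate is skipped.
printf '%s\n' v0.9.0 v1.2.3 v1.10.0 v2.0.0 v1.11.0-rc.1 | latest_1x
```

The numeric field comparison is the point of the example: a plain alphabetical sort would rank 1.2.3 above 1.10.0.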
Image update automation:
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        email: <bot-email>
        name: fluxcdbot
      messageTemplate: |
        Update {{range .Updated.Images}}{{println .}}{{end}}
    push:
      branch: main
  update:
    path: ./k8s/overlays/production
    strategy: Setters
Deployment with image marker:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v1.2.3 # {"$imagepolicy": "flux-system:myapp"}
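Conceptually, the Setters strategy rewrites only the lines that carry a marker for the matching policy and leaves everything else alone. The sketch below imitates that behaviour with awk for a single `image:` line. It is an illustration of the idea, not how Flux implements it, and `set_marked_image` is a hypothetical helper.

```shell
# Rewrite the image on lines marked for the given policy; pass others through.
# Arguments: policy as namespace:name, new image reference. Reads YAML on stdin.
set_marked_image() {
  awk -v policy="$1" -v image="$2" '
    # Only touch lines whose marker names this policy.
    index($0, "\"$imagepolicy\": \"" policy "\"") {
      sub(/image: [^ ]+/, "image: " image)
    }
    { print }
  '
}

printf '%s\n' \
  '- name: myapp' \
  '  image: registry.example.com/myapp:v1.2.3 # {"$imagepolicy": "flux-system:myapp"}' \
  '- name: sidecar' \
  '  image: registry.example.com/sidecar:v0.4.0' \
  | set_marked_image flux-system:myapp registry.example.com/myapp:v1.3.0
```

Only the `myapp` image changes in the output; the unmarked sidecar image is untouched, which is why every image that should be automated needs its own marker comment.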
3. Progressive Delivery with Flagger
Install Flagger:
flux create source helm flagger \
--url https://flagger.app \
--namespace flux-system
flux create helmrelease flagger \
--source HelmRepository/flagger \
--chart flagger \
--namespace flux-system
Canary deployment:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        url: http://loadtester/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://myapp/"
Rollback Strategies
1. Git Revert
Same as with ArgoCD: revert the commit in Git and let Flux reconcile the change.
2. Suspend Reconciliation
# Suspend Kustomization
flux suspend kustomization myapp
# Make manual changes or fix in Git
# Resume
flux resume kustomization myapp
3. Flagger Automated Rollback
Flagger automatically rolls back on:
- Failed metrics
- Failed webhooks
- Analysis threshold exceeded
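spec:
  analysis:
    threshold: 5 # Rollback after 5 failed checks
    metrics:
      - name: error-rate
        # Not a built-in metric (those are request-success-rate and
        # request-duration), so it needs a MetricTemplate; name assumed here
        templateRef:
          name: error-rate
        thresholdRange:
          max: 5 # Rollback if >5% errors
        interval: 1m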
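Flagger ships only `request-success-rate` and `request-duration` as built-in metrics, so a custom `error-rate` metric is normally backed by a MetricTemplate that the Canary references through `templateRef`. The sketch below writes one possible template. The Prometheus address and the `http_requests_total` metric follow the earlier examples in this guide, while the file name and the `namespace` label on the metric are assumptions about how the application is instrumented.

```shell
# Write a MetricTemplate that reports the 5xx share of requests as a percentage.
# The quoted 'EOF' keeps the shell from touching Flagger's {{ }} placeholders.
cat > error-rate-metric.yaml <<'EOF'
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate
  namespace: production
spec:
  provider:
    type: prometheus
    address: http://prometheus:9090
  query: |
    100 * sum(rate(http_requests_total{namespace="{{ namespace }}", status=~"5.."}[{{ interval }}]))
    /
    sum(rate(http_requests_total{namespace="{{ namespace }}"}[{{ interval }}]))
EOF

cat error-rate-metric.yaml
```

The query multiplies by 100 so that the result lines up with a `thresholdRange` of `max: 5` meaning five percent.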
ArgoCD vs Flux
Comparison
| Feature | ArgoCD | Flux |
|---|---|---|
| UI | Web UI included | No native UI (can use Weave GitOps) |
| Architecture | Centralized | Distributed (agent per cluster) |
| Sync | Pull-based | Pull-based |
| RBAC | Built-in | Kubernetes RBAC |
| Helm | Native support | HelmRelease CRD |
| Kustomize | Native support | Kustomization CRD |
| Progressive delivery | Argo Rollouts | Flagger |
| Image automation | External | Built-in |
| Multi-tenancy | Projects | Namespace isolation |
| Secret management | External Secrets Operator | SOPS, Sealed Secrets |
When to Choose ArgoCD
Use ArgoCD if you need:
- Web UI for visualization
- Built-in RBAC
- Centralized management
- Application grouping (projects)
- Team prefers UI-driven workflows
When to Choose Flux
Use Flux if you need:
- Kubernetes-native approach
- Image automation built-in
- Multi-cluster from single repo
- GitOps for infrastructure and apps
- Team prefers CLI/GitOps-only
Can Use Both
Hybrid approach:
- Flux for infrastructure
- ArgoCD for applications
- Both read from same Git repo
Best Practices
1. Repository Structure
Monorepo:
gitops-repo/
├── clusters/
│   ├── production/
│   └── staging/
├── apps/
│   ├── frontend/
│   └── backend/
└── infrastructure/
    ├── ingress/
    └── monitoring/
Repo per team:
team-a-apps/
├── app-1/
└── app-2/
team-b-apps/
├── app-3/
└── app-4/
infrastructure/
├── cert-manager/
└── prometheus/
2. Secrets Management
Sealed Secrets:
# Install sealed secrets
kubectl apply -f \
https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.24.0/controller.yaml
# Create sealed secret
echo -n mypassword | kubectl create secret generic mysecret \
--dry-run=client \
--from-file=password=/dev/stdin \
-o yaml | \
kubeseal -o yaml > sealed-secret.yaml
# Commit sealed-secret.yaml to Git
SOPS:
# Install SOPS
brew install sops
# Encrypt secret
sops --encrypt --age <age-public-key> secret.yaml > secret.enc.yaml
# Flux decryption
kubectl create secret generic sops-age \
--from-file=age.agekey=./age.agekey \
--namespace flux-system
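Creating the `sops-age` Secret is only half of the setup: the Flux Kustomization that applies the encrypted files also has to be told to decrypt them. The sketch below writes a Kustomization with a `decryption` block. It reuses the `myapp` names and path from the earlier Kustomization example; the file name is an assumption.

```shell
# Write a Kustomization that decrypts SOPS-encrypted manifests before applying them.
cat > myapp-kustomization.yaml <<'EOF'
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 5m
  path: ./k8s/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: myapp
  decryption:
    provider: sops
    secretRef:
      name: sops-age # Secret holding age.agekey, created above
EOF

cat myapp-kustomization.yaml
```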
External Secrets Operator:
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
  data:
    - secretKey: password
      remoteRef:
        key: myapp/password
3. Notifications
ArgoCD notifications:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token
  trigger.on-sync-succeeded: |
    # A trigger needs a condition that decides when it fires
    - when: app.status.operationState.phase in ['Succeeded']
      send: [app-deployed]
  template.app-deployed: |
    message: |
      Application {{.app.metadata.name}} deployed successfully
    slack:
      attachments: |
        [{
          "title": "{{.app.metadata.name}}",
          "color": "good"
        }]
Flux notifications:
# v1beta3 is served by recent Flux releases; older clusters use v1beta1 or v1beta2
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: deployments
  secretRef:
    name: slack-url
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: on-deploy
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: '*'
4. Health Checks
ArgoCD (ignoring fields managed by other controllers, so an HPA does not leave the app out of sync):
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas # Ignore HPA changes
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true
Flux health check:
spec:
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: myapp
      namespace: production
  timeout: 5m
5. Resource Limits
Limit what a sync is allowed to create, delete, or recreate:
# ArgoCD Application
spec:
  syncPolicy:
    syncOptions:
      - CreateNamespace=false # Don't auto-create namespaces
      - PruneLast=true # Prune only after other resources are synced and healthy
---
# Flux Kustomization
spec:
  prune: true
  force: false # Don't force recreate
  timeout: 5m
Troubleshooting
ArgoCD Issues
App out of sync:
# Check diff
argocd app diff myapp
# Hard refresh
argocd app get myapp --hard-refresh
# Force sync (uses a force apply, which can delete and recreate resources)
argocd app sync myapp --force
Sync failed:
# Check logs (the application controller runs as a StatefulSet in current releases)
kubectl logs -n argocd statefulset/argocd-application-controller
# Check app status
argocd app get myapp
# Manual sync with prune
argocd app sync myapp --prune
Flux Issues
Reconciliation stuck:
# Check status
flux get kustomizations
# Check events
kubectl describe kustomization myapp -n flux-system
# Force reconciliation
flux reconcile kustomization myapp --with-source
Source not updating:
# Check GitRepository
flux get sources git
# Force update
flux reconcile source git myapp
Conclusion
GitOps with ArgoCD or Flux provides:
- Declarative deployments - Infrastructure as code
- Version control - Full audit trail
- Automated sync - Git push triggers deployment
- Easy rollback - Git revert = infrastructure revert
- Security - Git-based access control
Key takeaways:
- Choose ArgoCD for UI and centralized management
- Choose Flux for Kubernetes-native GitOps
- Use progressive delivery for safer rollouts
- Encrypt secrets properly
- Structure repositories for your workflow
- Implement notifications for visibility
- Test rollback procedures
GitOps transforms deployment into a simple git push, making infrastructure and applications easier to manage, audit, and recover.