Level 2 Technical Implementation Documentation

08b. Workload Migration Patterns

Audience: Migration Architects, Application Teams, DevOps Engineers
Prerequisites: Understanding of containerization, CI/CD pipelines, and the "6 Rs" migration strategies

This section provides detailed technical patterns for migrating workloads from US hyperscale cloud providers to sovereign infrastructure, including decision frameworks, implementation guides, and rollback procedures.

The 6 Rs Migration Framework

Each workload must be assessed against six potential migration strategies. The choice depends on sovereignty requirements, technical debt, and business criticality.

Rehost (Lift & Shift)

Low Complexity

Move applications without modification to sovereign infrastructure.

When to Use:

  • Time-critical migrations
  • Applications with minimal cloud-native dependencies
  • VM-based workloads

Sovereignty Considerations:

  • Network endpoints must be updated
  • DNS and certificates require re-issuance
  • Data must be migrated with encryption in transit

Replatform (Lift, Tinker & Shift)

Medium Complexity

Make targeted optimizations during migration without changing core architecture.

When to Use:

  • Applications using managed services with open-source equivalents
  • Container-ready applications
  • Moderate modernization budget

Common Changes:

  • RDS → PostgreSQL on Kubernetes
  • S3 → MinIO
  • ECS/EKS → Native Kubernetes

Refactor (Re-architect)

High Complexity

Significantly modify applications to leverage sovereign-native capabilities.

When to Use:

  • Deep AWS/Azure/GCP service dependencies
  • Applications requiring modernization anyway
  • Strategic systems with long lifespan

Typical Scope:

  • Replace Lambda → Knative/OpenFaaS
  • Replace DynamoDB → CockroachDB/PostgreSQL
  • Replace Cognito → Keycloak

Repurchase (Drop & Shop)

Medium Complexity

Replace with a different product, typically a sovereign SaaS equivalent.

When to Use:

  • Commercial off-the-shelf software
  • End-of-life applications
  • SaaS with European alternatives

Examples:

  • Salesforce → Odoo (self-hosted)
  • Microsoft 365 → Nextcloud + OnlyOffice
  • Slack → Mattermost/Element

Retire

Low Complexity

Decommission applications that are no longer needed.

When to Use:

  • Redundant applications
  • Legacy systems with no active users
  • Technical debt candidates

Requirements:

  • Data archival for compliance
  • Audit trail preservation
  • Stakeholder sign-off

Retain

Deferred

Keep in current location temporarily due to constraints.

When to Use:

  • Deep technical dependencies requiring a lengthy refactor
  • Contractual obligations
  • Pending replacement projects

Requirements:

  • Risk assessment documentation
  • Migration timeline commitment
  • Enhanced monitoring

Migration Decision Tree

Workload Assessment Questions

  1. Does the workload handle Tier 1 sensitive data?
    • Yes → Priority migration, consider Refactor for maximum control
    • No → Continue assessment
  2. Does the workload use proprietary managed services (Lambda, DynamoDB, Cosmos DB)?
    • Heavily → Refactor or Repurchase
    • Lightly → Replatform
    • None → Rehost
  3. Is the application container-ready?
    • Already containerized → Rehost to sovereign Kubernetes
    • VM-based but stateless → Replatform to containers
    • VM-based with state → Replatform with external state store
  4. What is the remaining application lifespan?
    • < 2 years → Rehost (minimize investment)
    • 2-5 years → Replatform
    • > 5 years → Consider Refactor for long-term benefits
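
The four questions above can be sketched as a small function. This is an illustrative sketch only: the `Workload` fields and the `assess` helper are hypothetical names, and because the text does not say how to reconcile conflicting answers, the sketch simply lists each question's suggestion in order rather than picking a winner.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    tier1_data: bool        # Q1: handles Tier 1 sensitive data?
    proprietary_use: str    # Q2: "heavy", "light", or "none"
    packaging: str          # Q3: "container", "vm_stateless", or "vm_stateful"
    lifespan_years: float   # Q4: expected remaining lifespan


def assess(w: Workload) -> dict:
    """Answer the four questions in order and collect each suggestion."""
    q2 = {"heavy": "Refactor or Repurchase", "light": "Replatform", "none": "Rehost"}
    q3 = {
        "container": "Rehost to sovereign Kubernetes",
        "vm_stateless": "Replatform to containers",
        "vm_stateful": "Replatform with external state store",
    }
    # Boundary handling (exactly 2 or 5 years) is an assumption of this sketch.
    if w.lifespan_years < 2:
        q4 = "Rehost (minimize investment)"
    elif w.lifespan_years <= 5:
        q4 = "Replatform"
    else:
        q4 = "Consider Refactor"
    suggestions = [q2[w.proprietary_use], q3[w.packaging], q4]
    if w.tier1_data:
        # Q1 "Yes" marks the workload as a priority and adds the Refactor hint.
        suggestions.insert(0, "Consider Refactor for maximum control")
    return {"priority": w.tier1_data, "suggestions": suggestions}


result = assess(Workload(False, "light", "vm_stateless", 3))
print(result)
```

When the suggestions disagree, the final choice remains a judgment call for the migration architect.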

Pattern: AWS to Sovereign Migration

Service Mapping

| AWS Service | Sovereign Equivalent | Migration Complexity | Notes |
|---|---|---|---|
| EC2 | Hetzner Cloud / OVH Compute | Low | Direct VM migration |
| EKS | Vanilla Kubernetes / k3s | Low | Manifests portable, check IAM |
| RDS PostgreSQL | PostgreSQL on Kubernetes | Low | pg_dump/pg_restore |
| RDS Aurora | PostgreSQL + PgBouncer | Medium | Aurora-specific features need workarounds |
| S3 | MinIO | Low | API-compatible, use rclone |
| Lambda | Knative / OpenFaaS | High | Requires code refactoring |
| DynamoDB | ScyllaDB / CockroachDB | High | Schema and query redesign |
| SQS | RabbitMQ / Apache Kafka | Medium | Different semantics |
| Cognito | Keycloak | Medium | User export, reconfigure clients |
| CloudWatch | Prometheus + Grafana | Medium | Dashboard recreation needed |
| Secrets Manager | OpenBao | Low | Export/import secrets |
| Route 53 | CoreDNS / PowerDNS | Low | Zone file export |
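
The mapping can also be used programmatically when triaging a dependency inventory. The sketch below copies the complexity ratings from the table; the `summarize` helper and the sample inventory (including the unmapped "Kinesis" entry) are hypothetical and only illustrate the idea of flagging the highest-rated dependency and anything the table does not cover.

```python
# Complexity ratings copied from the service mapping table.
COMPLEXITY = {
    "EC2": "Low", "EKS": "Low", "RDS PostgreSQL": "Low", "RDS Aurora": "Medium",
    "S3": "Low", "Lambda": "High", "DynamoDB": "High", "SQS": "Medium",
    "Cognito": "Medium", "CloudWatch": "Medium", "Secrets Manager": "Low",
    "Route 53": "Low",
}
RANK = {"Low": 0, "Medium": 1, "High": 2}


def summarize(dependencies):
    """Return (highest complexity, unmapped services) for a dependency list."""
    unmapped = sorted(d for d in dependencies if d not in COMPLEXITY)
    rated = [COMPLEXITY[d] for d in dependencies if d in COMPLEXITY]
    highest = max(rated, key=RANK.__getitem__) if rated else None
    return highest, unmapped


print(summarize(["EC2", "S3", "SQS", "Kinesis"]))
```

Unmapped services are worth reviewing first, since they have no listed sovereign equivalent.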

Database Migration Procedure

# PostgreSQL Migration from RDS to Sovereign Infrastructure

# 1. Take a safety snapshot of the source (fallback copy; the pg_dump in step 2 reads its own consistent snapshot)
aws rds create-db-snapshot \
  --db-instance-identifier source-db \
  --db-snapshot-identifier migration-snapshot

# 2. Export data using pg_dump (run from bastion with access to both)
pg_dump \
  --host=source-db.xxx.eu-west-2.rds.amazonaws.com \
  --username=admin \
  --format=custom \
  --file=/tmp/migration.dump \
  --verbose \
  source_database

# 3. Verify dump integrity
pg_restore --list /tmp/migration.dump | head -50

# 4. Transfer to sovereign infrastructure (encrypted)
# Using rclone with client-side encryption
# Assumes sovereign-minio-encrypted is an rclone crypt remote layered over
# sovereign-minio:migrations, so files are encrypted before upload
rclone copy /tmp/migration.dump sovereign-minio-encrypted:

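# 5. Restore on sovereign PostgreSQL
# Assumes the decrypted dump is available inside the pod at /migrations
# (for example via a mounted volume or kubectl cp)
kubectl exec -it postgres-primary-0 -- pg_restore \
  --username=admin \
  --dbname=target_database \
  --verbose \
  --clean \
  --if-exists \
  /migrations/migration.dump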

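# 6. Verify row counts match (name the database explicitly; psql otherwise defaults to one named after the user)
psql -h source-db -U admin -d source_database -c "SELECT COUNT(*) FROM critical_table;"
psql -h sovereign-db -U admin -d target_database -c "SELECT COUNT(*) FROM critical_table;"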

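# 7. Update application connection strings
# An inline value is stored in plain text in the Deployment spec; in practice,
# source the credentials from a Kubernetes Secret
kubectl set env deployment/app \
  DATABASE_URL="postgresql://admin:xxx@sovereign-db:5432/target_database"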
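
Checking a single table by hand does not scale to a full schema. As a rough sketch, the comparison in step 6 could be generalized as below, assuming the per-table counts have already been collected from each side (for instance by running SELECT COUNT(*) per table). The `compare_counts` helper and the `audit_log` table name are hypothetical.

```python
def compare_counts(source: dict, target: dict) -> dict:
    """Return {table: (source_count, target_count)} for every mismatch.

    A table missing on one side is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(source) | set(target)):
        s, t = source.get(table), target.get(table)
        if s != t:
            mismatches[table] = (s, t)
    return mismatches


# Illustrative counts only; real values come from the two databases.
source_counts = {"critical_table": 1200, "audit_log": 50000}
target_counts = {"critical_table": 1200, "audit_log": 49998}
print(compare_counts(source_counts, target_counts))
```

An empty result means every table matched. Row counts are a coarse check, so checksums on critical tables may still be warranted.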

Pattern: Blue-Green Migration

For critical workloads requiring zero-downtime migration with instant rollback capability.

Principle: Both US cloud and sovereign environments run simultaneously. Traffic is gradually shifted using weighted DNS or load balancer rules. Full rollback is always one configuration change away.

Implementation Steps

# Phase 1: Deploy to Sovereign (Green environment)
# Application deployed but receiving no traffic

kubectl apply -f sovereign-deployment.yaml
kubectl wait --for=condition=available deployment/app-green --timeout=300s

# Phase 2: Smoke tests on Green
# --restart=Never runs a one-off pod; -f makes curl exit non-zero on HTTP errors
kubectl run smoke-test --image=curlimages/curl --restart=Never --rm -it -- \
  curl -fsS http://app-green-internal:8080/health

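# Phase 3: Canary traffic (10%)
# Using Istio VirtualService (save the manifest below and apply it with kubectl apply -f)
# Routing to app-blue assumes the US cloud endpoint is known to the mesh,
# for example through a ServiceEntry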
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: app-routing
spec:
  hosts:
    - app.service.gov.uk
  http:
    - route:
        - destination:
            host: app-blue  # US cloud
          weight: 90
        - destination:
            host: app-green  # Sovereign
          weight: 10

# Phase 4: Monitor error rates
# Prometheus query for the Green 5xx error ratio (repeat with env="blue" for the comparison baseline)
sum(rate(http_requests_total{status=~"5..",env="green"}[5m])) /
sum(rate(http_requests_total{env="green"}[5m]))

# Phase 5: Gradual increase (10% -> 50% -> 90% -> 100%)
# Phase 6: Decommission Blue (US cloud) after stability period
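
The step progression in Phases 3 to 5 can be sketched as a small helper that decides the next Green weight from the observed error ratio. This is a minimal sketch under stated assumptions: the `next_green_weight` and `weights` helpers are hypothetical, and `max_error_rate` is a placeholder threshold that each team would set for itself.

```python
STEPS = [10, 50, 90, 100]  # Green weights from Phase 3 and Phase 5


def next_green_weight(current: int, green_error_rate: float, max_error_rate: float) -> int:
    """Advance one step if Green is healthy, otherwise send all traffic to Blue."""
    if green_error_rate > max_error_rate:
        return 0  # rollback: Blue takes 100% of traffic
    for step in STEPS:
        if step > current:
            return step
    return 100  # already fully shifted


def weights(green: int) -> dict:
    """Blue/Green weights as they would appear in the VirtualService route."""
    return {"app-blue": 100 - green, "app-green": green}


print(weights(next_green_weight(10, green_error_rate=0.001, max_error_rate=0.01)))
```

Returning 0 on a breach reflects the principle that full rollback is one configuration change away; a more cautious policy could instead hold at the current weight while the issue is investigated.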

Pre-Migration Checklist

Technical Readiness

  • Inventory all cloud service dependencies
  • Map proprietary services to sovereign equivalents
  • Identify hard dependencies requiring refactoring
  • Document all IAM roles and permissions
  • Export infrastructure as code (if it exists)
  • Identify data classification for all data stores

Data Migration Readiness

  • Calculate total data volume to migrate
  • Estimate bandwidth requirements and transfer time
  • Plan for data synchronization during cutover window
  • Verify encryption in transit (TLS 1.3)
  • Confirm no data routes through US territory
  • Test restore procedures from backups
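
For the volume and bandwidth items above, a back-of-the-envelope estimate is often enough to size the cutover window. The sketch below assumes decimal units (1 GB = 8000 megabits) and a hypothetical 70% link efficiency to account for protocol overhead and contention; both the `transfer_hours` helper and the sample figures are illustrative, not measured values.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Hours to move data_gb over a link_mbps link at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Example: 2 TB over a 1 Gbit/s link at 70% efficiency.
print(round(transfer_hours(2000, 1000), 2))  # 6.35
```

If the estimate exceeds the cutover window, an initial bulk copy followed by incremental synchronization is the usual alternative.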

Operational Readiness

  • Runbooks updated for sovereign infrastructure
  • On-call procedures documented
  • Monitoring dashboards configured
  • Alert thresholds defined
  • Rollback procedures tested
  • Communication plan for stakeholders

Rollback Procedures

Critical: Every migration must have a tested rollback plan. If rollback is not possible (e.g., schema changes), the migration must include a longer parallel-run period.

Rollback Decision Criteria

| Metric | Threshold | Action |
|---|---|---|
| Error rate increase | > 1% above baseline | Halt traffic shift, investigate |
| Latency increase | > 50% P99 increase | Halt traffic shift, investigate |
| Data integrity errors | Any occurrence | Immediate rollback |
| Security incidents | Any occurrence | Immediate rollback, incident response |
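
The criteria above can be encoded so that the decision is applied consistently during a traffic shift. This is a sketch under assumptions: the `rollback_action` helper is hypothetical, "> 1% above baseline" is read as one percentage point of absolute increase, and the "Continue" result for a healthy state is added for illustration.

```python
def rollback_action(error_rate, baseline_error_rate, p99_ms, baseline_p99_ms,
                    integrity_errors=0, security_incidents=0):
    """Map observed metrics to the action in the rollback criteria table."""
    if security_incidents > 0:
        return "Immediate rollback, incident response"
    if integrity_errors > 0:
        return "Immediate rollback"
    # Assumption: "> 1% above baseline" means one percentage point (0.01).
    if error_rate - baseline_error_rate > 0.01:
        return "Halt traffic shift, investigate"
    # P99 latency more than 50% above its baseline.
    if p99_ms > baseline_p99_ms * 1.5:
        return "Halt traffic shift, investigate"
    return "Continue"


print(rollback_action(0.03, 0.001, p99_ms=200, baseline_p99_ms=180))
```

Security and integrity checks come first so that a hard rollback condition is never masked by a softer "halt and investigate" result.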
