MIGRATION PLANNING

Migration Strategy & Rehosting Approaches

Systematic approach to migrating workloads from US cloud providers (AWS, Azure, GCP, OCI) to sovereign infrastructure. Covers rehosting strategies, data migration, cutover procedures, and rollback capabilities.

Migration must maintain service continuity: no citizen-facing service should experience an extended outage during migration. Blue-green deployment and instant rollback capability are mandatory.

Migration Approach Options

Replatform

Minor optimisations during migration (e.g., managed DB → self-managed PostgreSQL).

Moderate effort, some performance/cost benefits.

Timeline: Weeks per workload

Refactor/Rearchitect

Significant changes to leverage new platform. Not recommended for initial migration due to time constraints.

Timeline: Months per workload

Emergency Mobilisation Recommendation

Rehost first, optimise later. The primary goal is achieving sovereignty, not optimisation. Once workloads are on sovereign infrastructure, they can be refactored incrementally. The risk of continued US cloud dependency far outweighs any performance or cost inefficiency from lift-and-shift.


Migration Strategy by Workload Type

Virtual Machine Workloads

EC2, Azure VMs, GCE, OCI Compute

Approach: Image-Based Rehost

[US Cloud VM] → [Export Image] → [Secure Transfer] → [Import to CloudStack] → [Validation] → [DNS Cutover]
  1. Inventory: Document all VMs, configurations, networking, storage attachments
  2. Image Export: Create machine images/snapshots (AMI, VHD, etc.)
  3. Image Conversion: Convert to KVM-compatible format (qcow2) if needed
  4. Secure Transfer: Transfer images via encrypted channel to sovereign storage
  5. Import: Register images in CloudStack, configure networking
  6. Validation: Boot VMs, validate functionality in isolated network
  7. DNS Cutover: Update DNS to point to new infrastructure
  8. Monitoring: Monitor for issues, maintain rollback capability for 72 hours
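
The image conversion in step 3 can be sketched as a small helper that builds the qemu-img command line for an exported image. This is a minimal sketch, not a definitive tool: the function name and the example paths are hypothetical, and it assumes the export produced a VHD, VMDK, or raw file identified by its extension.

```python
from pathlib import PurePosixPath

# Map exported image extensions to qemu-img input format names.
# "vpc" is the name qemu-img uses for the VHD format.
QEMU_INPUT_FORMATS = {
    ".vhd": "vpc",
    ".vmdk": "vmdk",
    ".raw": "raw",
}


def build_convert_command(source_image: str) -> list[str]:
    """Return the qemu-img argv that converts source_image to qcow2.

    Returns an empty list when the image is already qcow2 (no conversion needed).
    """
    path = PurePosixPath(source_image)
    suffix = path.suffix.lower()
    if suffix == ".qcow2":
        return []
    if suffix not in QEMU_INPUT_FORMATS:
        raise ValueError(f"unsupported image format: {suffix or '(none)'}")
    target = path.with_suffix(".qcow2")
    return [
        "qemu-img", "convert",
        "-f", QEMU_INPUT_FORMATS[suffix],  # input format
        "-O", "qcow2",                     # output format
        str(path), str(target),
    ]
```

The returned list can be passed to subprocess.run on a host with qemu-img installed. Building the command separately from running it keeps the format mapping easy to review and test.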

Pre-Migration Checklist

  • Document all security groups/firewall rules
  • Identify instance metadata dependencies (the 169.254.169.254 endpoint used by both AWS and Azure IMDS)
  • Check for cloud-specific agents (CloudWatch agent, Azure agent)
  • Map elastic IPs/public IPs
  • Document auto-scaling configurations
  • Backup all data volumes separately

Key Risk: Cloud-Specific Dependencies

VMs may depend on cloud-specific services (instance metadata, cloud-init, managed DNS). These must be identified and replicated or replaced on sovereign infrastructure.
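
One way to find metadata dependencies is to scan configuration files and scripts for references to the metadata endpoint. The sketch below assumes the file contents have already been read into a dictionary keyed by path. The function name and the example paths are illustrative, and a text match will not catch addresses assembled at runtime.

```python
import re

# 169.254.169.254 is the link-local metadata address used by AWS, Azure and GCP.
# metadata.google.internal is the GCP metadata hostname.
METADATA_PATTERN = re.compile(r"169\.254\.169\.254|metadata\.google\.internal")


def find_metadata_references(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for every line mentioning a metadata endpoint."""
    hits = []
    for path, content in sorted(files.items()):
        for lineno, line in enumerate(content.splitlines(), start=1):
            if METADATA_PATTERN.search(line):
                hits.append((path, lineno, line.strip()))
    return hits
```

Each hit marks a place where the workload expects a service that must be replicated or replaced before cutover.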

Containerised Workloads (Kubernetes)

EKS, AKS, GKE, OKE

Approach: Manifest Migration + Data Replication

[Export Manifests] → [Container Registry Sync] → [Deploy to Sovereign K8s] → [Data Migration] → [Traffic Shift]
  1. Export: Export all Kubernetes manifests (Deployments, Services, ConfigMaps, Secrets)
  2. Registry Sync: Mirror container images to sovereign registry (Harbor)
  3. Cluster Setup: Deploy equivalent K8s cluster on CloudStack
  4. Manifest Adjustment: Update storage classes, load balancer configs, ingress
  5. Data Migration: Migrate PersistentVolumes and databases
  6. Parallel Deploy: Deploy workloads to sovereign cluster
  7. Traffic Shift: Gradual traffic migration via DNS or ingress weights
  8. Validation: Full functional testing before full cutover

Advantage: Kubernetes Portability

Containerised workloads are inherently more portable. Standard Kubernetes manifests work across any conformant cluster. Main changes: storage class names, cloud-specific load balancer annotations, and any managed service integrations (RDS, managed Redis, etc.).

Cloud-Specific Components to Replace

  • Ingress: AWS ALB Ingress → nginx-ingress or Traefik
  • Storage: EBS CSI → CloudStack CSI or Ceph RBD
  • Secrets: AWS Secrets Manager → OpenBao
  • Service Mesh: App Mesh/Anthos → Istio/Linkerd
  • DNS: Route53 external-dns → CoreDNS + sovereign DNS
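
The manifest adjustments can be illustrated with a short sketch that operates on manifests already parsed into dictionaries. It is an assumption-laden example: the target storage class name sovereign-block is hypothetical, only PersistentVolumeClaim storage classes and AWS load balancer annotations are handled, and a real migration would also need to cover StatefulSet volume claim templates and ingress resources.

```python
import copy

# Assumed mapping from AWS storage classes to a hypothetical sovereign class name.
STORAGE_CLASS_MAP = {"gp2": "sovereign-block", "gp3": "sovereign-block"}
# Annotation prefix used by the AWS load balancer integrations.
CLOUD_ANNOTATION_PREFIXES = ("service.beta.kubernetes.io/aws-load-balancer-",)


def adjust_manifest(manifest: dict) -> dict:
    """Return a copy with storage classes remapped and AWS LB annotations removed."""
    result = copy.deepcopy(manifest)  # never mutate the exported original

    annotations = result.get("metadata", {}).get("annotations")
    if annotations:
        result["metadata"]["annotations"] = {
            key: value
            for key, value in annotations.items()
            if not key.startswith(CLOUD_ANNOTATION_PREFIXES)
        }

    if result.get("kind") == "PersistentVolumeClaim":
        spec = result.setdefault("spec", {})
        old_class = spec.get("storageClassName")
        if old_class in STORAGE_CLASS_MAP:
            spec["storageClassName"] = STORAGE_CLASS_MAP[old_class]

    return result
```

Working on a deep copy keeps the exported manifests intact, which helps if the original cluster has to be redeployed during rollback.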

Database Workloads

RDS, Azure SQL, Cloud SQL, OCI DB

Approach: Logical Replication + Cutover

[US Managed DB] → [Logical Replication] → [Sovereign PostgreSQL/MySQL] → [Sync Verification] → [App Cutover]
  1. Setup Replica: Deploy PostgreSQL/MySQL on sovereign infrastructure
  2. Enable Replication: Configure logical replication from source (AWS DMS or native)
  3. Sync: Wait for replica to catch up with source
  4. Test: Validate data integrity and schema consistency
  5. Freeze Writes: Brief write freeze on source (seconds to minutes)
  6. Final Sync: Ensure final transactions replicated
  7. App Cutover: Point applications to new database
  8. Verify: Confirm read/write operations working
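
A coarse version of the sync verification in steps 4 and 6 is to compare per-table row counts between source and replica. The sketch below assumes the counts have already been collected from each database into dictionaries, and the function name is illustrative. Matching counts do not prove the contents are identical, so per-table checksums would be a stronger check.

```python
def find_sync_mismatches(
    source_counts: dict[str, int],
    target_counts: dict[str, int],
) -> dict[str, tuple]:
    """Return {table: (source_count, target_count)} for tables that differ.

    A table missing on one side is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table)
        dst = target_counts.get(table)
        if src != dst:
            mismatches[table] = (src, dst)
    return mismatches
```

An empty result after the write freeze is one signal that the final sync has completed and the application cutover can proceed.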

Key Challenge: Managed Service Features

Managed databases include features that must be replicated manually:

  • Automated backups → Configure pgBackRest or equivalent
  • Multi-AZ failover → Configure Patroni or replication-manager
  • Read replicas → Configure native replication
  • Parameter groups → Apply equivalent PostgreSQL/MySQL settings
  • Monitoring → Configure Prometheus PostgreSQL exporter

Large Database Consideration: For multi-terabyte databases, initial sync may take days. Plan for extended replication period. Consider physical backup/restore for initial seed, then logical replication for ongoing sync.

Object Storage Data

S3, Azure Blob, GCS, OCI Object Storage

Approach: Parallel Sync + Application Update

[S3 Bucket] → [rclone/s5cmd sync] → [MinIO Bucket] → [App Config Update] → [Verify Access]
  1. Inventory: List all buckets, sizes, access patterns
  2. Setup MinIO: Create equivalent bucket structure on MinIO
  3. Initial Sync: Use rclone or s5cmd for high-speed transfer
  4. Continuous Sync: Set up ongoing sync to capture new objects
  5. Policy Migration: Recreate bucket policies in MinIO and map IAM identities to Keycloak
  6. Application Update: Update application configs to use MinIO endpoint
  7. Validation: Verify all objects accessible and integrity intact
  8. Final Sync: Final sync before S3 cutoff
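
The integrity check in step 7 can be sketched as a comparison of checksum manifests, where each manifest maps an object key to a SHA-256 digest. This is a minimal sketch under stated assumptions: the function names are illustrative, and how the digests are gathered from each store is left out.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of an object's contents."""
    return hashlib.sha256(data).hexdigest()


def compare_manifests(
    source: dict[str, str],
    target: dict[str, str],
) -> dict[str, list[str]]:
    """Compare {object key: digest} manifests from the source and target stores."""
    missing = sorted(key for key in source if key not in target)
    mismatched = sorted(
        key for key in source if key in target and source[key] != target[key]
    )
    return {"missing": missing, "mismatched": mismatched}
```

Objects listed as missing need another sync pass, while mismatched objects should be re-transferred and investigated before the source bucket is retired.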

MinIO S3 Compatibility

MinIO is compatible with the core S3 API. Applications using standard S3 SDKs typically require only endpoint URL and credential changes, so most need no code modifications. Verify any less common S3 features an application relies on before cutover.

Data Transfer Considerations

  • Bandwidth: Plan for sustained transfer speeds (10Gbps+ for large datasets)
  • Cost: AWS egress charges apply (~$0.09/GB for the first 10TB per month)
  • Time: 100TB at a sustained 1Gbps ≈ 9-10 days
  • Security: Encrypt data in transit (TLS) and verify checksums
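
The time and cost figures above can be reproduced with a small calculator. This sketch assumes decimal units (1TB = 1000GB) and a single flat egress rate. Real pricing is tiered, so the cost function only approximates the first pricing tier, and both function names are illustrative.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_tb terabytes over a link of link_gbps gigabits/s.

    efficiency is the fraction of the link actually achieved (1.0 = full line rate).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400


def flat_egress_cost_usd(size_tb: float, usd_per_gb: float = 0.09) -> float:
    """Egress cost at a single flat rate (an approximation of tiered pricing)."""
    return size_tb * 1000 * usd_per_gb
```

At full line rate, 100TB over 1Gbps works out to roughly 9.3 days, so planning for about 10 days leaves only a small margin for protocol overhead. Passing a lower efficiency value gives a more conservative estimate.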

Serverless/FaaS Workloads

Lambda, Azure Functions, Cloud Functions

Approach: Function Containerisation + OpenFaaS Deploy

[Lambda Function] → [Extract Code/Dependencies] → [Containerise] → [Deploy to OpenFaaS] → [Update Triggers]
  1. Inventory: List all functions, triggers, environment variables
  2. Extract: Download function code and identify dependencies
  3. Containerise: Package as container image with OpenFaaS template
  4. Deploy: Deploy to OpenFaaS on sovereign Kubernetes
  5. Trigger Setup: Configure equivalent triggers (HTTP, queue, schedule)
  6. Environment: Migrate environment variables and secrets to OpenBao
  7. Testing: Comprehensive functional testing
  8. Cutover: Update calling applications to new endpoints
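
For functions with simple request/response behaviour, the containerise step can often reuse the existing handler behind a thin adapter. The sketch below assumes a function template whose entry point receives the request body as a string and returns a string, as the classic OpenFaaS python3 template is understood to do. The lambda_handler shown is a hypothetical stand-in for existing function code.

```python
import json


def lambda_handler(event, context):
    """Stand-in for an existing Lambda function (hypothetical example)."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"hello {name}"})}


def handle(req: str) -> str:
    """Adapter: string request in, string response out."""
    # Treat an empty request body as an empty event.
    event = json.loads(req) if req else {}
    # There is no Lambda context object outside AWS, so pass None.
    response = lambda_handler(event, None)
    return response.get("body", "")
```

This only works for handlers that ignore the context object and make no AWS SDK calls. Functions that do either still need the refactoring described under Cloud-Native Integrations.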

Key Challenge: Cloud-Native Integrations

Lambda functions often integrate tightly with AWS services:

  • API Gateway triggers: → Replace with Kong or Traefik
  • SQS triggers: → Replace with Kafka consumer
  • S3 event triggers: → Replace with MinIO bucket notifications
  • DynamoDB streams: → Replace with Kafka CDC or PostgreSQL triggers
  • AWS SDK calls within function: → Refactor to use sovereign equivalents

Cutover Strategy: Blue-Green Deployment

All critical workloads use blue-green deployment for zero-downtime cutover with instant rollback capability.

[Users] → [DNS/Load Balancer] → 100% [Blue: US Cloud] / 0% [Green: Sovereign]

↓ Gradual shift

[Users] → [DNS/Load Balancer] → 50% [Blue: US Cloud] / 50% [Green: Sovereign]

↓ Validation successful

[Users] → [DNS/Load Balancer] → 0% [Blue: US Cloud] / 100% [Green: Sovereign]

Cutover Phases

T-72 hours

Pre-Cutover Validation

  • Full functional testing on sovereign infrastructure
  • Performance benchmarking against production baseline
  • Security scanning and penetration testing
  • Disaster recovery testing (simulate rollback)

T-24 hours

Go/No-Go Decision

  • Final data sync verification
  • All teams confirm readiness
  • Communication sent to stakeholders
  • Rollback procedures confirmed

T-0: Cutover

Traffic Migration

  • 10% traffic shift: Initial canary (monitor for 30 minutes)
  • 50% traffic shift: Broader validation (monitor for 1 hour)
  • 90% traffic shift: Near-complete migration (monitor for 2 hours)
  • 100% traffic shift: Full cutover
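
The staged shift can be expressed as a small decision function that either advances to the next stage or rolls back. This is a sketch under stated assumptions: the 1% error-rate threshold and the function name are illustrative and not part of the plan, and in practice the decision would draw on more signals than a single error rate.

```python
# (percentage of traffic on sovereign infrastructure, minutes to monitor)
STAGES = [(10, 30), (50, 60), (90, 120), (100, 0)]
ROLLBACK_WEIGHT = 0  # all traffic back on the original environment


def next_weight(current_weight: int, error_rate: float, max_error_rate: float = 0.01) -> int:
    """Return the next sovereign traffic percentage, or 0 to roll back."""
    if error_rate > max_error_rate:
        return ROLLBACK_WEIGHT
    weights = [weight for weight, _ in STAGES]
    if current_weight == 0:
        return weights[0]  # start the canary
    index = weights.index(current_weight)  # raises ValueError for unknown stages
    return weights[min(index + 1, len(weights) - 1)]
```

With the monitoring windows listed above, an uninterrupted cutover spends 210 minutes in observation before reaching 100%.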

T+72 hours

Stabilisation Period

  • Maintain rollback capability to US cloud
  • 24/7 monitoring and incident response
  • Performance optimisation
  • User feedback collection

T+7 days

US Cloud Decommission

  • Final data verification
  • US cloud resources terminated
  • Contracts terminated/not renewed
  • Data deletion verification (compliance)

US Cloud Cessation Strategy

Immediate Actions (Week 1)

Contract Termination Approach

Provider | Typical Notice Period                 | Termination Clause              | Recommended Action
AWS      | Pay-as-you-go: None. Reserved: Varies | Standard ToS allows termination | Immediate migration; let reserved capacity expire
Azure    | EA: 30-90 days typically              | Review EA amendment rights      | Serve termination notice immediately
GCP      | Committed use: Varies                 | Standard ToS                    | Migrate and allow commitments to expire
OCI      | Universal Credits: Varies             | Review contract terms           | Serve notice; negotiate early exit if needed

Legal Consideration: Review all contracts with legal counsel. National security provisions may provide grounds for early termination without penalty. Document all concerns about US legal jurisdiction as basis for termination.

Data Deletion Requirements

