CONTINGENCY PLANNING

DR/HA & US Shutdown Contingency

Disaster recovery, high availability strategies, and contingency plans for scenarios where US cloud providers terminate service or experience control plane outages—whether through deliberate action, legal order, or technical failure.

Critical Context: The Threat is Real

Under US law (the CLOUD Act, FISA Section 702, IEEPA, and Executive Orders), the US government can compel AWS, Azure, GCP, and OCI to terminate services to, deny access for, or hand over data belonging to foreign governments and entities, all without notice. This is not theoretical: these powers have been exercised. This document provides contingency plans for immediate service termination scenarios.


Threat Scenarios

Scenario 1: Deliberate Service Termination

Trigger: US government orders cloud providers to terminate services to specific government entities or entire countries (sanctions, trade dispute, political conflict).

Warning Time | Impact | Probability
Zero to 24 hours | Complete loss of compute, storage, network | Low but increasing

Historical Precedent

  • Huawei: Immediate termination of US software/hardware services (2019)
  • Russia sanctions: Service termination to sanctioned entities (2022)
  • China export controls: Technology access restrictions (ongoing)

Contingency Response

  1. Immediate: Activate pre-deployed sovereign infrastructure
  2. Within 1 hour: DNS failover to sovereign endpoints
  3. Within 4 hours: Full traffic redirect to sovereign platform
  4. Data: Rely on continuously replicated data (see HA architecture below)

Scenario 2: Control Plane Outage (Technical)

Trigger: Major outage affecting cloud provider control plane (API, management, orchestration). Workloads may continue running but cannot be managed.

Warning Time | Impact | Probability
Zero (outage starts) | Cannot deploy, scale, or modify workloads | Moderate (has occurred)

Historical Examples

  • AWS us-east-1 outages (multiple incidents affecting global services)
  • Azure Active Directory outages (authentication failures)
  • GCP global networking incidents

Contingency Response

  1. Immediate: Workloads continue on existing resources (no changes possible)
  2. Within 15 minutes: Activate sovereign standby if outage confirmed major
  3. Traffic: Shift to sovereign platform via DNS/load balancer
  4. Return: Can return to US cloud once restored (if policy allows)
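
The control-plane/data-plane distinction in this scenario can be probed automatically: if the provider's management API is unreachable while the application itself still answers, you are in Scenario 2; if both are down, treat it as Scenario 1 until proven otherwise. A minimal sketch (the endpoint URLs are placeholders, not real provider addresses):

```python
import urllib.request
import urllib.error

# Placeholder endpoints -- substitute the provider's management API
# and your own application health URL.
MGMT_API = "https://management.example-cloud.com/health"
APP_HEALTH = "https://app.example.org/healthz"

def reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with any HTTP status at all."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # network-level failure: unreachable

def classify(mgmt_ok: bool, app_ok: bool) -> str:
    """Map the two probe results onto the scenario taxonomy above."""
    if mgmt_ok and app_ok:
        return "healthy"
    if not mgmt_ok and app_ok:
        return "control-plane outage"    # Scenario 2: running, unmanageable
    if mgmt_ok and not app_ok:
        return "application fault"       # our problem, not the provider's
    return "full outage or termination"  # treat as Scenario 1
```

The classifier's output feeds the confirm-and-classify step of the incident response procedure.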

Scenario 3: Data Access/Exfiltration Order

Trigger: US government orders cloud provider to provide access to government data without customer knowledge (FISA 702, National Security Letter).

Warning Time | Impact | Probability
No warning (gag order) | Data compromised without knowledge | High (documented occurrence)

Mitigation (Pre-Incident)

This scenario cannot be responded to after the fact; it can only be prevented.

  • Move data off US cloud infrastructure entirely
  • Encrypt all data with keys held in sovereign HSMs (US provider cannot decrypt)
  • Minimise data stored on US cloud to non-sensitive operational data only
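
The second mitigation can be sketched as client-side envelope encryption: each payload is encrypted with a fresh data key, and only an HSM-wrapped copy of that key is stored beside the ciphertext, so the US provider never holds anything it can decrypt. A sketch assuming the `cryptography` package; `hsm_wrap`/`hsm_unwrap` are stand-ins for a sovereign HSM's key-wrap interface (e.g. PKCS#11 underneath), not real library calls:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_for_us_cloud(plaintext: bytes, hsm_wrap) -> dict:
    """Encrypt client-side; only ciphertext and a wrapped key ever
    reach the US provider. `hsm_wrap` stands in for the sovereign
    HSM's key-wrap operation."""
    data_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # 96-bit nonce, per AES-GCM convention
    ciphertext = AESGCM(data_key).encrypt(nonce, plaintext, None)
    return {
        "ciphertext": ciphertext,           # safe to store on US cloud
        "nonce": nonce,
        "wrapped_key": hsm_wrap(data_key),  # only the HSM can unwrap
    }

def decrypt_from_us_cloud(blob: dict, hsm_unwrap) -> bytes:
    """Decryption requires the sovereign HSM to unwrap the data key."""
    data_key = hsm_unwrap(blob["wrapped_key"])
    return AESGCM(data_key).decrypt(blob["nonce"], blob["ciphertext"], None)
```

The design point is key separation: even under a compelled-access order, the provider can surrender only ciphertext, because the key-encryption key never leaves sovereign custody.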

High Availability Architecture (Pre-Migration)

During the migration period, critical systems must be deployed in active-active or active-standby configuration across US cloud AND sovereign infrastructure, enabling instant failover.

GLOBAL TRAFFIC MANAGEMENT

                Sovereign DNS / Global Load Balancer
                    /                         \
   US CLOUD (Primary)                    SOVEREIGN (Hot Standby)
   AWS / Azure / GCP                     CloudStack / Sovereign K8s
     • App Cluster                         • App Cluster
     • App Cluster                         • App Cluster
     • Database (RDS/Azure SQL)            • Database (PostgreSQL HA)
     • Object Storage (S3)                 • Object Storage (MinIO)
                    \                         /
                     Async Replication + Sync

Recovery Objectives

RTO: 1 hour (Recovery Time Objective: maximum time to restore service)
RPO: 15 minutes (Recovery Point Objective: maximum data loss, bounded by replication lag)

Continuous Replication Requirements

Data Type | Replication Method | Lag Target | Technology
Databases | Logical replication (async) | < 5 minutes | PostgreSQL logical replication, Debezium CDC
Object Storage | Continuous sync | < 15 minutes | rclone with change detection, MinIO bucket replication
Application State | Distributed cache sync | < 1 minute | Redis Cluster cross-region, Kafka mirroring
Secrets/Config | Periodic sync | < 1 hour | OpenBao replication, GitOps
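
The lag targets above translate directly into an alerting rule: any data type whose measured replication lag exceeds its target means a failover at that moment would lose more data than the RPO allows. A minimal check (lag measurements would come from e.g. `pg_stat_replication` for databases; the dictionary keys are illustrative):

```python
# Lag targets from the table above, expressed in seconds.
LAG_TARGETS = {
    "databases": 5 * 60,
    "object_storage": 15 * 60,
    "application_state": 60,
    "secrets_config": 60 * 60,
}

def check_lag(measured: dict) -> list:
    """Return the data types whose measured replication lag (seconds)
    breaches its target -- i.e. where a failover right now would
    exceed the acceptable data loss."""
    return [
        data_type
        for data_type, lag in measured.items()
        if lag > LAG_TARGETS[data_type]
    ]
```

Any non-empty result should page the on-call team, since the hot standby is no longer within its recovery objectives.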

US Shutdown Incident Response Procedure

T+0: Incident Detected

Trigger: US cloud services inaccessible or termination notice received

  • Automated monitoring alerts on US cloud API failures
  • Manual escalation if external notification received
  • Incident commander assigned immediately
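
A common way to implement the automated-alert step without paging on every transient blip is to require several consecutive probe failures before declaring an incident; the T+5 confirmation step still applies afterwards. A sketch (the threshold of 3 is an assumption, tuned to your probe interval):

```python
from collections import deque

class OutageDetector:
    """Fire an alert only after `threshold` consecutive probe failures,
    to avoid paging the incident commander on a single monitoring
    false positive."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)  # rolling probe window

    def record(self, probe_succeeded: bool) -> bool:
        """Feed one probe result; return True when an alert should fire."""
        self.recent.append(probe_succeeded)
        return (len(self.recent) == self.threshold
                and not any(self.recent))
```
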

T+5 min: Confirm & Classify

Determine scope and intentionality

  • Confirm outage is real (not monitoring false positive)
  • Check cloud provider status pages
  • Classify: Technical outage vs. deliberate termination
  • If deliberate: Escalate to senior leadership immediately

T+15 min: Failover Decision

Authorise failover to sovereign infrastructure

  • Technical outage expected > 1 hour: Initiate failover
  • Deliberate termination: Initiate failover immediately
  • Notify all jurisdiction coordination centres

T+30 min: Execute Failover

Traffic redirect to sovereign platform

  • Update DNS records (low TTL should be pre-configured)
  • Update global load balancer weights
  • Verify sovereign endpoints accepting traffic
  • Monitor error rates and latency
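
The DNS portion of this step can be sketched as follows. The `dns_client.update` call is an assumed interface for whatever sovereign DNS provider is in use, not a real library API; the operational point is that the low TTL must have been configured in peacetime, because lowering it during the incident cannot expire answers that resolvers have already cached:

```python
import time

def failover_to_sovereign(dns_client, record: str, sovereign_ip: str,
                          check_health, old_ttl: int = 60) -> None:
    """Repoint `record` at the sovereign platform.

    `dns_client` and `check_health` are injected stand-ins for the
    sovereign DNS provider's API and an endpoint health probe."""
    # 1. Verify the sovereign endpoint is accepting traffic *before*
    #    pointing users at it.
    if not check_health(sovereign_ip):
        raise RuntimeError("sovereign endpoint not healthy; aborting failover")
    # 2. Repoint the record, keeping the pre-configured low TTL.
    dns_client.update(record, value=sovereign_ip, ttl=60)
    # 3. Wait out the previous TTL so cached answers expire, then
    #    monitoring (error rates, latency) takes over.
    time.sleep(old_ttl)
```
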

T+1 hour: Stabilise

Confirm sovereign platform operational

  • All critical services confirmed operational
  • Data integrity verification
  • Capacity scaling as needed
  • Stakeholder communication issued

T+24 hours: Assess Return

Determine if return to US cloud is appropriate (if technical outage)

  • If deliberate termination: No return. Accelerate full migration.
  • If technical outage resolved: Assess risk of return vs. staying on sovereign
  • Recommendation: Use incident to justify permanent sovereign migration

Sovereign DR Site Strategy by Jurisdiction

Jurisdiction | Primary Site | DR Site | Cross-Jurisdiction DR
UK | Crown Hosting (Corsham) | Crown Hosting (Farnborough) | EU (OVHcloud France); data agreement required
EU | Primary varies by member state | Secondary within same member state | Cross-member-state replication (Gaia-X)
Canada | SSC Borden (Ontario) | SSC Gatineau (Quebec) | UK (bilateral agreement possible)
Australia | Canberra DC | Sydney/Melbourne | Limited (geographic isolation); consider NZ partnership

Cross-Jurisdiction DR Consideration: Data sovereignty requirements may limit cross-border DR for some workloads. Each jurisdiction must define which data classifications can be replicated to partner jurisdictions and under what agreements. GDPR adequacy decisions and bilateral data sharing agreements govern these arrangements.

DR Testing Requirements

Test Type | Frequency | Scope | Success Criteria
Failover Test | Monthly | Single critical service | Failover completed within RTO; no data loss beyond RPO
Full DR Exercise | Quarterly | All critical services | Full service restoration on sovereign platform
US Shutdown Simulation | Bi-annually | Complete cutover simulation | Zero US cloud dependency for 24 hours
Cross-Jurisdiction Test | Annually | Multi-jurisdiction coordination | Coordinated failover across cooperative
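
The monthly failover test's success criteria reduce to two comparisons against the recovery objectives defined earlier (RTO 1 hour, RPO 15 minutes), which is worth automating so test results are judged consistently. A minimal sketch:

```python
# Recovery objectives from the "Recovery Objectives" section, in seconds.
RTO_SECONDS = 60 * 60   # 1 hour
RPO_SECONDS = 15 * 60   # 15 minutes

def failover_test_passed(restore_seconds: float,
                         max_replication_lag_seconds: float) -> bool:
    """True when service was restored within the RTO and the worst
    observed replication lag stayed within the RPO."""
    return (restore_seconds <= RTO_SECONDS
            and max_replication_lag_seconds <= RPO_SECONDS)
```
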
