Restoring a Workspace

In this guide, you will:

Test backup and restore procedures for Turbot Guardrails workspaces within a single region.
Monitor and troubleshoot the disaster recovery process.

An essential part of maintaining Turbot Guardrails is testing disaster recovery. This document covers the process for restoring a destroyed workspace. Test restoration at least once a year, ideally twice, so that Guardrails administrators stay familiar with the restoration process and the tools involved.

Testing backup and restore procedures is critical for:

Validating backup integrity and restore processes
Meeting compliance and audit requirements
Training administrators on recovery procedures
Measuring recovery time objectives (RTO)

Note
Workspace restoration is just one of several disaster recovery scenarios. Evaluate other scenarios as part of your organization's comprehensive disaster recovery strategy.

Prerequisites

Administrator access to AWS Console.
Familiarity with Guardrails installation.
Understanding of database backup/restore.
Access to required AWS services such as RDS, CloudFormation, ECS and Route 53.

Process Summary

Build a New Workspace – Set up a fresh workspace for testing, install required mods, and take an RDS snapshot.
Simulate Disaster – Destroy the workspace by deleting its CloudFormation stack.
Restore the Workspace – Recover data from the latest backup, apply migrations, and restart the workspace.
Validate Restoration – Log in and verify the workspace is functional.

Important
Only test with non-production workspaces.
Document all parameters and configurations.
Time the restore process to measure RTO.
Test regularly (recommended twice per year).
Follow security best practices.

Step 1: Build a New Workspace

In this phase, create a workspace and install baseline mods. Then, import an AWS account with Event Pollers.

Note
The same process applies to Azure and GCP.

This process assumes that Route 53 is used for DNS. Customers with manually configured DNS will need to keep track of their configuration.

Steps:

Select TE Version:
- Choose a dedicated TE version for testing
- Note: ECS container flush during restore may cause brief outages for workspaces using this TE version
- If multiple workspaces use this TE version, pause event processing
Access AWS Master Account:
- Navigate to the alpha region of your AWS Master account
Create Test Workspace:
- Follow the workspace creation guide
- Save all CloudFormation parameters used (needed for restoration)
- Record credentials from CloudFormation Stack outputs
- Note the Turbot ID of workspace Turbot Root (tmod:@turbot/turbot#/)
Install Required AWS Mods:
- aws
- aws-iam
- aws-kms
- aws-s3
Configure Workspace:
- Create "AWS" folder under Turbot Root
- Import an AWS account into the folder
- Verify no controls/policies are in tbd state
Document Initial State:
- Take screenshots of workspace dashboard
- Record key metrics:
  - Number of resources
  - Active controls count
  - Other relevant statistics
- Save for post-restore validation
Create Backup:
- Wait for the automated backup that supports "Restore to point in time"
- Or take a manual RDS backup
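
Two of the steps above lend themselves to scripting: saving the CloudFormation parameters and taking a manual RDS backup. The following is a minimal sketch using the AWS CLI, not an official Guardrails procedure. It assumes the Guardrails database is a standard RDS DB instance (not an Aurora cluster). The function names, the `dr-test` naming scheme, and the output file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: stack names, DB identifiers, and file names are placeholders.
# Assumes the AWS CLI is configured for the account and region hosting Guardrails.

# Save the workspace stack's parameters and outputs for use during restoration.
save_stack_details() {
  stack_name="$1"
  aws cloudformation describe-stacks --stack-name "$stack_name" \
    --query 'Stacks[0].Parameters' --output json \
    > "${stack_name}-parameters.json" || return 1
  aws cloudformation describe-stacks --stack-name "$stack_name" \
    --query 'Stacks[0].Outputs' --output json \
    > "${stack_name}-outputs.json"
}

# Build a snapshot identifier such as turbot-babbage-dr-test-20250101-120000.
# RDS snapshot identifiers allow only letters, digits, and hyphens.
snapshot_id() {
  printf '%s-dr-test-%s\n' "$1" "$2"
}

# Create a manual snapshot and block until it is available.
# Prints the snapshot identifier on success.
take_snapshot() {
  db_instance="$1"
  snap="$(snapshot_id "$db_instance" "$(date -u +%Y%m%d-%H%M%S)")"
  aws rds create-db-snapshot \
    --db-instance-identifier "$db_instance" \
    --db-snapshot-identifier "$snap" >/dev/null || return 1
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$snap" || return 1
  echo "$snap"
}

# Usage (hypothetical names):
#   save_stack_details my-test-workspace
#   take_snapshot turbot-babbage
```

The stack outputs may include the credentials mentioned above, so store the saved files as carefully as you would the credentials themselves.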

Step 2: Drop the Workspace

Warning
Do not delete a production workspace CloudFormation stack.

Do not delete the original database.

Delete the Workspace CloudFormation stack created earlier.
If necessary, force delete the workspace.
Verify that the workspace URL is no longer accessible.
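
The deletion and the URL check above can also be done from the command line. This is a rough sketch under stated assumptions: the stack name and URL are placeholders, the function names are hypothetical, and treating any non-2xx/3xx response as "down" is a judgment call you may want to tighten for your environment.

```shell
#!/bin/sh
# Sketch only: run this against the TEST workspace stack, never production.

# Delete the workspace stack and wait for CloudFormation to finish.
delete_workspace_stack() {
  stack_name="$1"
  aws cloudformation delete-stack --stack-name "$stack_name" || return 1
  # Returns non-zero if the deletion fails or the wait times out.
  aws cloudformation wait stack-delete-complete --stack-name "$stack_name"
}

# Succeeds (exit 0) when the workspace URL no longer serves the workspace.
# curl reports HTTP code 000 when it cannot resolve or connect.
workspace_is_down() {
  code="$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1")"
  case "$code" in
    2??|3??) return 1 ;;  # still serving pages or redirects
    *)       return 0 ;;  # unreachable or returning an error
  esac
}

# Usage (hypothetical names):
#   delete_workspace_stack my-test-workspace
#   workspace_is_down https://my-test-workspace.example.com && echo "workspace is gone"
```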

Step 3: Restore the Workspace

In this step, we create a new workspace, which initializes an empty database schema. The goal is to populate this empty schema with the data from the restored database, bringing the workspace back to its previous state. This approach preserves the database structure while recovering all workspace configurations, resources, and control states from the backup.

Steps:

Start RTO Measurement:
- Begin timing the restore process
- This helps determine your Recovery Time Objective (RTO)
Recreate Workspace:
- Use the original Workspace CloudFormation template
- Apply the same parameter values as the original workspace
- Deploy the new workspace stack
Restore Database:
- Navigate to AWS RDS console
- Choose either:
  - Restore from snapshot, or
  - Use "Restore to point in time" feature
- Ensure restored DB configurations match original:
  - Instance class
  - Storage type/size
  - Network settings
  - Security groups
Configure Temporary Database:
- Wait for restored DB to become available
- Record the new database endpoint
- Verify connectivity
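
The Restore Database and Configure Temporary Database steps above can also be scripted with the AWS CLI. The sketch below covers the snapshot path only and assumes a standard RDS DB instance. All identifiers, the subnet group, and the security group are placeholders, and the function names are hypothetical. The connectivity check assumes PostgreSQL client tools are installed where you run it and that the database listens on port 5432.

```shell
#!/bin/sh
# Sketch only: every identifier passed to these functions is a placeholder.

# Restore a temporary DB instance from a snapshot and wait until it is usable.
restore_from_snapshot() {
  snapshot_id="$1"; restored_id="$2"; instance_class="$3"
  subnet_group="$4"; security_group="$5"
  aws rds restore-db-instance-from-db-snapshot \
    --db-snapshot-identifier "$snapshot_id" \
    --db-instance-identifier "$restored_id" \
    --db-instance-class "$instance_class" \
    --db-subnet-group-name "$subnet_group" \
    --vpc-security-group-ids "$security_group" >/dev/null || return 1
  # Blocks until the restored instance reports "available".
  aws rds wait db-instance-available --db-instance-identifier "$restored_id"
}

# Print the restored instance's endpoint address (the migration source).
restored_endpoint() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text
}

# Check that PostgreSQL accepts connections; port defaults to 5432.
check_connectivity() {
  pg_isready -h "$1" -p "${2:-5432}"
}

# Usage (hypothetical names):
#   restore_from_snapshot my-snapshot turbot-panda db.m5.large my-subnet-group sg-0123456789abcdef0
#   check_connectivity "$(restored_endpoint turbot-panda)"
```

For the "Restore to point in time" path, the AWS CLI provides `aws rds restore-db-instance-to-point-in-time`, which takes a source and a target instance identifier plus either a restore time or the latest restorable time.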
Deploy Bastion Host:
- Launch a Turbot Bastion Host instance by following the Turbot Bastion Host Setup guide
- Ensure network access to both databases
Execute Migration:
- Run migration script to copy DB schema:
  - From (Source): The restored database
  - To (Target): The existing database that hosts the newly created workspace schema

nohup ./migration.sh <turbot_schema> <source_or_restored_DB_endpoint> <target_or_actual_db_endpoint> &

example: nohup ./migration.sh panda turbot-panda.abcxyzabcxyz.us-east-1.rds.amazonaws.com turbot-babbage.abcxyzabcxyz.us-east-1.rds.amazonaws.com &

Wait for the pg_dump and pg_restore processes started by migration.sh to complete before continuing.
Flush ECS Containers:
- In the AWS ECS console, open the cluster and select the Tasks tab
- Locate the tasks for the TE version under test and stop them
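
As an alternative to the console, the container flush above can be sketched with the AWS CLI. This assumes the TE tasks can be selected by task definition family and that ECS starts replacement tasks once the old ones stop. The cluster and family names are placeholders to look up in the ECS console, and `flush_tasks` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: cluster and family names are placeholders.

# Stop every running task in a task definition family.
flush_tasks() {
  cluster="$1"; family="$2"
  for task_arn in $(aws ecs list-tasks --cluster "$cluster" --family "$family" \
                      --query 'taskArns[]' --output text); do
    # Guard against the CLI printing "None" for an empty result.
    [ "$task_arn" = "None" ] && continue
    echo "Stopping $task_arn"
    aws ecs stop-task --cluster "$cluster" --task "$task_arn" \
      --reason "DR restore test: flush containers" >/dev/null
  done
}

# Usage (hypothetical names):
#   flush_tasks my-guardrails-cluster my-te-task-family
```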

Step 4: Clear Redis Cache

To clear the workspace from Redis, log into the bastion host and execute:

export REDISHOST=master.turbot-babbage-cache-cluster.abcxyz.use1.cache.amazonaws.com
redis-cli -h $REDISHOST --tls -p 6379 -a <password> KEYS "<turbot_schema>*" | xargs redis-cli -h $REDISHOST --tls -p 6379 -a <password> DEL

example: redis-cli -h $REDISHOST --tls -p 6379 -a mysecurepassword KEYS "panda*" | xargs redis-cli -h $REDISHOST --tls -p 6379 -a mysecurepassword DEL
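
A variation on the command above, offered as a sketch rather than the documented procedure: `redis-cli --scan` iterates over keys incrementally, whereas `KEYS` blocks Redis while it walks the entire keyspace. Setting the `REDISCLI_AUTH` environment variable keeps the password out of the command line. The `-r` flag (GNU xargs) skips the `DEL` call when no keys match. `clear_workspace_keys` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: expects REDISHOST and REDISCLI_AUTH to be set in the environment.

# Delete all keys that start with the given workspace schema name.
clear_workspace_keys() {
  schema="$1"
  redis-cli -h "$REDISHOST" --tls -p 6379 --scan --pattern "${schema}*" |
    xargs -r -n 100 redis-cli -h "$REDISHOST" --tls -p 6379 DEL
}

# Usage:
#   export REDISHOST=master.turbot-babbage-cache-cluster.abcxyz.use1.cache.amazonaws.com
#   export REDISCLI_AUTH='<password>'
#   clear_workspace_keys panda
```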

Step 5: Review

This step validates the restoration process.

Login Validation: Confirm that the previous credentials still work.
Resource & Control Check: Verify the number of resources and controls match pre-disaster stats.
Test New Resource Import: Create a new S3 bucket and verify it appears in Guardrails UI.
Verify Control Execution: Run a control scan to confirm that all controls are in OK or Skipped state.
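
Step 3 starts the RTO timer, and the measurement ends once the checks above pass. One simple way to capture the elapsed time is sketched below. The variable names and the `format_elapsed` helper are hypothetical, and `date +%s` is assumed to be available, as it is on most Linux systems.

```shell
#!/bin/sh
# Sketch only: record the start time when the restore begins and
# report the elapsed time after validation passes.

# Convert a number of seconds into HH:MM:SS.
format_elapsed() {
  printf '%02d:%02d:%02d\n' \
    $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

# Run this line as the restore begins (Step 3).
RESTORE_START="$(date +%s)"

# ... perform the restore, cache clearing, and validation steps ...

# Run these lines once validation has passed.
RESTORE_END="$(date +%s)"
echo "Measured restore time: $(format_elapsed $(( RESTORE_END - RESTORE_START )))"
```

Recording the measured time alongside the date of each test makes it easier to see whether restores are getting slower as the workspace grows.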

Troubleshooting

Issue	Description	Guide
Workspace Not Accessible	If the workspace does not restore correctly, ensure that RDS endpoints are correct in the migration script.
Redis Cache Not Cleared	If controls fail to execute, verify that Redis cache clearing was performed correctly.	See Step 4: Clear Redis Cache in this guide.
Further Assistance	If the issue persists, open a support ticket and provide logs & screenshots for faster resolution.	Open Support Ticket