Configuring Key Rotation for AWS Event Handlers
As part of deploying Event Handlers on AWS, Azure or GCP, Guardrails automatically generates a JSON Web Token (JWT) with a security token embedded in it. This token should be rotated periodically. This document describes the policies, best practices and troubleshooting procedures for rotating the JWT.
Workspace Configuration Policies
These are the Turbot > Workspace policies relevant to event handling for SaaS and Enterprise customers. Ideally, they are configured before enabling event handling for the first time but can be changed at any time.
- Turbot > Workspace > Webhook Secrets > Rotation - Instructs Guardrails to regularly rotate the secrets used to sign the JWTs. Defaults to 'Skip'.
- Turbot > Workspace > Webhook Secrets > Expiration Period - Specifies the interval for secret rotation. Default value is 'Never'.
- Turbot > Workspace > Webhook Secrets - Use this policy only when there is a requirement for specific secrets to be used. Otherwise, the default setting will auto-generate new secrets as required.
- AWS > Account > Regions - Specifies the list of regions that Guardrails will monitor. By default, Guardrails monitors all regions that do not require an opt-in. If there are regional restrictions through SCPs, then the regions list should not exceed those permitted by the SCP.
- Turbot > Workspace > Gateway Domain Name - Enterprise Only: Specifies the API gateway address to use for the Event Handlers.
Initial Setup Process
This process assumes that Event Handlers have already been enabled and deployed. If not, follow these configuration steps first, then enable Event Handling.
- Decide how often the JWT will rotate for the Event Handlers. The rotation interval can be as short as 1 month or as long as 5 years.
- Set Webhook Secrets > Expiration Period to a value that meets your organizational needs.
- Only if specific secrets are required, set Webhook Secrets. Otherwise, Guardrails will automatically generate new secrets. Setting specific secrets is an uncommon requirement.
- Set Webhook Secrets > Rotation to Enforce: Rotate webhook secret. This will kick off rotation of the JWT for all the event handlers in this workspace.
Forcing a Key Rotation
In cases where a key has been compromised or a very old key needs to be refreshed, follow these steps to kick off a refresh.
Preflight Checks
- Examine all the event handler controls for all platforms in this workspace.
- Verify that all event handler controls are in an ok state.
  - For AWS: Set the control type filter to AWS > SNS > Subscription > Configured. Verify that all Subscription Configured controls for the turbot_aws_api_handler topics are in an ok state.
  - For Azure: Check that the Azure > Monitor > Action Group > Configured controls are in ok for each turbot_azure_event_handler_action_group action group in each turbot_rg resource group.
  - For GCP: Check that the GCP > Turbot > Event Handlers > Pub/Sub controls are all in ok.
- Resolve any controls in error.
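The preflight check amounts to filtering the event handler controls for anything not in ok. The following is a minimal sketch of that filter, assuming the control list has been exported from the console into a list of dictionaries. The "type", "resource" and "state" keys and the sample records are hypothetical and do not represent a Guardrails API response format.

```python
# Control types named in the preflight checks above.
EVENT_HANDLER_CONTROL_TYPES = {
    "AWS > SNS > Subscription > Configured",
    "Azure > Monitor > Action Group > Configured",
    "GCP > Turbot > Event Handlers > Pub/Sub",
}


def find_blocking_controls(controls):
    """Return event handler controls that are not in the ok state.

    `controls` is a hypothetical list of dicts with "type", "resource"
    and "state" keys. Controls of other types are ignored.
    """
    return [
        control
        for control in controls
        if control["type"] in EVENT_HANDLER_CONTROL_TYPES
        and control["state"] != "ok"
    ]


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        {"type": "AWS > SNS > Subscription > Configured",
         "resource": "turbot_aws_api_handler (us-east-1)", "state": "ok"},
        {"type": "Azure > Monitor > Action Group > Configured",
         "resource": "turbot_azure_event_handler_action_group", "state": "error"},
    ]
    for control in find_blocking_controls(sample):
        print(f"Resolve before rotating: {control['type']} on {control['resource']}")
```

An empty result means the preflight condition is met. Any returned control should be resolved before starting the rotation.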
Rotation and Verification
NOTE: In large environments, rotating the webhook secrets can cause significant load on Guardrails. Schedule this change for off-hours.
- Set Webhook Secrets > Rotation to Enforce: Rotate webhook secret if not already set.
- Set Webhook Secrets > Expiration Period to 1 month. This will cause an immediate recalculation of the Webhook Secrets policy.
- If Expiration Period is already set to 1 month, set it to 2 months, then back to 1 month. When you see the activity described in the next step, rotation was successful.
- Look at the Activity page of the Webhook Secrets policy setting. You should see the following activity:
  - A Control Updated notification for the Turbot > Webhook Secrets Rotation control from ok to alarm.
  - A Notify saying "Rotated Webhook secrets".
  - A Policy Setting Updated notification for Webhook Secrets.
  - A Control Updated notification for the Turbot > Webhook Secrets Rotation control from alarm to ok.
- Go to the Controls by Control Type report in the top Reports tab.
- Filter for the Event Handlers for each platform used in this workspace.
- Verify that all Event Handler controls are in ok. If there are controls in an error state, resolve them immediately.
- Perform extended verification that the webhook secret was updated. Each of the control types listed below is responsible for the cloud resource that holds the JWT. If these controls are in an error state, then the webhook secret has not rotated.
  - For AWS: Set the control type filter to AWS > SNS > Subscription > Configured. Verify that all Subscription Configured controls for the turbot_aws_api_handler topics are in an ok state.
  - For Azure: Check that the Azure > Monitor > Action Group > Configured controls are in ok for each turbot_azure_event_handler_action_group action group in each turbot_rg resource group.
  - For GCP: Check that the GCP > Turbot > Event Handlers > Pub/Sub controls are all in ok.
- Set Webhook Secrets > Expiration Period back to whatever the normal rotation period is.
- Go to the Activity Ledger report. Filter for the resource notification type. In sufficiently busy environments, there should be some activity after the JWT was rotated. If there is no activity, then generate some in a testing account.
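The activity check in the steps above can be expressed as an ordered-subsequence test: the four expected entries should all appear, with unrelated entries allowed in between. The following is a minimal sketch, assuming the activity entries appear in the order listed and have been copied into (kind, detail) tuples. That tuple format and the detail strings are hypothetical and are not a Guardrails export format.

```python
# Expected activity for a successful rotation, in the order listed above.
# The (kind, detail) tuple shape is an assumption made for this sketch.
EXPECTED_ROTATION_ACTIVITY = [
    ("Control Updated", "ok to alarm"),
    ("Notify", "Rotated Webhook secrets"),
    ("Policy Setting Updated", "Webhook Secrets"),
    ("Control Updated", "alarm to ok"),
]


def rotation_activity_complete(activities):
    """Return True if every expected entry appears, in order.

    `activities` is a list of (kind, detail) tuples, oldest first.
    Unrelated entries between the expected ones are ignored.
    """
    remaining = iter(activities)
    # Each any() call consumes the iterator up to and including its match,
    # so later expected entries can only match later activity entries.
    return all(
        any(entry == expected for entry in remaining)
        for expected in EXPECTED_ROTATION_ACTIVITY
    )


if __name__ == "__main__":
    # Hypothetical activity log for illustration only.
    observed = [
        ("Control Updated", "ok to alarm"),
        ("Notify", "Rotated Webhook secrets"),
        ("Policy Setting Updated", "Webhook Secrets"),
        ("Control Updated", "alarm to ok"),
    ]
    print(rotation_activity_complete(observed))
```

If the check fails because the final alarm to ok entry is missing, the rotation control may still be running. Recheck the Activity page before treating it as a failure.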
Troubleshooting
In case event handling has stopped because of a key rotation, try the following steps:
- Was event handling working before the key rotation?
- Are the event handling policies set to Enforce: Configured? Are there any exceptions where event handling is set to Skip or Enforce: Not configured?
- Are all the event handlers in an ok state? If not, grab the logs for an Event Handler control that is in error.
- Are all the controls listed in the extended verification step above in an ok state?
- Have Webhook Secrets been specified?
- Were any other Event Handler policies changed at the same time?
- Is there any environmental change visible in the Guardrails console after the key rotation?
- Are events missing for all cloud accounts in the workspace, or a specific account/sub/project?
- If AWS, is CloudTrail present and functional in all accounts?
If Webhook Secrets has been set and event handling isn't working, do the following:
- Delete the Webhook Secrets policy setting.
- Follow the rotation and verification steps described above.
If event handling is still not working, gather the above troubleshooting information, then send it to help@turbot.com for additional assistance.
Best Practices
- Rotate the webhook secret at least once per year.
- Unless there is a very good reason, stay with the default behavior where Guardrails generates new secrets. This reduces the chance of a silent and accidental event handling outage.
- If you set specific secrets, be sure to set two webhook secrets with overlapping expiration periods.
  - Setting a single key may cause event handling to silently stop working when the secret expires.
  - Setting two keys without overlapping active periods may cause a silent break in event handling too.
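The overlap requirement can be checked with simple interval arithmetic before specific secrets are set. The following is a minimal sketch, assuming each secret has a known active period given as a (start, end) pair of dates. This representation is hypothetical and is not the format of the Webhook Secrets policy value.

```python
from datetime import date


def find_coverage_gaps(periods):
    """Return (gap_start, gap_end) pairs during which no secret is active.

    `periods` is a list of (start, end) date pairs, one per secret.
    A period that starts exactly when the previous coverage ends is
    treated as contiguous, not as a gap.
    """
    if not periods:
        return []
    ordered = sorted(periods)
    gaps = []
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_until:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps


if __name__ == "__main__":
    # Hypothetical active periods for two secrets.
    overlapping = [
        (date(2024, 1, 1), date(2024, 7, 1)),
        (date(2024, 6, 1), date(2025, 1, 1)),
    ]
    disjoint = [
        (date(2024, 1, 1), date(2024, 6, 1)),
        (date(2024, 7, 1), date(2025, 1, 1)),
    ]
    print(find_coverage_gaps(overlapping))
    print(find_coverage_gaps(disjoint))
```

An empty list means the active periods overlap or touch, so there is always a valid secret. Any returned pair marks a window in which event handling could silently stop.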