Build a scheduled secret-rotation audit and iterate it on your laptop
You are a platform engineer at a company that runs a few dozen services. Each service has its own Secrets Manager entries for downstream API keys, database passwords, third-party tokens. Every quarter security asks the same question: "which credentials have not been rotated in over thirty days?" Today, finding the answer means a senior engineer clicking through the AWS console and writing a spreadsheet. You want the answer to be the contents of a DynamoDB table that updates itself.
The boring AWS pattern for this is an EventBridge Scheduler that fires a Lambda every hour. The Lambda lists secrets tagged Managed=true, classifies each one as ok or needs_rotation based on LastRotatedDate, and writes one audit row to DynamoDB per (secret, run). The dashboard reads from that table. The job runs every hour regardless of whether anyone is looking.
Iterating that on real AWS is awkward in a specific way: you cannot wait an hour between test runs to see whether the schedule fires correctly. This tutorial builds the whole stack on LocalEmu in about 21 seconds, then tests it deterministically by asserting the schedule's configuration directly and invoking the Lambda on demand, the same way EventBridge Scheduler would. 13 Terraform resources, three integration tests, no AWS account, no waiting on the clock.
What you will have working at the end
Three scenarios you can run by hand on your laptop, all returning the exact responses captured later on this page:
- A. The schedule resource is wired correctly: scheduler get-schedule returns State=ENABLED, ScheduleExpression=rate(1 hour), and a Target ARN that points at the rotation-check Lambda.
- B. Invoking the Lambda directly (the same call the scheduler makes every hour) returns checked: 2 and a fresh run_id. Two because there are exactly two Managed=true secrets in the test seed; the third (tagged Managed=false) is skipped.
- C. The audit table has exactly two new rows after that invocation, both stamped with the same run_id and an ISO-8601 UTC checked_at timestamp. The Managed=false secret has no row.
Architecture
⏱ EventBridge Scheduler ── rate(1 hour) ──▶ lambda: rotation-check
                                                      │
                              ┌───────────────────────┤
                              ▼                       ▼
              secretsmanager: ListSecrets     dynamodb: PutItem
              (Managed=true filter)           (audit row per secret)
1. The handler
The Lambda is thirty lines. Page through Secrets Manager filtered on the Managed tag, classify each result by age, write the audit row, return a summary. Two boto3 clients, no framework, no conditional on local versus AWS (boto3 picks up AWS_ENDPOINT_URL from the environment, and LocalEmu sets it inside every Lambda container it spawns):
# src/rotation_check.py: the scheduled Lambda.
# On every fire, list Managed=true secrets and write an audit row for each.
import datetime as dt
import os
import time

import boto3

TABLE_NAME = os.environ["TABLE_NAME"]
MAX_AGE_DAYS = int(os.environ.get("MAX_AGE_DAYS", "30"))

_sm = boto3.client("secretsmanager")
_tbl = boto3.resource("dynamodb").Table(TABLE_NAME)

def handle(event, _ctx):
    now = dt.datetime.now(dt.timezone.utc)
    cutoff = now - dt.timedelta(days=MAX_AGE_DAYS)
    run_id = f"run-{int(time.time() * 1000)}"
    checked = flagged = 0
    # ListSecrets pre-filters by tag-key. We re-check Managed=="true" below
    # because the API filter is a key-presence test, not a value match.
    for page in _sm.get_paginator("list_secrets").paginate(
            Filters=[{"Key": "tag-key", "Values": ["Managed"]}]):
        for secret in page.get("SecretList", []):
            tags = {t["Key"]: t["Value"] for t in secret.get("Tags", [])}
            if tags.get("Managed") != "true":
                continue
            checked += 1
            # LastRotatedDate is set only by AWS-driven rotations; fall back to
            # LastChangedDate when a human has not configured rotation yet.
            last = secret.get("LastRotatedDate") or secret.get("LastChangedDate")
            status = "needs_rotation" if (last is None or last < cutoff) else "ok"
            flagged += status == "needs_rotation"
            _tbl.put_item(Item={
                "secret_name": secret["Name"],
                "run_id": run_id,
                "status": status,
                "checked_at": now.isoformat(),
            })
    return {"run_id": run_id, "checked": checked, "flagged": flagged}
Two details in the code matter once you scale this up. The tag-key filter on ListSecrets is a presence test, not a value match, so the tags.get("Managed") != "true" check inside the loop is what actually excludes the Managed=false secrets. And the audit table uses (secret_name, run_id) as a composite key, so one secret across many runs is queryable as a single partition. Both primary-key attributes belong in the same KeyConditionExpression when you query: see section 4, "The same code on real AWS", for why putting either in a FilterExpression will be rejected by AWS.
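The value re-check and the age rule can be tried without Secrets Manager at all. The following is a minimal sketch, not part of the project: the function names managed_only and classify are hypothetical, and the input dicts only imitate the shape of a ListSecrets page.

```python
import datetime as dt

def managed_only(secret_list):
    """Keep only secrets whose Managed tag has the exact value "true"."""
    kept = []
    for secret in secret_list:
        tags = {t["Key"]: t["Value"] for t in secret.get("Tags", [])}
        if tags.get("Managed") == "true":
            kept.append(secret)
    return kept

def classify(secret, now, max_age_days=30):
    """Apply the handler's age rule: LastRotatedDate, else LastChangedDate."""
    last = secret.get("LastRotatedDate") or secret.get("LastChangedDate")
    cutoff = now - dt.timedelta(days=max_age_days)
    return "needs_rotation" if (last is None or last < cutoff) else "ok"

# Hand-built sample page (assumed data, not captured from a real run).
now = dt.datetime(2026, 5, 23, tzinfo=dt.timezone.utc)
page = [
    {"Name": "a", "Tags": [{"Key": "Managed", "Value": "true"}],
     "LastChangedDate": now - dt.timedelta(days=1)},
    {"Name": "b", "Tags": [{"Key": "Managed", "Value": "true"}],
     "LastChangedDate": now - dt.timedelta(days=45)},
    {"Name": "c", "Tags": [{"Key": "Managed", "Value": "false"}],
     "LastChangedDate": now - dt.timedelta(days=45)},
]
results = {s["Name"]: classify(s, now) for s in managed_only(page)}
print(results)  # {'a': 'ok', 'b': 'needs_rotation'}
```

Secret "c" passes the tag-key presence test but fails the value check, which is the case the handler's re-check exists for.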
2. The schedule
The schedule is one Terraform resource. The target is the Lambda ARN, the payload is whatever JSON you want the handler to see, and the execution role allows scheduler.amazonaws.com to invoke the Lambda. EventBridge Scheduler is the modern dedicated service for this; an older CloudWatch Events Rule with schedule_expression would also work but has smaller quotas and no native flexible_time_window support.
# terraform/main.tf: the EventBridge Scheduler resource.
# schedule_expression is "rate(1 hour)" by default, settable per environment.
resource "aws_scheduler_schedule" "hourly" {
  name                         = "${var.prefix}-hourly"
  schedule_expression          = var.schedule_expression
  schedule_expression_timezone = "UTC"

  # "OFF" means fire exactly when the expression says. Set this to a jitter
  # window in production to spread load across replicas of the same schedule.
  flexible_time_window { mode = "OFF" }

  target {
    arn      = aws_lambda_function.check.arn
    role_arn = aws_iam_role.scheduler_role.arn
    input    = jsonencode({ source = "eventbridge-scheduler" })
  }
}
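A test of this resource does not have to wait for a fire; it can compare the fields that get-schedule returns against what the Terraform declares. The sketch below assumes a response dict shaped like the GetSchedule output (State, ScheduleExpression, Target.Arn). The helper name schedule_problems is hypothetical, and the sample values are the names this tutorial's deploy prints.

```python
def schedule_problems(schedule, expected_expression, function_name):
    """Return a list of mismatches; an empty list means the wiring looks right."""
    problems = []
    if schedule.get("State") != "ENABLED":
        problems.append(f"State is {schedule.get('State')!r}, expected 'ENABLED'")
    if schedule.get("ScheduleExpression") != expected_expression:
        problems.append(
            f"ScheduleExpression is {schedule.get('ScheduleExpression')!r}, "
            f"expected {expected_expression!r}"
        )
    target_arn = schedule.get("Target", {}).get("Arn", "")
    if not target_arn.endswith(f":function:{function_name}"):
        problems.append(f"Target ARN {target_arn!r} does not point at {function_name}")
    return problems

# In a real test this dict would come from something like
# boto3.client("scheduler").get_schedule(Name="le-rotcheck-hourly").
sample = {
    "Name": "le-rotcheck-hourly",
    "State": "ENABLED",
    "ScheduleExpression": "rate(1 hour)",
    "Target": {
        "Arn": "arn:aws:lambda:us-east-1:000000000000:function:le-rotcheck-check",
    },
}
print(schedule_problems(sample, "rate(1 hour)", "le-rotcheck-check"))  # []
```

Returning a list of problems instead of asserting on the first mismatch keeps a failing test readable when several fields drift at once.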
Three secrets are seeded by Terraform so every deploy has something real to check. Two are tagged Managed=true (picked up by the handler); one is tagged Managed=false (deliberately ignored, so the third test can prove the filter worked):
# terraform/main.tf: three seeded secrets so every deploy has something real
# for the handler to find and classify.

# Picked up by the handler (tagged Managed=true).
resource "aws_secretsmanager_secret" "fresh" {
  name                    = "${var.prefix}-fresh"
  recovery_window_in_days = 0
  tags                    = { Managed = "true" }
}

# Also picked up by the handler.
resource "aws_secretsmanager_secret" "stale" {
  name                    = "${var.prefix}-stale"
  recovery_window_in_days = 0
  tags                    = { Managed = "true" }
}

# Deliberately ignored (tagged Managed=false). The third test asserts
# no audit row was written for this one.
resource "aws_secretsmanager_secret" "ignored" {
  name                    = "${var.prefix}-ignored"
  recovery_window_in_days = 0
  tags                    = { Managed = "false" }
}

# recovery_window_in_days = 0 skips the soft-delete recovery window (7 to 30
# days, 30 by default) so terraform destroy is fully clean and the next deploy
# can reuse the name. Use it in test/dev. Leave the default on for
# production-grade secrets.
3. Run the scenarios on your laptop
Clone the project, start LocalEmu in another terminal, then deploy:
$ cd localemu-examples/02-scheduled-job
$ localemu start # in a separate terminal
$ ./scripts/deploy.sh local
Thirteen resources apply in about 21 seconds. Most of the wall time is Lambda's container cold-start (LocalEmu pulls the python:3.12 runtime image the first time). The Scheduler resource itself creates in 2 seconds:
$ ./scripts/deploy.sh local
aws_dynamodb_table.audit: Creation complete after 0s
aws_secretsmanager_secret.fresh: Creation complete after 0s
aws_secretsmanager_secret.stale: Creation complete after 0s
aws_secretsmanager_secret.ignored: Creation complete after 0s
aws_secretsmanager_secret_version.fresh: Creation complete after 0s
aws_secretsmanager_secret_version.stale: Creation complete after 0s
aws_secretsmanager_secret_version.ignored: Creation complete after 0s
aws_iam_role.lambda_role: Creation complete after 0s
aws_iam_role.scheduler_role: Creation complete after 0s
aws_iam_role_policy.lambda_policy: Creation complete after 0s
aws_iam_role_policy.scheduler_policy: Creation complete after 0s
aws_lambda_function.check: Creation complete after 5s
aws_scheduler_schedule.hourly: Creation complete after 2s
Apply complete! Resources: 13 added, 0 changed, 0 destroyed.
→ deployed to local. outputs:
function_name = "le-rotcheck-check"
schedule_name = "le-rotcheck-hourly"
schedule_expression = "rate(1 hour)"
table_name = "le-rotcheck-audit"
fresh_secret_name = "le-rotcheck-fresh"
stale_secret_name = "le-rotcheck-stale"
ignored_secret_name = "le-rotcheck-ignored"
real 0m21.293s
Now drive the three scenarios from the top of this page by hand: inspect the schedule, invoke the Lambda directly, scan the audit table to see what it wrote. None of this waits on the clock; the whole point of testing a scheduled job is to assert the configuration and the handler behaviour separately.
$ # --- Scenario A: confirm the schedule resource itself is wired correctly. ---
$ aws --endpoint-url http://localhost:4566 scheduler get-schedule \
    --name le-rotcheck-hourly --output json
{
  "Name": "le-rotcheck-hourly",
  "State": "ENABLED",
  "ScheduleExpression": "rate(1 hour)",
  "Target": {
    "Arn": "arn:aws:lambda:us-east-1:000000000000:function:le-rotcheck-check",
    "RoleArn": "arn:aws:iam::000000000000:role/le-rotcheck-scheduler-role",
    "Input": "{\"source\":\"eventbridge-scheduler\"}"
  }
}
$ ### Scenario B: invoke the Lambda the same way the scheduler would, see what comes back.
$ aws --endpoint-url http://localhost:4566 lambda invoke \
    --function-name le-rotcheck-check --payload '{}' /tmp/sj-resp.json
{
  "StatusCode": 200,
  "ExecutedVersion": "$LATEST"
}
$ cat /tmp/sj-resp.json | jq
{
  "run_id": "run-1779554000991",
  "checked": 2,
  "flagged": 0
}
# checked=2: both Managed=true secrets were visited. The Managed=false one
# was skipped before the audit-write line. flagged=0 because both secrets
# were just created and their LastChangedDate is fresh; in a real run after
# 30+ days, secrets that have not been rotated would show up here.
$ ### Scenario C: scan the audit table to see the rows the invocation wrote.
$ aws --endpoint-url http://localhost:4566 dynamodb scan \
    --table-name le-rotcheck-audit --output table \
    --query 'Items[*].[secret_name.S,run_id.S,status.S,checked_at.S]'
+-------------------+-------------------+----+----------------------------------+
| le-rotcheck-fresh | run-1779554000991 | ok | 2026-05-23T16:33:20.991779+00:00 |
| le-rotcheck-stale | run-1779554000991 | ok | 2026-05-23T16:33:20.991779+00:00 |
+-------------------+-------------------+----+----------------------------------+
# Two rows for one invocation. The "ignored" secret has no row.
# ISO-8601 UTC timestamp on every row, the same value the test asserts.
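The properties checked by eye in Scenario C (the expected secrets, one shared run_id, a UTC ISO-8601 timestamp) can also be written as plain assertions over the scanned rows. This is a hedged sketch and may differ from the project's actual test: check_audit_rows is a hypothetical helper, and the rows are assumed to be already flattened into plain dicts.

```python
import datetime as dt

def check_audit_rows(rows, expected_names):
    """Raise AssertionError unless rows look like the output of one invocation."""
    names = sorted(r["secret_name"] for r in rows)
    assert names == sorted(expected_names), f"unexpected secrets: {names}"

    run_ids = {r["run_id"] for r in rows}
    assert len(run_ids) == 1, f"rows span several runs: {run_ids}"

    for r in rows:
        # fromisoformat rejects malformed strings; utcoffset() is None for
        # naive timestamps, so both cases fail the comparison below.
        ts = dt.datetime.fromisoformat(r["checked_at"])
        assert ts.utcoffset() == dt.timedelta(0), f"not UTC: {r['checked_at']}"
    return True

# The two rows from the scan output above.
rows = [
    {"secret_name": "le-rotcheck-fresh", "run_id": "run-1779554000991",
     "status": "ok", "checked_at": "2026-05-23T16:33:20.991779+00:00"},
    {"secret_name": "le-rotcheck-stale", "run_id": "run-1779554000991",
     "status": "ok", "checked_at": "2026-05-23T16:33:20.991779+00:00"},
]
print(check_audit_rows(rows, ["le-rotcheck-fresh", "le-rotcheck-stale"]))  # True
```

Passing only the two Managed=true names as expected_names is what makes a stray row for the ignored secret fail the check.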
The same three scenarios are baked into tests/test_scheduled_job.py as one pytest assertion each, running end to end in under three seconds:
$ ./scripts/test.sh local
============================= test session starts ==============================
platform darwin -- Python 3.13.12, pytest-9.0.3
collected 3 items
tests/test_scheduled_job.py::test_schedule_is_configured PASSED
tests/test_scheduled_job.py::test_lambda_flags_managed_secrets PASSED
tests/test_scheduled_job.py::test_audit_row_carries_status_and_timestamp PASSED
============================== 3 passed in 2.82s ==============================
Tear it back down:
$ ./scripts/teardown.sh local
Destroy complete! Resources: 13 destroyed.
→ verifying teardown for prefix 'le-rotcheck' on local
clean: nothing left behind
real 0m8.193s
Deploy 21 seconds, tests 3 seconds, teardown 8 seconds. Roughly half a minute for a complete cycle, repeatable as many times as you like.
4. The same code on real AWS
Same three scripts, aws instead of local:
$ ./scripts/deploy.sh aws
$ ./scripts/test.sh aws
$ ./scripts/teardown.sh aws
One DynamoDB gotcha is worth knowing before you put the same query in your dashboard service. The audit table has a composite primary key (secret_name, run_id), and the obvious shortcut "query by partition key, filter by sort key" is exactly what real AWS does not allow:
from boto3.dynamodb.conditions import Key

# table is a boto3 DynamoDB Table resource; secret_name and run_id are strings.

# The wrong way: FilterExpression cannot reference primary-key attributes.
table.query(
    KeyConditionExpression=Key("secret_name").eq(secret_name),
    FilterExpression="run_id = :rid",  # run_id is the SORT key
    ExpressionAttributeValues={":rid": run_id},
)
# Real DynamoDB rejects this with:
#   ValidationException: Filter Expression can only contain non-primary
#   key attributes: Primary key attribute: run_id

# The right way: both primary-key attributes belong in KeyConditionExpression.
table.query(
    KeyConditionExpression=(
        Key("secret_name").eq(secret_name) & Key("run_id").eq(run_id)
    ),
)
Both LocalEmu and real DynamoDB reject the first form with identical ValidationException text. Always put primary-key attributes in KeyConditionExpression; leave FilterExpression for everything else. The third test in this tutorial uses the right form, which is why it passes on both targets.
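For a dashboard service that talks to the low-level client instead of the Table resource, the same rule can be expressed as plain request parameters. This is a sketch under assumptions: the helper names audit_row_query and secret_history_query are hypothetical, the kwargs are meant for boto3.client("dynamodb").query(**kwargs), and the newest-first ordering relies on run_id values keeping the fixed-width run-<milliseconds> form the handler generates.

```python
def audit_row_query(table_name, secret_name, run_id):
    """Query kwargs for exactly one (secret_name, run_id) row."""
    return {
        "TableName": table_name,
        # Both key attributes live in the key condition, never in a filter.
        "KeyConditionExpression": "secret_name = :s AND run_id = :r",
        "ExpressionAttributeValues": {
            ":s": {"S": secret_name},
            ":r": {"S": run_id},
        },
    }

def secret_history_query(table_name, secret_name, limit=10):
    """Query kwargs for the most recent runs of one secret (one partition)."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "secret_name = :s",
        "ExpressionAttributeValues": {":s": {"S": secret_name}},
        # Descending sort-key order; run ids sort by time while the
        # millisecond suffix has a constant number of digits.
        "ScanIndexForward": False,
        "Limit": limit,
    }

kwargs = audit_row_query("le-rotcheck-audit", "le-rotcheck-fresh", "run-1779554000991")
print(kwargs["KeyConditionExpression"])  # secret_name = :s AND run_id = :r
```

Neither helper sets a FilterExpression, so there is no way to place a key attribute in the wrong expression by accident.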
Other places where the same Terraform behaves a bit differently between LocalEmu and real AWS:
- Secrets Manager recovery window. On real AWS, the default 30-day recovery window (the minimum is 7 days) means a terraform destroy followed by a re-deploy fails because the secret name is still reserved. Set recovery_window_in_days = 0 in non-production so the destroy is fully clean. This tutorial does that for all three seeded secrets.
- Scheduler timing on real AWS is not perfectly aligned: a rate(1 hour) schedule can fire a few seconds early or late, especially with a flexible_time_window configured. Tests should never assert on the wall-clock interval between fires; assert configuration plus per-invocation behaviour, as this tutorial does.
- IAM propagation. The scheduler role on real AWS takes a few seconds to become usable. Terraform's dependency ordering handles this on the first apply. LocalEmu propagation is instant.
Broader comparison in LocalEmu vs Real AWS and Known Limitations.
Get the full project
git clone https://github.com/localemu/localemu-examples
The scheduled-job tutorial lives in 02-scheduled-job/ with the Terraform, the Lambda, the three integration tests, and the deploy / test / teardown scripts that produced every terminal output on this page.