Latency Simulation
Add per-service latency to AWS API responses so timeout handling and performance budgets behave the way they will in front of real AWS, without leaving your laptop. Two modes: realistic per-service profiles with Gaussian jitter, or a single fixed-millisecond delay applied to every call.
Set SIMULATE_LATENCY=1 (realistic per-service profiles) or SIMULATE_LATENCY=50 (flat 50 ms). Hit DynamoDB, S3 and SQS with awsemu, then with the Python SDK in a 20-call loop, and watch the injected delay match the profile table below. The mean / stddev pairs in the table are read verbatim from aws/handlers/latency.py, so the page and the code cannot drift.
Per-Service Profiles
DynamoDB ~8ms, S3 ~30-80ms, Lambda Invoke ~150ms. Each service gets a profile that approximates real AWS.
Gaussian Jitter
Not flat delays. Each call samples from a normal distribution (mean +/- stddev) for realistic variance.
Fixed Mode
Set SIMULATE_LATENCY=50 for a flat 50ms on every call. Perfect for timeout testing and performance budgets.
Enable latency simulation
Mode 1: Per-service realistic profiles
$ SIMULATE_LATENCY=1 localemu start
# Per-service realistic latency profiles enabled
# DynamoDB ~8ms, S3 ~30-80ms, SQS ~15-25ms, Lambda ~150ms, etc.
SIMULATE_LATENCY=1 enables per-service latency profiles. Each service gets a realistic delay based on real AWS measurements. Each call's sampled delay is clamped to [1 ms, 5 × mean] so worst-case latency stays bounded.
Mode 2: Fixed latency for all services
$ SIMULATE_LATENCY=50 localemu start
# Fixed 50ms latency added to every API response
# Useful for consistent testing and performance budgets
SIMULATE_LATENCY=50 adds a fixed 50ms delay to every API response. Values are in milliseconds (decimals allowed). The fixed value is applied verbatim, with no jitter or clamping.
Per-service latency profiles
Every (mean, stddev) pair in this table is read directly from src/localemu/aws/handlers/latency.py. The sampled delay is clamped to the range [1 ms, 5 × mean] to rule out negative or extreme-outlier draws.
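The sampling rule described above can be sketched in a few lines. This is a minimal illustration, not the code in latency.py: the function name sample_delay_ms and the (8.0, 2.0) profile are hypothetical, and only the Gaussian draw and the [1 ms, 5 × mean] clamp come from this page.

```python
import random
from typing import Optional


def sample_delay_ms(mean_ms: float, stddev_ms: float,
                    rng: Optional[random.Random] = None) -> float:
    """Draw one delay from N(mean, stddev) and clamp it to [1 ms, 5 * mean]."""
    rng = rng or random.Random()
    raw = rng.gauss(mean_ms, stddev_ms)
    # The lower bound removes negative draws; the upper bound caps outliers.
    return min(max(raw, 1.0), 5.0 * mean_ms)


# Hypothetical profile values for illustration; the real pairs live in latency.py.
rng = random.Random(42)
samples = [sample_delay_ms(8.0, 2.0, rng) for _ in range(1000)]
print(f"min={min(samples):.1f}ms max={max(samples):.1f}ms")
```

Whatever the standard deviation, every sample stays between 1 ms and five times the mean, so a test suite can rely on a hard upper bound per call.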
DynamoDB latency
$ time awsemu dynamodb put-item \
--table-name test-table \
--item '{"id": {"S": "item-1"}}'
real 0m0.265s
user 0m0.180s
sys 0m0.040s
# ~265ms total, but this includes CLI process startup,
# HTTP serialization, and response parsing overhead.
# The actual injected latency is ~8ms. Use SDK for accurate measurements.
The CLI reports ~265ms total, but that figure includes process startup, HTTP serialization, and response parsing. The actual injected latency is ~8ms. Use the Python SDK (below) for accurate latency measurements.
S3 latency
$ time awsemu s3api put-object \
--bucket test-bucket \
--key "file.txt" \
--body /dev/null
real 0m0.360s
user 0m0.190s
sys 0m0.042s
# ~360ms via CLI. Higher variance than DynamoDB, matches real AWS S3 behavior.
PutObject takes ~360ms via the CLI. S3 has higher variance than DynamoDB, which matches real AWS behavior: S3 latencies are less predictable than DynamoDB's single-digit-millisecond promise.
SQS latency
$ time awsemu sqs send-message \
--queue-url http://localhost:4566/000000000000/test-queue \
--message-body "hello"
real 0m0.270s
$ time awsemu sqs receive-message \
--queue-url http://localhost:4566/000000000000/test-queue
real 0m0.288s
SendMessage takes ~270ms and ReceiveMessage ~288ms via the CLI. Again, these figures include CLI overhead. The injected latency is ~15ms for SendMessage and ~25ms for ReceiveMessage.
Python SDK measurement
$ python3 -c "
import boto3, time
client_ddb = boto3.client(
    'dynamodb',
    endpoint_url='http://localhost:4566',
    region_name='us-east-1'
)
client_s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:4566',
    region_name='us-east-1'
)
# DynamoDB PutItem
times_ddb = []
for i in range(20):
    start = time.monotonic()
    client_ddb.put_item(
        TableName='test-table',
        Item={'id': {'S': f'sdk-{i}'}}
    )
    times_ddb.append((time.monotonic() - start) * 1000)
# S3 ListBuckets
times_s3 = []
for i in range(20):
    start = time.monotonic()
    client_s3.list_buckets()
    times_s3.append((time.monotonic() - start) * 1000)
print(f'DynamoDB PutItem:')
print(f' avg={sum(times_ddb)/len(times_ddb):.1f}ms')
print(f' min={min(times_ddb):.1f}ms')
print(f' max={max(times_ddb):.1f}ms')
print(f'S3 ListBuckets:')
print(f' avg={sum(times_s3)/len(times_s3):.1f}ms')
print(f' min={min(times_s3):.1f}ms')
print(f' max={max(times_s3):.1f}ms')
"
DynamoDB PutItem:
avg=12.2ms
min=8.9ms
max=15.1ms
S3 ListBuckets:
avg=52.1ms
min=12.7ms
max=78.9ms
# DynamoDB matches ~8ms mean profile. S3 matches ~50ms mean profile.
# Gaussian jitter creates realistic variance around the mean.
The SDK measurements show the real picture. DynamoDB PutItem averages 12.2ms (the ~8ms profile plus local overhead). S3 ListBuckets averages 52.1ms (matching the ~50ms profile). Time calls with time.monotonic() rather than the wall clock, because the monotonic clock is unaffected by system clock adjustments.
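Once the per-call timings are collected, the mean and spread can be compared against a profile or a performance budget. The sketch below uses only the standard library. The helper names summarize and over_budget, the sample timings, and the 15 ms budget are all hypothetical and chosen for illustration.

```python
import statistics


def summarize(samples_ms):
    """Return mean, sample standard deviation, and extremes of the timings."""
    return {
        "mean": statistics.mean(samples_ms),
        "stdev": statistics.stdev(samples_ms),  # needs at least two samples
        "min": min(samples_ms),
        "max": max(samples_ms),
    }


def over_budget(samples_ms, budget_ms):
    """Return the timings that exceeded the given budget."""
    return [s for s in samples_ms if s > budget_ms]


# Hypothetical timings in milliseconds, standing in for a list such as times_ddb.
timings = [8.9, 11.4, 12.0, 13.3, 15.1]
stats = summarize(timings)
print(f"mean={stats['mean']:.1f}ms stdev={stats['stdev']:.1f}ms")
print(f"calls over a 15ms budget: {len(over_budget(timings, 15.0))}")
```

Reporting the standard deviation alongside the average makes it easier to see whether the observed jitter is in the range the profile would suggest.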
Fixed latency mode
$ SIMULATE_LATENCY=50 localemu start
# Every API response gets exactly 50ms added delay
# No per-service profiles, no jitter: flat and predictable
$ SIMULATE_LATENCY=200 localemu start
# 200ms delay, useful for testing timeout handling
# Verify your app handles slow responses gracefully
SIMULATE_LATENCY=50 gives every response exactly 50ms of added delay, with no per-service profiles and no jitter. Useful for consistent testing, verifying performance budgets, and testing timeout handling.
How It Works
Response handler
Runs after the service processes the request, before returning the response to the client. Like real network latency.
Gaussian distribution
Each call gets a random delay from a normal distribution (mean +/- stddev). Clamped to [1ms, 5x mean] to avoid negative or extreme outliers.
Skips internal calls
Cross-service calls (e.g. Lambda invoking DynamoDB internally) skip the delay. Only the external client request pays it, so one client call incurs exactly one injected delay.
Off by default
The handler only registers when SIMULATE_LATENCY is set to something other than 0 / false / no. Calm-mode runs pay nothing.
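The three behaviours of SIMULATE_LATENCY described on this page (off, per-service profiles, fixed milliseconds) can be sketched as a small parser. This is an illustration only: the function name parse_simulate_latency is hypothetical, and the handling of non-numeric values such as "true" and of non-positive numbers is an assumption, not something this page specifies.

```python
import math
from typing import Optional, Tuple


def parse_simulate_latency(value: Optional[str]) -> Tuple[str, Optional[float]]:
    """Return ("off", None), ("profiles", None), or ("fixed", milliseconds)."""
    if value is None:
        return ("off", None)
    text = value.strip().lower()
    if text in ("", "0", "false", "no"):
        return ("off", None)
    if text == "1":
        return ("profiles", None)
    try:
        ms = float(text)
    except ValueError:
        # Assumption: non-numeric truthy values such as "true" select profiles.
        return ("profiles", None)
    if not math.isfinite(ms) or ms <= 0:
        # Assumption: non-positive or non-finite values disable the delay.
        return ("off", None)
    return ("fixed", ms)


for raw in (None, "0", "1", "50", "12.5"):
    print(repr(raw), "->", parse_simulate_latency(raw))
```

In practice the value would be read with os.environ.get("SIMULATE_LATENCY") at startup, and the handler would be registered only when the mode is not "off".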