Latency Simulation

Add per-service latency to AWS API responses so timeout handling and performance budgets behave the way they will in front of real AWS, without leaving your laptop. Two modes: realistic per-service profiles with Gaussian jitter, or a single fixed-millisecond delay applied to every call.

What the demo does. Start LocalEmu with SIMULATE_LATENCY=1 (realistic per-service profiles) or SIMULATE_LATENCY=50 (flat 50 ms). Call DynamoDB, S3 and SQS with awsemu, then time DynamoDB and S3 from the Python SDK in 20-call loops, and watch the injected delay match the profile table below. The mean / stddev pairs in the table are read verbatim from src/localemu/aws/handlers/latency.py, so the page and the code stay in sync.

Per-Service Profiles

DynamoDB ~8ms, S3 ~30-80ms, Lambda Invoke ~150ms. Each service gets a profile that approximates real AWS.

Gaussian Jitter

Not flat delays. Each call samples from a normal distribution (mean +/- stddev) for realistic variance.

Fixed Mode

Set SIMULATE_LATENCY=50 for a flat 50ms on every call. Perfect for timeout testing and performance budgets.

Enable latency simulation

Mode 1: Per-service realistic profiles

$ SIMULATE_LATENCY=1 localemu start

# Per-service realistic latency profiles enabled
# DynamoDB ~8ms, S3 ~30-80ms, SQS ~15-25ms, Lambda ~150ms, etc.

SIMULATE_LATENCY=1 enables per-service latency profiles. Each service gets a realistic delay based on real AWS measurements. Each call's sampled delay is clamped to [1.0 ms, 5 × mean] so worst-case latency stays bounded.

Mode 2: Fixed latency for all services

$ SIMULATE_LATENCY=50 localemu start

# Fixed 50ms latency added to every API response
# Useful for consistent testing and performance budgets

SIMULATE_LATENCY=50 adds a fixed 50ms delay to every API response. Values are in milliseconds (decimals allowed). The fixed value is applied verbatim with no jitter or clamping.
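A minimal sketch of how the variable's value could be mapped to a mode. The function name, the returned tuples, the case-insensitive matching, and the handling of unrecognized strings are all assumptions made for illustration. The page only states that 1 selects the per-service profiles, that other numbers are fixed milliseconds (decimals allowed), and that 0 / false / no leave the feature off.

```python
import math

# Values that leave latency simulation off. Treating an unset or empty
# variable as "off" is an assumption based on "off by default".
OFF_VALUES = {"", "0", "false", "no"}


def parse_simulate_latency(raw):
    """Return ("off", None), ("profiles", None) or ("fixed", milliseconds)."""
    value = (raw or "").strip().lower()
    if value in OFF_VALUES:
        return ("off", None)
    if value == "1":
        # The literal "1" selects the realistic per-service profiles.
        return ("profiles", None)
    try:
        ms = float(value)
    except ValueError:
        # Assumption: this sketch rejects unknown strings rather than guess.
        raise ValueError(f"unrecognized SIMULATE_LATENCY value: {raw!r}") from None
    if not math.isfinite(ms) or ms < 0:
        raise ValueError(f"SIMULATE_LATENCY must be a finite, non-negative number: {raw!r}")
    if ms == 0:
        return ("off", None)
    return ("fixed", ms)
```

In this sketch, parse_simulate_latency("50") yields a fixed 50.0 ms delay and parse_simulate_latency("1") selects the profiles. How LocalEmu itself treats edge cases such as "1.0" or "true" is not stated on this page.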

Per-service latency profiles

Service / Operation         Mean Latency   Std Dev
DynamoDB (default)          8ms            2ms
DynamoDB Query              10ms           3ms
DynamoDB Scan               15ms           5ms
S3 (default)                50ms           15ms
S3 GetObject                30ms           10ms
S3 PutObject                80ms           20ms
SQS (default)               20ms           5ms
SQS SendMessage             15ms           4ms
SQS ReceiveMessage          25ms           6ms
Lambda (default)            60ms           15ms
Lambda Invoke               150ms          40ms
EC2 (default)               60ms           20ms
EC2 RunInstances            200ms          50ms
EC2 DescribeInstances       80ms           20ms
SNS                         25ms           6ms
CloudFormation              80ms           25ms
IAM                         20ms           5ms
STS                         10ms           3ms
KMS                         25ms           8ms
Kinesis                     30ms           10ms
CloudWatch                  35ms           10ms
Other services (default)    30ms           12ms

Every (mean, stddev) pair in this table is read directly from src/localemu/aws/handlers/latency.py. The sampled delay is clamped to the range [1 ms, 5 × mean] to rule out negative or extreme-outlier draws.
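The lookup and clamping described above can be sketched as follows. This is not the code in latency.py: the dictionary layout, the function names, and the fallback order (operation row, then service default, then the "Other services" row) are assumptions for illustration, and only a few table rows are included.

```python
import random

# Illustrative subset of the profile table: (mean_ms, stddev_ms).
# Keys are (service, operation); None marks the service default row.
PROFILES = {
    ("dynamodb", None): (8.0, 2.0),
    ("dynamodb", "Query"): (10.0, 3.0),
    ("s3", None): (50.0, 15.0),
    ("s3", "PutObject"): (80.0, 20.0),
}
FALLBACK = (30.0, 12.0)  # the "Other services (default)" row


def lookup_profile(service, operation=None):
    """Prefer an operation row, then the service default, then the fallback."""
    return PROFILES.get((service, operation)) or PROFILES.get((service, None), FALLBACK)


def sample_delay_ms(service, operation=None, rng=random):
    """Draw one delay from a normal distribution and clamp it to [1 ms, 5 x mean]."""
    mean, stddev = lookup_profile(service, operation)
    raw = rng.gauss(mean, stddev)
    return min(max(raw, 1.0), 5.0 * mean)
```

Passing a seeded random.Random instance as rng makes the draws reproducible, which helps when a test needs the same sequence of delays on every run.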

DynamoDB latency

$ time awsemu dynamodb put-item \
    --table-name test-table \
    --item '{"id": {"S": "item-1"}}'

real    0m0.265s
user    0m0.180s
sys     0m0.040s

# ~265ms total, but this includes CLI process startup,
# HTTP serialization, and response parsing overhead.
# The actual injected latency is ~8ms. Use SDK for accurate measurements.

CLI shows ~265ms total, but this includes process startup, HTTP serialization, and response parsing. The actual injected latency is ~8ms. Use the Python SDK (below) for accurate latency measurements.

S3 latency

$ time awsemu s3api put-object \
    --bucket test-bucket \
    --key "file.txt" \
    --body /dev/null

real    0m0.360s
user    0m0.190s
sys     0m0.042s

# ~360ms via CLI. Higher variance than DynamoDB, matches real AWS S3 behavior.

PutObject takes ~360ms via CLI, again including CLI overhead. The injected latency is ~80ms (the S3 PutObject profile). S3 has higher variance than DynamoDB: this matches real AWS S3 behavior, where latencies are less predictable than DynamoDB's single-digit millisecond promise.

SQS latency

$ time awsemu sqs send-message \
    --queue-url http://localhost:4566/000000000000/test-queue \
    --message-body "hello"

real    0m0.270s

$ time awsemu sqs receive-message \
    --queue-url http://localhost:4566/000000000000/test-queue

real    0m0.288s

SendMessage ~270ms, ReceiveMessage ~288ms via CLI. Again, these include CLI overhead. The injected latency is ~15ms for SendMessage and ~25ms for ReceiveMessage.

Python SDK measurement

$ python3 -c "
import boto3, time

client_ddb = boto3.client(
    'dynamodb',
    endpoint_url='http://localhost:4566',
    region_name='us-east-1'
)
client_s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:4566',
    region_name='us-east-1'
)

# DynamoDB PutItem
times_ddb = []
for i in range(20):
    start = time.monotonic()
    client_ddb.put_item(
        TableName='test-table',
        Item={'id': {'S': f'sdk-{i}'}}
    )
    times_ddb.append((time.monotonic() - start) * 1000)

# S3 ListBuckets
times_s3 = []
for i in range(20):
    start = time.monotonic()
    client_s3.list_buckets()
    times_s3.append((time.monotonic() - start) * 1000)

print('DynamoDB PutItem:')
print(f'  avg={sum(times_ddb)/len(times_ddb):.1f}ms')
print(f'  min={min(times_ddb):.1f}ms')
print(f'  max={max(times_ddb):.1f}ms')
print('S3 ListBuckets:')
print(f'  avg={sum(times_s3)/len(times_s3):.1f}ms')
print(f'  min={min(times_s3):.1f}ms')
print(f'  max={max(times_s3):.1f}ms')
"

DynamoDB PutItem:
  avg=12.2ms
  min=8.9ms
  max=15.1ms
S3 ListBuckets:
  avg=52.1ms
  min=12.7ms
  max=78.9ms

# DynamoDB matches ~8ms mean profile. S3 matches ~50ms mean profile.
# Gaussian jitter creates realistic variance around the mean.

The SDK measurements show the real picture. DynamoDB PutItem averages 12.2ms, which is the ~8ms profile plus local overhead. S3 ListBuckets averages 52.1ms, in line with the ~50ms S3 default profile. Time calls with time.monotonic() rather than time.time(): the monotonic clock is unaffected by system clock adjustments.
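To compare a run against both columns of the profile table, it helps to report the standard deviation alongside the average. The helper below is a small sketch using only the standard library. The names time_calls and summarize are hypothetical, and the clock parameter exists only so the timing loop can be tested without real delays.

```python
import statistics
import time


def time_calls(fn, n=20, clock=time.monotonic):
    """Call fn() n times and return each call's duration in milliseconds."""
    durations = []
    for _ in range(n):
        start = clock()
        fn()
        durations.append((clock() - start) * 1000.0)
    return durations


def summarize(durations_ms):
    """Mean, sample standard deviation, min and max of a list of timings."""
    return {
        "mean": statistics.fmean(durations_ms),
        "stdev": statistics.stdev(durations_ms),  # needs at least two samples
        "min": min(durations_ms),
        "max": max(durations_ms),
    }
```

With the clients from the script above, summarize(time_calls(client_s3.list_buckets)) would give the figures to set against the S3 (default) row. The observed mean is expected to sit somewhat above the profile mean because it includes local overhead.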

Fixed latency mode

$ SIMULATE_LATENCY=50 localemu start

# Every API response gets exactly 50ms added delay
# No per-service profiles, no jitter: flat and predictable

$ SIMULATE_LATENCY=200 localemu start

# 200ms delay, useful for testing timeout handling
# Verify your app handles slow responses gracefully

SIMULATE_LATENCY=50 gives every response exactly 50ms of added delay. No per-service profiles, no jitter. Useful for consistent testing, verifying performance budgets, and testing timeout handling.
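The timeout check that a flat delay enables can be sketched without LocalEmu at all. In the example below a tiny stand-in HTTP server (not LocalEmu, and not an AWS API) adds a fixed 200 ms before every response, and a client calls it once with a timeout below the delay and once with a timeout above it. All names here are hypothetical.

```python
import http.client
import http.server
import socket
import threading
import time

FIXED_DELAY_S = 0.2  # stands in for a flat SIMULATE_LATENCY=200


class DelayedHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(FIXED_DELAY_S)  # flat delay before the response is sent
        try:
            self.send_response(200)
            self.send_header("Content-Length", "2")
            self.end_headers()
            self.wfile.write(b"ok")
        except OSError:
            pass  # the client already gave up and closed the socket

    def log_message(self, *args):
        pass  # keep the demo output quiet


def call_with_timeout(port, timeout_s):
    """Return "ok" or "timeout" for one GET against the stand-in server."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout_s)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read().decode()
    except socket.timeout:
        return "timeout"
    finally:
        conn.close()


server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), DelayedHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

tight = call_with_timeout(port, 0.05)  # client budget below the delay
loose = call_with_timeout(port, 1.0)   # client budget above the delay
server.shutdown()
server.server_close()
print(tight, loose)
```

Against LocalEmu the same idea applies with an SDK client: set the client's read timeout below the fixed delay and check that your application's error handling runs.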

How It Works

Response handler

Runs after the service processes the request, before returning the response to the client. Like real network latency.

Gaussian distribution

Each call gets a random delay from a normal distribution (mean +/- stddev). Clamped to [1 ms, 5 × mean] to avoid negative or extreme outliers.

Skips internal calls

Cross-service calls (e.g. Lambda invoking DynamoDB internally) skip the delay. Only the external client request pays it, so a single client call is delayed once rather than once per internal hop.

Off by default

The handler only registers when SIMULATE_LATENCY is set to something other than 0 / false / no. Calm-mode runs pay nothing.
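Taken together, the four points above suggest a response hook along these lines. This is a sketch under stated assumptions rather than LocalEmu's handler: the request is modelled as a plain dict, the is_internal flag and the register helper are hypothetical, and the delay source is passed in as a function so the hook does not depend on any particular sampling code.

```python
import time


def make_latency_handler(sample_delay_ms, sleep=time.sleep):
    """Build a response hook that delays external requests only."""

    def on_response(request, response):
        # Cross-service (internal) calls skip the delay entirely.
        if request.get("is_internal"):
            return response
        delay_ms = sample_delay_ms(request["service"], request.get("operation"))
        # Sleep after the service has produced its response, before returning it.
        sleep(delay_ms / 1000.0)
        return response

    return on_response


def register(handlers, enabled, handler):
    """Append the hook only when latency simulation is switched on."""
    if enabled:
        handlers.append(handler)
    return handlers
```

Because the hook is never added to the chain when the feature is off, disabled runs do not even pay for a per-request check.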
