Tutorial: CI/CD Pipeline Testing with LocalEmu
Build a real application from scratch, test it locally with LocalEmu, and deploy a fully automated GitHub Actions CI pipeline. No AWS account required.
What you will build
A File Processing Service in Python that reads JSON files from S3, extracts records, and writes them to a DynamoDB table. You will test it against LocalEmu both on your machine and in GitHub Actions CI.
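To make the data flow concrete, here is a minimal sketch of the input shape and the DynamoDB item each record turns into. The helper name `to_item` and the sample values are hypothetical and used only for illustration; the field names match the application code written in Step 2.

```python
import json

# Example input file as it would be stored in S3 (sample values are illustrative).
SAMPLE_FILE = json.dumps({"records": [{"id": "r1", "name": "Alice"}]})


def to_item(record, source_key):
    """Map one input record to a DynamoDB item in attribute-value format.

    Hypothetical helper for illustration; "S" marks a string attribute.
    """
    return {
        "id": {"S": record["id"]},
        "name": {"S": record["name"]},
        "status": {"S": "processed"},
        "source_file": {"S": source_key},
    }


records = json.loads(SAMPLE_FILE).get("records", [])
items = [to_item(record, "batch-001.json") for record in records]
print(items[0]["id"])  # {'S': 'r1'}
```

Every record in the file becomes one item keyed by its id, with the S3 key recorded in source_file.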
Prerequisites
- Docker installed and running
- Python 3.10 or later
- LocalEmu installed (pip install localemu[runtime] or via Docker)
- A GitHub account (for the CI steps)
Step 1: Create the Project
Start by creating the project directory with separate folders for source code and tests.
    mkdir file-processor && cd file-processor
    mkdir -p src tests

Your project structure will look like this when you are done:

    file-processor/
      src/
        app.py
      tests/
        conftest.py
        test_app.py
      requirements.txt
      .github/
        workflows/
          test.yml

Step 2: Write the Application Code
Create the file processor in src/app.py. This function downloads a JSON file from S3, loops through the records inside it, and writes each one to a DynamoDB table. The key detail: when the AWS_ENDPOINT_URL environment variable is set, boto3 sends requests to LocalEmu instead of real AWS. When it is not set, the same code talks to real AWS. Zero code changes between environments.
    import boto3
    import json
    import os


    def get_client(service):
        """Create a boto3 client. Uses AWS_ENDPOINT_URL for LocalEmu when set."""
        kwargs = {}
        endpoint = os.environ.get("AWS_ENDPOINT_URL")
        if endpoint:
            kwargs["endpoint_url"] = endpoint
        return boto3.client(service, region_name="us-east-1", **kwargs)


    def process_file(bucket, key):
        """Read a JSON file from S3, extract records, write each to DynamoDB."""
        s3 = get_client("s3")
        dynamodb = get_client("dynamodb")

        # Download the file from S3
        response = s3.get_object(Bucket=bucket, Key=key)
        data = json.loads(response["Body"].read())

        # Write each record to DynamoDB
        written = 0
        for record in data.get("records", []):
            dynamodb.put_item(
                TableName="processed-records",
                Item={
                    "id": {"S": record["id"]},
                    "name": {"S": record["name"]},
                    "status": {"S": "processed"},
                    "source_file": {"S": key},
                },
            )
            written += 1

        return {"processed": written, "source": f"s3://{bucket}/{key}"}

Notice that there is nothing LocalEmu-specific in this code. The get_client helper checks for an environment variable and passes it to boto3 if present. This is the same pattern you would use for any custom endpoint, and it means your production code and test code are identical.
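The environment check can be isolated to see exactly what it contributes. The sketch below is illustrative only: `endpoint_kwargs` is a hypothetical helper that mirrors the logic inside get_client, and it takes the environment as a parameter so it can be exercised without touching boto3.

```python
import os


def endpoint_kwargs(environ=None):
    """Return the extra boto3 client kwargs implied by the environment.

    Hypothetical helper mirroring the check in get_client: an unset or empty
    AWS_ENDPOINT_URL yields no override, so boto3 would talk to real AWS.
    """
    environ = os.environ if environ is None else environ
    endpoint = environ.get("AWS_ENDPOINT_URL")
    return {"endpoint_url": endpoint} if endpoint else {}


# With the variable set, clients are pointed at LocalEmu.
print(endpoint_kwargs({"AWS_ENDPOINT_URL": "http://localhost:4566"}))
# Without it, no override is passed at all.
print(endpoint_kwargs({}))
```

Treating an empty string the same as an unset variable follows the `if endpoint:` check in the application code.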
Step 3: Write the Requirements File
Create requirements.txt in the project root. You only need two packages: boto3 for AWS interactions and pytest for running the tests.
    boto3
    pytest

Step 4: Write the Test Fixtures
Create tests/conftest.py with reusable pytest fixtures. These fixtures create boto3 clients that point at LocalEmu. Using scope="session" means the clients are created once and reused across all tests, which is faster than recreating them for every test function.
    import boto3
    import pytest
    import os

    ENDPOINT = os.environ.get("AWS_ENDPOINT_URL", "http://localhost:4566")

    CLIENT_KWARGS = dict(
        endpoint_url=ENDPOINT,
        aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
        aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
        region_name="us-east-1",
    )


    @pytest.fixture(scope="session")
    def s3_client():
        """Create a reusable S3 client for the entire test session."""
        return boto3.client("s3", **CLIENT_KWARGS)


    @pytest.fixture(scope="session")
    def dynamodb_client():
        """Create a reusable DynamoDB client for the entire test session."""
        return boto3.client("dynamodb", **CLIENT_KWARGS)

The endpoint is taken from the AWS_ENDPOINT_URL environment variable when it is set (as it is in CI) and otherwise defaults to http://localhost:4566 for local runs. The credentials in CLIENT_KWARGS are the example pair from AWS's own documentation, kept here just so boto3 has something to sign requests with: LocalEmu's default ROOT_ACCESS_KEYS recognises this access key as a root key, so the suite passes whether or not IAM_ENFORCEMENT=1 is enabled.
Step 5: Write the Tests
Create tests/test_app.py. Each test follows the same pattern: set up the AWS resources on LocalEmu, upload test data to S3, call the application function, and verify the results in DynamoDB.
    import json
    import sys
    import os

    import pytest

    # Add the src directory to the path so we can import app
    sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "src"))
    from app import process_file


    @pytest.fixture(autouse=True)
    def setup_resources(s3_client, dynamodb_client):
        """Create S3 bucket and DynamoDB table before each test, empty the table after."""
        # Create bucket (ignore if it already exists)
        try:
            s3_client.create_bucket(Bucket="uploads")
        except s3_client.exceptions.BucketAlreadyOwnedByYou:
            pass

        # Create table (ignore if it already exists)
        try:
            dynamodb_client.create_table(
                TableName="processed-records",
                KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
                AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
                BillingMode="PAY_PER_REQUEST",
            )
        except dynamodb_client.exceptions.ResourceInUseException:
            pass

        yield

        # Clean up: delete the items written by the test.
        # The bucket and table themselves are kept and reused.
        try:
            scan = dynamodb_client.scan(TableName="processed-records")
            for item in scan.get("Items", []):
                dynamodb_client.delete_item(
                    TableName="processed-records",
                    Key={"id": item["id"]},
                )
        except Exception:
            pass


    def test_process_file_writes_all_records(s3_client, dynamodb_client):
        """Upload a JSON file with 3 records, verify all 3 land in DynamoDB."""
        test_data = json.dumps({"records": [
            {"id": "r1", "name": "Alice"},
            {"id": "r2", "name": "Bob"},
            {"id": "r3", "name": "Charlie"},
        ]})
        s3_client.put_object(Bucket="uploads", Key="batch-001.json", Body=test_data)

        result = process_file("uploads", "batch-001.json")
        assert result["processed"] == 3

        # Verify each record
        for rid, name in [("r1", "Alice"), ("r2", "Bob"), ("r3", "Charlie")]:
            item = dynamodb_client.get_item(
                TableName="processed-records",
                Key={"id": {"S": rid}},
            )["Item"]
            assert item["name"]["S"] == name
            assert item["status"]["S"] == "processed"
            assert item["source_file"]["S"] == "batch-001.json"


    def test_process_file_handles_empty_records(s3_client):
        """An empty records array should process zero items without errors."""
        s3_client.put_object(
            Bucket="uploads",
            Key="empty.json",
            Body=json.dumps({"records": []}),
        )

        result = process_file("uploads", "empty.json")
        assert result["processed"] == 0


    def test_process_file_sets_source_file(s3_client, dynamodb_client):
        """The source_file field should match the S3 key of the input file."""
        test_data = json.dumps({"records": [
            {"id": "src-1", "name": "Diana"},
        ]})
        s3_client.put_object(Bucket="uploads", Key="report-42.json", Body=test_data)

        process_file("uploads", "report-42.json")

        item = dynamodb_client.get_item(
            TableName="processed-records",
            Key={"id": {"S": "src-1"}},
        )["Item"]
        assert item["source_file"]["S"] == "report-42.json"

The setup_resources fixture wraps every test. Before the test it creates the S3 bucket and DynamoDB table, ignoring the error if they already exist; after the test it deletes the items from the table. Every test therefore starts with an empty table, while the bucket, the table, and any uploaded S3 objects persist for the rest of the session.
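One caveat: a single scan call returns at most one page of results (on AWS, DynamoDB caps a page at 1 MB), so the cleanup above is only guaranteed to be complete for small tables. A paginated variant might look like the sketch below. The names `delete_all_items` and `key_name` are hypothetical, and the sketch assumes a table with a single hash key, as in this tutorial.

```python
def delete_all_items(client, table_name, key_name="id"):
    """Delete every item in a table, following scan pagination.

    Hypothetical helper; `client` is a boto3 DynamoDB client (or any object
    with compatible scan/delete_item methods). Returns the number deleted.
    """
    deleted = 0
    scan_kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**scan_kwargs)
        for item in page.get("Items", []):
            client.delete_item(
                TableName=table_name,
                Key={key_name: item[key_name]},
            )
            deleted += 1
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return deleted
        # Resume the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key
```

In the fixture, this would replace the scan-and-delete loop with a call such as delete_all_items(dynamodb_client, "processed-records").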
There are three tests:
- test_process_file_writes_all_records - uploads a file with three records, runs the processor, and verifies all three records appear in DynamoDB with the correct fields.
- test_process_file_handles_empty_records - uploads a file with an empty records array and verifies the function handles it gracefully, returning zero processed.
- test_process_file_sets_source_file - verifies that the source_file metadata field in DynamoDB matches the original S3 key.
Step 6: Test Locally
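Before pushing anything to GitHub, run the tests on your machine. Start LocalEmu, install dependencies, and run pytest. The clients created in src/app.py rely on boto3's default credential chain, so if you have no AWS credentials configured, export the same example key pair the fixtures use before running the tests.

    localemu start -d
    pip install -r requirements.txt
    export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
    export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    AWS_ENDPOINT_URL=http://localhost:4566 pytest tests/ -v

You should see output like this: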
    collected 3 items
    tests/test_app.py::test_process_file_writes_all_records PASSED
    tests/test_app.py::test_process_file_handles_empty_records PASSED
    tests/test_app.py::test_process_file_sets_source_file PASSED
    ============================== 3 passed in 0.35s ==============================

All three tests pass. The S3 bucket, the DynamoDB table, the file uploads, and the record writes all happened inside LocalEmu on your machine. No AWS account was involved. When you are done testing, stop LocalEmu:

    localemu stop

Step 7: Create the GitHub Repository
Initialize a git repository and make your first commit. Make sure you are in the file-processor directory.
    git init
    git add src/ tests/ requirements.txt
    git commit -m "Initial commit: file processing service with LocalEmu tests"

Notice that you are not committing the workflow file yet. You will add that in the next step.
Step 8: Create the GitHub Actions Workflow
Create the workflow file at .github/workflows/test.yml. This tells GitHub Actions to start a LocalEmu container alongside your test runner, install your Python dependencies, and run pytest.
    mkdir -p .github/workflows

Then save the following as .github/workflows/test.yml:

    name: Integration Tests
    on: [push, pull_request]
    jobs:
      test:
        runs-on: ubuntu-latest
        services:
          localemu:
            image: localemu/localemu:latest
            ports:
              - 4566:4566
            options: >-
              --health-cmd "curl -f http://localhost:4566/_localemu/health"
              --health-interval 10s
              --health-timeout 5s
              --health-retries 5
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with:
              python-version: '3.12'
          - name: Install dependencies
            run: pip install -r requirements.txt
          - name: Run integration tests
            run: pytest tests/ -v --tb=short
            env:
              AWS_ENDPOINT_URL: http://localhost:4566
              AWS_ACCESS_KEY_ID: AKIAIOSFODNN7EXAMPLE
              AWS_SECRET_ACCESS_KEY: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
              AWS_DEFAULT_REGION: us-east-1

Here is what each part does:
- services.localemu - GitHub Actions starts LocalEmu as a sidecar container. It runs alongside your test steps and is automatically destroyed when the job finishes.
- health-cmd - The health check pings the LocalEmu health endpoint every 10 seconds. GitHub will not start running your test steps until the health check passes. This replaces any need for sleep commands.
- AWS_ENDPOINT_URL - This environment variable tells boto3 to send all AWS API calls to http://localhost:4566 (the LocalEmu container) instead of real AWS.
- AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY - boto3 always wants a credential pair on the wire. The values are the example pair from AWS's own documentation; LocalEmu's default ROOT_ACCESS_KEYS matches the access key, so the same workflow runs under IAM_ENFORCEMENT=1 without change.
Now commit the workflow file:
    git add .github/
    git commit -m "Add GitHub Actions workflow for LocalEmu integration tests"

Step 9: Push to GitHub
Create a new repository on GitHub (you can do this through the GitHub web interface or the gh CLI), then push your code.
    git remote add origin https://github.com/yourname/file-processor.git
    git branch -M main
    git push -u origin main

Replace yourname with your actual GitHub username.
Step 10: Watch CI Run
Go to your repository on GitHub and click the Actions tab. You will see a workflow run triggered by your push. Here is what happens behind the scenes:
GitHub provisions a runner
An Ubuntu VM spins up with Docker pre-installed.
LocalEmu container starts
GitHub pulls the localemu/localemu:latest image and starts it. The health check runs every 10 seconds until LocalEmu responds on port 4566.
Your code is checked out and dependencies installed
GitHub runs pip install -r requirements.txt to install boto3 and pytest.
Tests run against LocalEmu
pytest creates the S3 bucket and DynamoDB table on LocalEmu, uploads test files, calls your application code, and verifies the results. The exact same tests that passed on your laptop run identically in CI.
Green checkmark
All tests pass. The LocalEmu container is automatically destroyed. No resources to clean up, no AWS bill, no credentials to rotate.
From now on, every push and every pull request will automatically run your integration tests. If a test fails, the PR gets a red X and the team knows immediately.
Why This Approach Works
Running integration tests against real AWS in CI is slow, costly, and prone to flakiness. LocalEmu addresses each of the pain points below.
| | Real AWS in CI | LocalEmu in CI |
|---|---|---|
| Test suite time | Minutes, dominated by AWS API round-trips and the runner-to-region hop | Seconds for an app-level suite; every call stays on the runner |
| AWS credentials | Required (IAM keys or OIDC) | No real credentials (the documented example key pair is enough) |
| Cloud cost per run | $0.10 - $2.00+ | $0.00 |
| Test isolation | Shared account, resource conflicts | Fully isolated per job |
| Flakiness | Eventual consistency, rate limits | Deterministic, no network dependency |
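To put the cost row in perspective, a rough back-of-the-envelope calculation multiplies the per-run range from the table by a monthly run count. The figure of 500 runs per month below is made up for illustration, and `monthly_cost_range` is a hypothetical helper.

```python
def monthly_cost_range(runs_per_month, low_per_run=0.10, high_per_run=2.00):
    """Return (low, high) monthly cost in dollars for a number of CI runs.

    The defaults are the per-run range quoted in the table; the table lists
    the upper figure as "$2.00+", so the high value is not a hard ceiling.
    """
    low = round(runs_per_month * low_per_run, 2)
    high = round(runs_per_month * high_per_run, 2)
    return (low, high)


print(monthly_cost_range(500))  # (50.0, 1000.0)
```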
Tips for Production Use
Pin your LocalEmu version
Use a specific tag like localemu/localemu:0.5.0 instead of latest in CI. This prevents unexpected behavior when new versions are released.
Use health checks, not sleep
The service container health check in the workflow above is the correct way to wait for readiness. Never add sleep 30 to your workflow steps.
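For local runs outside GitHub Actions, the same idea can be approximated by polling the health endpoint until it answers. This is a sketch under stated assumptions, not a LocalEmu API: `wait_until_healthy`, `http_probe`, and their parameters are hypothetical, the URL is the one used by the workflow's health-cmd, and an HTTP 200 is assumed to mean the service is ready.

```python
import time
import urllib.error
import urllib.request

# Same URL as the workflow's health-cmd.
HEALTH_URL = "http://localhost:4566/_localemu/health"


def http_probe(url):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url=HEALTH_URL, timeout=60.0, interval=1.0, probe=http_probe):
    """Poll probe(url) until it succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Unlike a fixed sleep, this returns as soon as the service answers and reports failure explicitly if it never does. The probe is passed in as a parameter so the loop can be tested without a running container.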
Keep application code environment-agnostic
Use AWS_ENDPOINT_URL as the only difference between test and production. Your application code should never import or reference LocalEmu directly.
Split large test suites into parallel jobs
Because each LocalEmu container is fully isolated, you can run multiple test jobs in parallel. Each job gets its own fresh AWS environment with no resource conflicts.
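One way to split a suite is to assign test files to jobs deterministically. The sketch below is an assumption about how you might do it, not something LocalEmu or GitHub Actions provides: `shard` is a hypothetical helper, the extra file names are invented, and the job index and total would come from a matrix variable you define yourself.

```python
def shard(test_files, index, total):
    """Return the test files that job `index` (0-based) of `total` should run.

    Files are sorted first so every job computes the same assignment.
    """
    if not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    return sorted(test_files)[index::total]


# Hypothetical file names; only tests/test_app.py exists in this tutorial.
files = ["tests/test_app.py", "tests/test_queue.py", "tests/test_events.py"]
print(shard(files, 0, 2))  # ['tests/test_app.py', 'tests/test_queue.py']
print(shard(files, 1, 2))  # ['tests/test_events.py']
```

Each job would then pass its own file list to pytest, and every file runs in exactly one job.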
Test Terraform too
You can validate Terraform configurations in CI using the same LocalEmu container. See the Terraform Infrastructure Testing guide for details.
Next Steps
You now have a complete, working CI pipeline that tests your AWS-dependent application without real AWS credentials or cloud costs. From here you can:
- Add more AWS services to your application (SQS, SNS, Lambda) and test them the same way.
- Set up local development workflows so your whole team develops against LocalEmu.
- Add Terraform validation to your CI pipeline.
- Build event-driven microservices with SQS, SNS, and Lambda, all tested locally.