
Build an S3-triggered image pipeline and iterate it on your laptop

You are working on the user-profile feature of a small SaaS. Users keep uploading multi-megabyte photos as avatars and the app keeps serving the originals as-is. The product manager wants a fix: when a user uploads an image, the system should generate a small thumbnail automatically and serve that instead. Your team's pattern for this kind of thing is the standard AWS shape: an S3 bucket for incoming uploads, a Lambda fired by s3:ObjectCreated:* that makes the thumbnail with Pillow, a second S3 bucket for the output, and an SNS topic so the rest of the team's services can subscribe to the "avatar changed" event without redeploying this pipeline.

Building it against real AWS is slow. Every time you change a line in the handler you wait for the Lambda zip to upload, for the new version to swap in, then you upload another test image to your sandbox bucket and tail CloudWatch Logs to see what happened. Native Python wheels (Pillow is one) add another wrinkle: the wheel that works on your macOS or Windows laptop will not load inside the AWS Lambda runtime, and finding out involves a real deploy.

This tutorial builds the same pipeline end to end on LocalEmu: 8 Terraform resources, one 77-line handler, two integration tests, no AWS account. Deploy in about 20 seconds, upload a real PNG and see the thumbnail in the processed bucket a second or two later, then tear the whole thing down in about 6 seconds. The same Terraform and the same Lambda zip deploy to real AWS when you pass aws instead of local to the deploy script.

What you will have working at the end

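Three things, all running locally, with the real terminal output captured later on this page:

A. Upload an image to the incoming bucket and get a thumbnail back from the processed bucket.
B. An SNS notification carrying a JSON summary of each processed image.
C. A CloudWatch FilesProcessed metric that increments for each processed image.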

Architecture

user  ─PutObject─▶ s3: le-images-incoming
                           │  s3:ObjectCreated:*  (.png / .jpg / .jpeg)
                           ▼
                 lambda: le-images-processor     (Python 3.12 + Pillow)
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
s3: le-images-    sns: le-images-     cloudwatch:
processed         notifications       FilesProcessed metric
thumbnails/*           │
                       └─▶ subscribers (SQS, Lambda, e-mail, ...)

1. The handler

The handler is one Python file. S3 sends a JSON event whose Records list carries one entry per uploaded object; the handler iterates, downloads each object, generates the thumbnail with PIL.Image.thumbnail(), writes it to the processed bucket, publishes a JSON summary to SNS, and increments a CloudWatch counter. Errors are caught per-record so one bad image does not kill the batch:

src/handler.py
# src/handler.py: triggered by s3:ObjectCreated:* on the incoming bucket.

import io
import json
import logging
import os
from urllib.parse import unquote_plus

import boto3
from PIL import Image

LOG = logging.getLogger(__name__)

PROCESSED_BUCKET = os.environ["PROCESSED_BUCKET"]
SNS_TOPIC_ARN    = os.environ["SNS_TOPIC_ARN"]
METRIC_NAMESPACE = os.environ.get("METRIC_NAMESPACE", "LocalEmuImagePipeline")
THUMBNAIL_MAX    = int(os.environ.get("THUMBNAIL_MAX_DIM", "128"))

_s3  = boto3.client("s3")
_sns = boto3.client("sns")
_cw  = boto3.client("cloudwatch")


def handle(event, _ctx):
    # One S3 event can carry multiple records. Catch errors per-record so
    # a single bad upload cannot take down the whole batch.
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications ("my photo.png"
        # arrives as "my+photo.png"), so decode before calling GetObject.
        key    = unquote_plus(record["s3"]["object"]["key"])
        try:
            results.append({"key": key, "ok": True,
                            "summary": _process_one(bucket, key)})
        except Exception as exc:
            LOG.exception("failed to process %s", key)
            results.append({"key": key, "ok": False, "error": str(exc)})
    return {"processed": len(results), "results": results}


def _process_one(bucket, key):
    src = _s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    img = Image.open(io.BytesIO(src))
    fmt = img.format or "PNG"
    original = img.size
    img.thumbnail((THUMBNAIL_MAX, THUMBNAIL_MAX))  # aspect ratio preserved

    out = io.BytesIO()
    img.save(out, format=fmt)
    thumb = out.getvalue()
    thumb_key = f"thumbnails/{key}"
    _s3.put_object(Bucket=PROCESSED_BUCKET, Key=thumb_key, Body=thumb,
                   ContentType=f"image/{fmt.lower()}")

    summary = {"source":    {"bucket": bucket, "key": key,
                            "width": original[0], "height": original[1]},
               "thumbnail": {"bucket": PROCESSED_BUCKET, "key": thumb_key,
                            "width": img.size[0], "height": img.size[1]}}

    _sns.publish(TopicArn=SNS_TOPIC_ARN, Message=json.dumps(summary))
    _cw.put_metric_data(Namespace=METRIC_NAMESPACE, MetricData=[{
        "MetricName": "FilesProcessed", "Value": 1, "Unit": "Count"}])
    return summary

Three things in this code are worth pointing at. First, boto3 reads AWS_ENDPOINT_URL from the environment automatically, and LocalEmu sets that variable in every Lambda container it spawns. The handler does not have a single conditional on "are we local or in AWS". Same zip, both worlds. Second, img.thumbnail() modifies the image in place and preserves the aspect ratio: an 800x600 source becomes 128x96, not a squashed 128x128. Third, the try/except wraps one record at a time, so a corrupt or unsupported upload returns one failed result instead of failing the entire Lambda invocation. S3 invokes the function asynchronously, so a failed invocation would be retried by Lambda (twice by default), reprocessing the good records along with the bad one, and the whole event could end up in a dead-letter queue if one is configured.
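To make the per-record isolation concrete, the sketch below feeds a hand-written event with two records through the same loop shape as handle(). The event is trimmed to the fields the handler reads, and fake_process is a hypothetical stand-in for _process_one that fails on one key; neither is part of the project.

```python
import json

# A trimmed S3 notification: only the fields the handler reads.
# Real events carry more (eventName, awsRegion, object size, eTag, ...).
SAMPLE_EVENT = {
    "Records": [
        {"s3": {"bucket": {"name": "le-images-incoming"},
                "object": {"key": "good.png"}}},
        {"s3": {"bucket": {"name": "le-images-incoming"},
                "object": {"key": "corrupt.png"}}},
    ]
}


def fake_process(bucket, key):
    """Hypothetical stand-in for _process_one: fails on one key."""
    if key == "corrupt.png":
        raise ValueError("cannot identify image file")
    return {"source": {"bucket": bucket, "key": key}}


def handle_sketch(event, process=fake_process):
    """Same loop shape as handle(): one try/except per record."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        try:
            results.append({"key": key, "ok": True,
                            "summary": process(bucket, key)})
        except Exception as exc:
            results.append({"key": key, "ok": False, "error": str(exc)})
    return {"processed": len(results), "results": results}


if __name__ == "__main__":
    print(json.dumps(handle_sketch(SAMPLE_EVENT), indent=2))
```

The return value reports both records: the first with ok set to True, the second with ok set to False and the error text. Note that processed counts attempted records, failures included.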

2. The infrastructure

Eight Terraform resources: two S3 buckets (incoming + processed), one SNS topic, the Lambda function with its IAM role and policy, a Lambda permission letting S3 invoke it, and the S3 bucket notification that ties the upload event to the function. Nothing in the resources knows about LocalEmu; the local-versus-AWS switch lives in a single provider block, gated by a target variable (see the REST API tutorial for that block in full).

The S3 notification is the one piece of Terraform that surprises people the first time. S3 event filters do not OR across extensions, so each suffix gets its own lambda_function block inside a single notification resource:

terraform/main.tf (S3 notification)
# terraform/main.tf: one S3 notification, three suffix filters.
# S3 event filters do not OR across extensions, so one block per suffix.

resource "aws_s3_bucket_notification" "incoming" {
  bucket = aws_s3_bucket.incoming.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_suffix       = ".png"
  }
  lambda_function {
    lambda_function_arn = aws_lambda_function.processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_suffix       = ".jpg"
  }
  lambda_function {
    lambda_function_arn = aws_lambda_function.processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_suffix       = ".jpeg"
  }

  depends_on = [aws_lambda_permission.allow_s3]
}
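The combined effect of the three blocks can be modelled in a few lines of Python. This is only a sketch of the matching rule, assuming S3 compares suffixes as literal, case-sensitive strings; triggers_lambda is a hypothetical helper, not something the project ships.

```python
SUFFIXES = (".png", ".jpg", ".jpeg")  # the three filter_suffix values above


def triggers_lambda(key, suffixes=SUFFIXES):
    """True if any one of the suffix filters matches the object key.

    Assumption: matching is a plain, case-sensitive string suffix test.
    """
    return key.endswith(suffixes)  # str.endswith accepts a tuple


if __name__ == "__main__":
    for key in ("avatars/u1.png", "avatars/u2.jpeg", "photo.PNG", "notes.txt"):
        print(f"{key:20} {triggers_lambda(key)}")
```

Under that assumption an upload named photo.PNG would not fire the function, so a real deployment may want either extra suffix blocks or a filter inside the handler.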

The Lambda's IAM role gets exactly what the handler needs: s3:GetObject on the incoming bucket, s3:PutObject on the processed bucket, sns:Publish on the topic, cloudwatch:PutMetricData, and the logs permissions. LocalEmu can evaluate IAM policies the same way real AWS does (the IAM enforcement docs explain how to turn enforcement on); with enforcement on, dropping s3:GetObject from the role and redeploying reproduces the same AccessDenied error you would see on real AWS.
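As a rough picture of that policy, here is the equivalent JSON document built in Python. The project defines the real policy in Terraform, which is not shown on this page, so treat this as a sketch: the bucket names and topic ARN are taken from the deploy output, the three logs actions are an assumption about what "the logs permissions" covers, and lambda_policy is a hypothetical helper.

```python
import json

INCOMING = "le-images-incoming"
PROCESSED = "le-images-processed"
TOPIC_ARN = "arn:aws:sns:us-east-1:000000000000:le-images-notifications"


def lambda_policy(incoming=INCOMING, processed=PROCESSED, topic_arn=TOPIC_ARN):
    """Least-privilege policy document for the thumbnail handler (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": f"arn:aws:s3:::{incoming}/*"},
            {"Effect": "Allow", "Action": "s3:PutObject",
             "Resource": f"arn:aws:s3:::{processed}/*"},
            {"Effect": "Allow", "Action": "sns:Publish",
             "Resource": topic_arn},
            # PutMetricData has no resource-level scoping, hence "*".
            {"Effect": "Allow", "Action": "cloudwatch:PutMetricData",
             "Resource": "*"},
            # Assumed set of logs actions; the source only says
            # "the logs permissions".
            {"Effect": "Allow",
             "Action": ["logs:CreateLogGroup", "logs:CreateLogStream",
                        "logs:PutLogEvents"],
             "Resource": "arn:aws:logs:*:*:*"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(lambda_policy(), indent=2))
```

Removing the first statement is the experiment described above: with enforcement on, the handler's get_object call should then be denied.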

3. Building the Lambda zip with Pillow

Pillow ships as a compiled native wheel. The wheel pip install gives you on macOS or Windows will not load inside an AWS Lambda container, which runs Amazon Linux x86_64. The fix is one underused pip flag, --platform, which pulls the manylinux build cross-platform:

scripts/build.sh
# scripts/build.sh: build a Lambda zip with Linux wheels for Pillow,
# regardless of whether the host is macOS, Linux, or Windows.
#
# pip's --platform flag pulls the right manylinux wheels cross-platform,
# so no Docker is needed just to build the deployment package.

pip install --quiet \
  --platform manylinux2014_x86_64 \
  --only-binary=:all: \
  --python-version 3.12 \
  --target "$BUILD/pkg" \
  -r "$HERE/src/requirements.txt"

cp "$HERE/src/handler.py" "$BUILD/pkg/"
( cd "$BUILD/pkg" && zip -qr "$BUILD/lambda.zip" . )
# produces build/lambda.zip (~8 MB) that runs on both LocalEmu and real AWS

deploy.sh runs build.sh automatically whenever the handler or its requirements change. The resulting build/lambda.zip is the same artifact uploaded to both LocalEmu and real AWS.
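If you want to confirm that the zip really contains Linux builds before deploying, the archive can be inspected with the standard library alone. The sketch below relies on the usual CPython extension-module filename suffixes (-linux-gnu.so on Linux, darwin in the name on macOS, .pyd on Windows); classify and extension_report are hypothetical helpers, not part of build.sh.

```python
import zipfile


def classify(name):
    """Classify one archive member by its extension-module filename."""
    base = name.rsplit("/", 1)[-1]
    if base.endswith("-linux-gnu.so"):
        return "linux"      # e.g. _imaging.cpython-312-x86_64-linux-gnu.so
    if base.endswith(".pyd") or (base.endswith(".so") and "darwin" in base):
        return "foreign"    # Windows or macOS build, will not import on Lambda
    return None             # pure Python, data files, bundled shared libs


def extension_report(zip_path):
    """Group the compiled extension modules in a zip by target platform."""
    report = {"linux": [], "foreign": []}
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            kind = classify(name)
            if kind:
                report[kind].append(name)
    return report


if __name__ == "__main__":
    import sys
    result = extension_report(sys.argv[1])
    print(f"linux: {len(result['linux'])}  foreign: {len(result['foreign'])}")
    sys.exit(1 if result["foreign"] else 0)
```

Run it as python check_zip.py build/lambda.zip (the script name is arbitrary); a non-empty foreign list means a host-platform wheel slipped into the package.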

4. Run the pipeline on your laptop

Clone the project, start LocalEmu in another terminal, then deploy:

$ git clone https://github.com/localemu/localemu-examples
$ cd localemu-examples/04-image-pipeline
$ localemu start   # in a separate terminal
$ ./scripts/deploy.sh local

Eight resources apply in roughly 20 seconds. Most of that time is Lambda's container cold-start (LocalEmu pulls public.ecr.aws/lambda/python:3.12 the first time and reuses it after):

Terminal: deploy
$ ./scripts/deploy.sh local
 built /Users/.../04-image-pipeline/build/lambda.zip

aws_s3_bucket.incoming:              Creation complete after 0s
aws_s3_bucket.processed:             Creation complete after 0s
aws_sns_topic.notifications:         Creation complete after 0s
aws_iam_role.lambda_role:            Creation complete after 0s
aws_iam_role_policy.lambda_policy:   Creation complete after 0s
aws_lambda_function.processor:       Creation complete after 5s
aws_lambda_permission.allow_s3:      Creation complete after 0s
aws_s3_bucket_notification.incoming: Creation complete after 0s

Apply complete! Resources: 8 added, 0 changed, 0 destroyed.

 deployed to local. outputs:
function_name    = "le-images-processor"
incoming_bucket  = "le-images-incoming"
processed_bucket = "le-images-processed"
topic_arn        = "arn:aws:sns:us-east-1:000000000000:le-images-notifications"

real    0m19.475s

Now drive scenario A by hand: build a test PNG with Pillow, upload it to the incoming bucket with the AWS CLI pointed at LocalEmu, then read the thumbnail back from the processed bucket:

Terminal: upload a PNG, get a thumbnail back
$ # A real 800x600 PNG, generated with Pillow. 2,836 bytes on disk.
$ python3 -c "from PIL import Image, ImageDraw
img = Image.new('RGB', (800, 600), 'lightyellow')
d = ImageDraw.Draw(img)
for i, c in enumerate(['red','green','blue']):
    d.rectangle([100+i*200, 100, 200+i*200, 500], fill=c)
img.save('/tmp/diagram.png')"
$ ls -la /tmp/diagram.png
-rw-r--r--  1 you  staff  2836 May 23 17:59 /tmp/diagram.png


$ # Upload to the incoming bucket. S3 fires Lambda automatically.
$ aws --endpoint-url http://localhost:4566 \
       s3 cp /tmp/diagram.png s3://le-images-incoming/diagram.png
upload: ../../../tmp/diagram.png to s3://le-images-incoming/diagram.png


$ # Within a few seconds, the thumbnail appears in the processed bucket.
$ aws --endpoint-url http://localhost:4566 \
       s3api head-object \
       --bucket le-images-processed --key thumbnails/diagram.png
{
  "AcceptRanges":          "bytes",
  "LastModified":          "Sat, 23 May 2026 15:59:09 GMT",
  "ContentLength":         702,
  "ETag":                  "\"dd4d93bbb166b7fa0954441a0496882a\"",
  "ContentType":           "image/png",
  "ServerSideEncryption":  "AES256",
  "Metadata":              {}
}


$ # Download it and verify the dimensions in Python:
$ aws --endpoint-url http://localhost:4566 \
       s3 cp s3://le-images-processed/thumbnails/diagram.png /tmp/thumb.png
$ python3 -c "from PIL import Image; t=Image.open('/tmp/thumb.png'); print(t.size, t.format)"
(128, 96) PNG

# Source: 800x600 (4:3). Thumbnail: 128x96 (4:3). Aspect ratio preserved,
# longest side capped at 128 by the THUMBNAIL_MAX_DIM env var.
# Source: 2,836 bytes. Thumbnail: 702 bytes. ~75% reduction.
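The arithmetic behind those three comment lines is easy to check without Pillow. The sketch below scales both sides by the same factor so the longest side fits the limit, and never enlarges; it approximates what thumbnail() does, and Pillow's own rounding may differ by a pixel for some sizes. Both function names are hypothetical.

```python
def fit_within(width, height, max_dim):
    """Scale (width, height) so the longest side is at most max_dim.

    One scale factor for both sides keeps the aspect ratio; the factor is
    capped at 1.0 so small images are never enlarged.
    """
    scale = min(max_dim / width, max_dim / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def reduction_percent(bytes_before, bytes_after):
    """Size reduction as a percentage of the original size."""
    return 100.0 * (1 - bytes_after / bytes_before)


if __name__ == "__main__":
    print(fit_within(800, 600, 128))
    print(f"{reduction_percent(2836, 702):.1f}%")
```

For the 800x600 source the scale factor is 128/800 = 0.16, which gives 128x96, and 702 bytes out of 2,836 is a reduction of roughly 75%.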

Scenarios B (SNS) and C (CloudWatch metric) are exercised by the two integration tests in tests/test_pipeline.py. The first test subscribes an ephemeral SQS queue to the SNS topic, uploads a real PNG, waits for the thumbnail, re-opens its bytes with Pillow to check they are still a valid image, and then reads the JSON summary delivered to the SQS subscriber. The second test polls CloudWatch for the FilesProcessed metric and asserts it incremented:

Terminal: pytest
$ ./scripts/test.sh local

============================= test session starts ==============================
platform darwin -- Python 3.13.12, pytest-9.0.3
collected 2 items

tests/test_pipeline.py::test_pipeline_end_to_end          PASSED
tests/test_pipeline.py::test_cloudwatch_metric_emitted    PASSED

============================== 2 passed in 3.95s ===============================
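Both tests depend on two small techniques: polling until an asynchronous side effect shows up, and unwrapping the envelope SNS puts around a message when it delivers to SQS. tests/test_pipeline.py is not reproduced on this page, so the helpers below are a hypothetical sketch of how such a test might do it, assuming the subscription does not use raw message delivery.

```python
import json
import time


def wait_until(check, timeout=30.0, interval=0.5):
    """Call check() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


def summary_from_sqs_body(body):
    """Unwrap the SNS envelope found in an SQS message body.

    Without raw message delivery, the body is a JSON object whose
    "Message" field holds the published payload as a string, so the
    handler's summary needs a second json.loads().
    """
    envelope = json.loads(body)
    return json.loads(envelope["Message"])
```

In a test, check would typically wrap a head_object call on the thumbnail key or a receive_message call on the queue, returning None until the object or message exists.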

Tear the stack back down:

Terminal: teardown
$ ./scripts/teardown.sh local
Destroy complete! Resources: 8 destroyed.

 verifying teardown for prefix 'le-images' on local
  clean: nothing left behind

real    0m5.679s

Deploy 20 seconds, tests 4 seconds, teardown 6 seconds. The teardown script does more than terraform destroy; it then queries S3 and Lambda by prefix and exits non-zero if anything survived. Cheap locally, priceless on AWS: an S3 bucket you forget about costs $0.023 per GB per month forever.
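The prefix check is simple enough to sketch. teardown.sh itself is not shown here, so the version below is an illustration only: the two listing callables are injected, which keeps the logic testable without a running emulator, and every name in it is hypothetical.

```python
import sys


def leftovers(resource_names, prefix):
    """Names that still start with the stack's prefix after destroy."""
    return sorted(n for n in resource_names if n.startswith(prefix))


def verify_teardown(list_buckets, list_functions, prefix):
    """Return a process exit code: 0 if clean, 1 if anything survived.

    list_buckets and list_functions are zero-argument callables that
    return the current bucket names and Lambda function names.
    """
    survivors = (leftovers(list_buckets(), prefix)
                 + leftovers(list_functions(), prefix))
    for name in survivors:
        print(f"  left behind: {name}", file=sys.stderr)
    return 1 if survivors else 0


if __name__ == "__main__":
    # Stand-in listings; a real script would query S3 and Lambda instead.
    code = verify_teardown(lambda: ["unrelated-bucket"], lambda: [], "le-images")
    sys.exit(code)
```

With boto3, the callables could wrap the Name fields from s3.list_buckets() and the FunctionName fields from lambda.list_functions().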

5. The same code on real AWS

Apply the same Terraform and run the same tests against real AWS by passing aws instead of local:

$ ./scripts/deploy.sh aws
$ ./scripts/test.sh     aws
$ ./scripts/teardown.sh aws
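How the scripts translate that argument into client configuration is not shown on this page. One plausible shape for test code, offered only as an assumption, maps the target to boto3 client keyword arguments: the LocalEmu endpoint used by the CLI calls earlier for local, and boto3's defaults for aws. client_kwargs is a hypothetical helper.

```python
# The endpoint the AWS CLI examples on this page point at.
LOCAL_ENDPOINT = "http://localhost:4566"


def client_kwargs(target):
    """Keyword arguments for boto3.client() given a deploy target.

    "local" pins the endpoint to LocalEmu; "aws" passes nothing so boto3
    resolves the real regional endpoint and credentials as usual.
    """
    if target == "local":
        return {"endpoint_url": LOCAL_ENDPOINT}
    if target == "aws":
        return {}
    raise ValueError(f"unknown target {target!r}; expected 'local' or 'aws'")
```

A test would then create its clients with something like boto3.client("s3", **client_kwargs(target)), leaving the rest of the test body identical for both targets.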

Three differences are worth knowing about up front because they are where LocalEmu and real AWS diverge in practice:

The broader comparison lives in LocalEmu vs Real AWS and Known Limitations.

Get the full project

git clone https://github.com/localemu/localemu-examples : the image pipeline lives in 04-image-pipeline/ with the Terraform, the handler, the cross-platform build script, the two integration tests, and the deploy / test / teardown scripts that produced every terminal output on this page.

Where to go next