Tuning Lambda Memory Without Guessing

Lambda memory is also CPU. Lambda allocates CPU in proportion to the memory you configure, so more memory means more CPU, which means faster runs, which can mean lower bills because you pay per millisecond. The default 128MB is almost always wrong. Eight months of tuning on real workloads has given me a repeatable process that beats guessing.

The Cost Function

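Lambda bills duration in GB-seconds: configured memory times billed duration, plus a flat per-request fee that memory does not affect. Doubling memory doubles the per-millisecond rate, so the bill drops only when duration falls by more than half. On CPU-bound code it often does. This is the non-obvious part most people miss.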

On IO-bound code (most API handlers waiting for a database), more memory does not help duration much, and the bill goes up. The shape of the workload determines the right memory setting.
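The cost function is simple enough to write down. This is a minimal sketch, assuming x86 on-demand rates for us-east-1 ($0.0000166667 per GB-second and $0.20 per million requests); the names and the sample durations are hypothetical, and your region or architecture may price differently.

```python
# Assumed rates: x86 on-demand pricing in us-east-1. Check the current pricing page.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def invocation_cost(memory_mb: int, billed_ms: float) -> float:
    """Cost in USD of one invocation: duration charge plus the flat request fee."""
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


# Hypothetical CPU-bound handler: doubling memory cuts duration by more than half.
cpu_small = invocation_cost(512, 900)
cpu_large = invocation_cost(1024, 400)

# Hypothetical IO-bound handler: doubling memory barely moves duration.
io_small = invocation_cost(512, 300)
io_large = invocation_cost(1024, 290)

print(f"CPU-bound: {cpu_small:.10f} -> {cpu_large:.10f}")
print(f"IO-bound:  {io_small:.10f} -> {io_large:.10f}")
```

With these made-up durations the CPU-bound handler gets cheaper at the larger size and the IO-bound one gets more expensive, which is the whole argument in two lines of output.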

The Tuning Loop

for size in [128, 256, 512, 1024, 1536]:
    deploy handler with memory=size
    run 100 real requests
    compute avg duration and cost
plot cost vs size, pick the minimum

This is what the aws-lambda-power-tuning tool automates. I run it on any handler doing meaningful work before I consider it optimized.
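If you already have billed durations per memory size (from logs or a manual run), the "compute and pick the minimum" step of the loop might look like the sketch below. The rate constant is an assumed us-east-1 x86 price, the function names are mine, and the sample durations are invented for illustration. It is not a description of how the power-tuning tool works internally.

```python
from statistics import mean

# Assumed x86 on-demand duration rate (us-east-1); substitute your own.
PRICE_PER_GB_SECOND = 0.0000166667


def summarize(measurements: dict[int, list[float]]) -> dict[int, tuple[float, float]]:
    """Map memory size (MB) -> (average billed ms, average duration cost in USD)."""
    summary = {}
    for memory_mb, durations_ms in measurements.items():
        avg_ms = mean(durations_ms)
        cost = (memory_mb / 1024) * (avg_ms / 1000) * PRICE_PER_GB_SECOND
        summary[memory_mb] = (avg_ms, cost)
    return summary


def cheapest(measurements: dict[int, list[float]]) -> int:
    """Return the memory size with the lowest average cost per invocation."""
    summary = summarize(measurements)
    return min(summary, key=lambda memory_mb: summary[memory_mb][1])


# Hypothetical billed durations (ms), three samples per size for brevity.
samples = {
    128: [2400.0, 2380.0, 2420.0],
    256: [1150.0, 1140.0, 1160.0],
    512: [540.0, 530.0, 550.0],
    1024: [300.0, 310.0, 290.0],
    1536: [260.0, 255.0, 265.0],
}

for memory_mb, (avg_ms, cost) in summarize(samples).items():
    print(f"{memory_mb:>5} MB  {avg_ms:8.1f} ms  ${cost:.10f}")
print("cheapest:", cheapest(samples))
```

The per-request fee is left out because it is the same at every memory size and cannot change which size wins.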

Real Numbers From My Stack

Content API (IO-bound): 256MB is optimal. Going higher does not cut duration but costs more.

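Image thumbnailer (CPU-bound): 1536MB is optimal. At 128MB it took roughly 4 seconds and cost X. At 1536MB it takes roughly 400ms and costs 0.7X. More memory, less bill. Counter-intuitive until you plot it, and worth checking against billed duration: 12 times the memory only costs less when the speedup is better than 12x.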

Newsletter renderer (mixed): 512MB. Sweet spot between CPU time during templating and IO time during database lookups.
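A quick way to sanity-check any of these results is the break-even duration: how fast the handler has to run at the larger size before the larger size stops costing more. A small sketch, with hypothetical function names and numbers; it compares GB-seconds only, since the per-request fee is identical on both sides.

```python
def break_even_ms(old_memory_mb: int, old_billed_ms: float, new_memory_mb: int) -> float:
    """Billed duration at new_memory_mb that costs the same as the old setting."""
    return old_billed_ms * old_memory_mb / new_memory_mb


def is_cheaper(old_memory_mb: int, old_billed_ms: float,
               new_memory_mb: int, new_billed_ms: float) -> bool:
    """True when the new setting uses fewer GB-seconds per invocation."""
    return new_billed_ms < break_even_ms(old_memory_mb, old_billed_ms, new_memory_mb)


# Hypothetical handler: 800 ms at 256MB, candidate size 1024MB.
target = break_even_ms(256, 800, 1024)
print(f"must run faster than {target:.0f} ms at 1024MB to cost less")
print("150 ms at 1024MB cheaper?", is_cheaper(256, 800, 1024, 150))
print("700 ms at 1024MB cheaper?", is_cheaper(256, 800, 1024, 700))
```

An IO-bound handler rarely clears that bar; a CPU-bound one sometimes does, which is why the optimum differs so much between the three handlers above.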

The Cold Start Surprise

Cold starts also scale with memory. With CPU-bound init code (like loading Pydantic models), a 128MB handler cold-starts slower than the same handler at 1024MB. For a handler behind a user-facing API, the cold-start cost is often the deciding factor, not the hot-path cost.
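One rough way to fold cold starts into the comparison is a blended latency: warm duration plus init time weighted by the share of requests that hit a cold container. The sketch below assumes you have measured warm and init times per memory size yourself; the function name, the 5% cold fraction, and the timings are all hypothetical.

```python
def expected_latency_ms(warm_ms: float, init_ms: float, cold_fraction: float) -> float:
    """Mean latency when cold_fraction of requests also pay init_ms."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    return warm_ms + cold_fraction * init_ms


# Hypothetical measurements: memory MB -> (warm duration ms, init duration ms).
profiles = {
    128: (120.0, 2400.0),
    1024: (110.0, 600.0),
}

for memory_mb, (warm_ms, init_ms) in profiles.items():
    blended = expected_latency_ms(warm_ms, init_ms, cold_fraction=0.05)
    print(f"{memory_mb:>5} MB  blended latency {blended:.1f} ms")
```

A mean hides the tail, so for a user-facing API it is worth looking at the cold-path latency (warm plus init) on its own as well.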

What I Stopped Doing

Defaulting everything to 128MB to 'save money'. It costs more, not less, on CPU work, and it ruins the user experience on cold starts. Pick memory per-handler based on data, not gut.

Tuning once, measuring, and forgetting. I used to assume the numbers hold for the life of the handler. They do not.

When to Retune

Memory optima shift when the handler's workload shape changes. A new database call adds IO weight. A new JSON parse adds CPU weight. If a feature change touches the handler's compute profile, I rerun the tuning loop. The cost is twenty minutes; the savings compound for the life of the handler. Annual retuning is a reasonable default even without workload changes, because provider pricing and runtime performance both shift quietly over time.