Tuning Lambda Memory Without Guessing
Lambda memory is also CPU. More memory means more CPU, which means faster runs, which can mean lower bills because you pay per millisecond. The default 128MB is almost always wrong. Eight months of tuning on real workloads has given me a repeatable process that beats guessing.
The Cost Function
Lambda bills compute in GB-seconds: configured memory multiplied by billed duration. Doubling memory doubles the per-millisecond rate, but on CPU-bound code it often cuts duration by more than half. Net: the bill drops. This is the non-obvious part everyone misses.
On IO-bound code (most API handlers waiting for a database), more memory does not help duration much, and the bill goes up. The shape of the workload determines the right memory setting.
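To make the trade-off concrete, here is a minimal sketch of the per-invocation cost arithmetic. The price constant is an assumption (check the current rate for your region and architecture), the durations are made-up illustrative numbers rather than measurements, and the per-request fee is ignored because it does not change with memory.

```python
# Assumed price per GB-second; verify against current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb: int, duration_ms: float,
                    price_per_gb_second: float = PRICE_PER_GB_SECOND) -> float:
    """Compute cost of one invocation: GB of memory x seconds of duration x rate."""
    gb = memory_mb / 1024
    seconds = duration_ms / 1000
    return gb * seconds * price_per_gb_second


# Hypothetical CPU-bound handler: doubling memory cuts duration by more than half.
cpu_small = invocation_cost(512, 1000)
cpu_large = invocation_cost(1024, 450)

# Hypothetical IO-bound handler: doubling memory barely changes duration.
io_small = invocation_cost(256, 200)
io_large = invocation_cost(512, 190)

print(f"CPU-bound: {cpu_small:.10f} -> {cpu_large:.10f}")
print(f"IO-bound:  {io_small:.10f} -> {io_large:.10f}")
```

The break-even rule falls out of the formula: a larger setting is only cheaper when duration shrinks by more than the factor by which memory grew.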
The Tuning Loop
for size in [128, 256, 512, 1024, 1536]:
    deploy handler with memory=size
    run 100 real requests
    compute avg duration and cost
plot cost vs size, pick the minimum
This is what the aws-lambda-power-tuning tool automates. I run it on any handler doing meaningful work before I consider it optimized.
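The "compute and pick the minimum" half of the loop can be sketched in a few lines of Python. This is not how aws-lambda-power-tuning is implemented; it only shows the arithmetic, assuming you have already collected durations per memory size. The function names, the price constant, and the sample durations are all hypothetical.

```python
from statistics import mean

# Assumed price per GB-second; verify against current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def summarize(measurements: dict[int, list[float]]) -> dict[int, tuple[float, float]]:
    """Map each memory size (MB) to (average duration in ms, average cost per invocation)."""
    summary = {}
    for size_mb, durations_ms in measurements.items():
        avg_ms = mean(durations_ms)
        cost = (size_mb / 1024) * (avg_ms / 1000) * PRICE_PER_GB_SECOND
        summary[size_mb] = (avg_ms, cost)
    return summary


def cheapest_size(measurements: dict[int, list[float]]) -> int:
    """Return the memory size with the lowest average cost per invocation."""
    summary = summarize(measurements)
    return min(summary, key=lambda size_mb: summary[size_mb][1])


# Hypothetical durations (ms) per memory size, two samples each for brevity.
sample = {
    128: [4100.0, 3900.0],
    256: [1900.0, 2100.0],
    512: [950.0, 1050.0],
    1024: [400.0, 420.0],
    1536: [330.0, 350.0],
}

best = cheapest_size(sample)
for size_mb, (avg_ms, cost) in summarize(sample).items():
    print(f"{size_mb:>5} MB  {avg_ms:>7.1f} ms  ${cost:.10f}")
print("cheapest:", best)
```

In this made-up data the cost curve is flat up to 512MB, dips at 1024MB, and rises again at 1536MB, which is the shape to look for when plotting real measurements.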
Real Numbers From My Stack
Content API (IO-bound): 256MB is optimal. Going higher does not cut duration but costs more.
Image thumbnailer (CPU-bound): 1536MB is optimal. At 128MB it took 4 seconds and cost X. At 1536MB it takes 400ms and costs 0.7X. More memory, less bill. Counter-intuitive until you plot it.
Newsletter renderer (mixed): 512MB. Sweet spot between CPU time during templating and IO time during database lookups.
The Cold Start Surprise
Cold-start time also depends on memory. On CPU-bound init code (like loading Pydantic models), a 128MB handler cold-starts noticeably slower than a 1024MB one running the same code. For a handler behind a user-facing API, the cold-start cost is often the deciding factor, not the hot-path cost.
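One rough way to fold cold starts into the comparison is to look at both the latency of a cold request and the average latency once a share of requests are cold. The sketch below assumes you know the warm duration, the init time, and the fraction of cold requests; all numbers are hypothetical, not measurements from my stack.

```python
def cold_request_ms(warm_ms: float, init_ms: float) -> float:
    """Latency of a request that pays the init time on top of the handler time."""
    return init_ms + warm_ms


def expected_latency_ms(warm_ms: float, init_ms: float, cold_fraction: float) -> float:
    """Average latency when cold_fraction of requests also pay the init time."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    return warm_ms + cold_fraction * init_ms


# Hypothetical handler: similar warm latency, very different init time.
small = {"warm_ms": 120.0, "init_ms": 2400.0}   # e.g. 128MB
large = {"warm_ms": 110.0, "init_ms": 600.0}    # e.g. 1024MB
cold_fraction = 0.02  # assumed share of requests that hit a cold start

for label, cfg in (("small", small), ("large", large)):
    cold = cold_request_ms(cfg["warm_ms"], cfg["init_ms"])
    avg = expected_latency_ms(cfg["warm_ms"], cfg["init_ms"], cold_fraction)
    print(f"{label}: cold request {cold:.0f} ms, average {avg:.1f} ms")
```

The average moves only a little, but the cold request itself is where users notice the difference, which is why the tail is worth checking separately from the cost curve.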
What I Stopped Doing
Defaulting everything to 128MB to 'save money'. It costs more, not less, on CPU work, and it ruins the user experience on cold starts. Pick memory per-handler based on data, not gut.
Treating tuning as tune once, measure, forget. I used to assume the numbers hold for the life of the handler. They do not.
When to Retune
Memory optima shift when the handler's workload shape changes. A new database call adds IO weight. A new JSON parse adds CPU weight. If a feature change touches the handler's compute profile, I rerun the tuning loop. The cost is twenty minutes; the savings compound for the life of the handler. Annual retuning is a reasonable default even without workload changes, because provider pricing and runtime performance both shift quietly over time.