Your AWS bill arrives at the end of the month. Lambda costs have tripled, but the reasons remain unclear. Could it be the new feature deployment? Those AI integrations in testing? Or perhaps forgotten functions quietly running in the background?
As serverless computing becomes mainstream, managing Lambda costs has grown both more important and more complex. The good news? Lambda cost optimization doesn't need to be a black box. With the right knowledge and tools, any team can build cost-efficient serverless applications.
This guide breaks down everything from basic pricing concepts to advanced optimization strategies, helping teams keep Lambda costs in check without compromising performance.
Lambda pricing works like a modern utility bill - you pay for what you use, but several meters run simultaneously. Here's what makes up a Lambda bill:
AWS charges $0.20 per million Lambda function invocations. These invocations can come from multiple sources - API Gateway requests, S3 event notifications, SQS messages, EventBridge schedules, or direct SDK calls.
Each invocation counts towards your bill, regardless of whether the function executes successfully or fails.
AWS charges based on the time your function runs, billed in 1ms increments. A Lambda function can be configured to run for up to 15 minutes per execution.
Functions can use anywhere from 128MB to 10GB of memory. Memory allocation affects more than just RAM - it determines CPU power, network bandwidth, and disk I/O.
Lambda allows you to configure memory in 1MB increments from 128MB to 10,240MB (10GB). AWS charges you based on the memory you configure, not what your function actually uses. So if you configure 1GB but only use 512MB, you're still paying for 1GB.
For example, if your function is configured with 1GB of memory and runs for one second per invocation, the duration cost is 1 GB × 1 s × $0.0000166667 per GB-second ≈ $0.0000167, on top of the per-request charge. In other words, your cost calculation considers three inputs: memory configured, execution duration, and invocation count.
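To make this concrete, here's a small back-of-the-envelope estimator. It's a sketch using the x86 on-demand rates quoted above - rates vary by region and architecture, and it ignores the free tier:

# Rough Lambda cost estimator - rates are illustrative (x86, us-east-1)
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20

def monthly_cost(memory_mb, avg_duration_ms, invocations):
    # Compute charge: GB-seconds consumed across all invocations
    gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations
    compute = gb_seconds * PRICE_PER_GB_SECOND
    # Request charge: flat rate per million invocations
    requests = (invocations / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return compute + requests

# 1GB function, 1s average duration, 5 million invocations per month
print(f"${monthly_cost(1024, 1000, 5_000_000):.2f}")  # ~$84.33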
Ever wondered why your Lambda functions cost more than they should? Let's talk about right-sizing - it's like finding the perfect fit for your functions. Not too big, not too small, but just right.
Here's something interesting: when you give your Lambda function more memory, it also gets more CPU power and network bandwidth. But here's the catch - you'll pay more per millisecond. However, because it runs faster, the total cost might actually be lower.
Let's talk about a common misconception with Lambda memory settings. Many developers default to 128MB thinking it'll be cheaper, but here's the reality: unless you're using Rust, 128MB is rarely the cost-effective choice, even for simple functions.
Here's why: at 128MB, Lambda allocates only a small slice of a vCPU, so your code runs much longer. The extra duration usually costs more than you save from the lower per-millisecond rate - and your users wait longer, too.
Right-sizing Lambda functions is about finding the optimal balance between memory, performance, and cost. Let's look at this with a typical image processing function:
import io
import boto3
from PIL import Image

s3 = boto3.client('s3')

def handler(event, context):
    # Download the ~5MB source image from S3 and open it with Pillow
    obj = s3.get_object(Bucket=event['bucket'], Key=event['key'])
    image = Image.open(io.BytesIO(obj['Body'].read()))
    # Resize, then upload the result back to S3
    buf = io.BytesIO()
    image.resize((800, 600)).save(buf, format='JPEG')
    s3.put_object(Bucket=event['output_bucket'], Key=event['key'], Body=buf.getvalue())
Running this function at several memory settings reveals the trade-off: more memory costs more per millisecond but cuts the duration sharply - up to a point.
The sweet spot? 512MB provides the best cost-performance ratio. Beyond this, performance gains diminish while costs continue to rise.
AWS Lambda Power Tuning, an open-source tool built on Step Functions, automates this optimization process. It invokes your function across a range of memory configurations, measures cost and duration for each, and visualizes the results so you can pick your own sweet spot.
When tuning your functions, use production-like payloads, run enough invocations to smooth out cold-start noise, and re-tune after significant code or dependency changes.
For a hands-on walkthrough, read this excellent article on using Lambda Power Tuning in practice.
Every time your Lambda function starts up, it runs all the code in your function file. Here's the key: code outside your handler runs only during cold starts, while code inside the handler runs every single time the function is invoked.
Think about this common scenario:
# This code runs on every invocation - not efficient!
def handler(event, context):
    # Loading configuration each time
    config = load_configuration()
    # Creating a new database connection each time
    db = create_db_connection()
    # Process data
    result = process_with_config(event['data'], config)
    return result
Instead, move expensive operations outside the handler:
# This runs only during cold starts
config = load_configuration()
db = create_db_connection()

def handler(event, context):
    # Just use the already-initialized resources
    result = process_with_config(event['data'], config)
    return result
This simple change can save significant money. Why? Because you're not paying for the same initialization tasks over and over again. Common things to move outside the handler include database connections, HTTP clients, ML model loading, and SDK client initialization.
Opening and closing connections for every function invocation is like taking a new car from the dealership every time you need to drive somewhere. Not only is it slow, but it's also expensive in terms of compute time.
Here's what often happens:
def handler(event, context):
    # Creating new connections every time - expensive!
    db = create_db_connection()
    http_client = create_http_client()
    result = process_data(db, http_client)
    # Closing connections
    db.close()
    http_client.close()
    return result
Instead, maintain connections across invocations:
# Create once, reuse many times
db = create_db_connection_pool()
http_client = create_reusable_http_client()

def handler(event, context):
    # Just use the existing connections
    return process_data(db, http_client)
Remember to handle connection errors gracefully - connections might go stale between invocations, so always have a fallback plan.
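What that fallback can look like: health-check the cached connection and rebuild it on failure. A minimal sketch, assuming a hypothetical create_db_connection() helper and a driver that exposes ping():

db = create_db_connection()  # created once, at cold start

def get_connection():
    global db
    try:
        db.ping()  # cheap liveness check; the exact API varies by driver
    except Exception:
        # The connection went stale between invocations - rebuild it
        db = create_db_connection()
    return db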
Processing items one by one is like doing your laundry one sock at a time - it works, but it's not efficient. When dealing with multiple records, batch processing can significantly reduce costs.
Here's a typical scenario:
# Processing one at a time - inefficient
def handler(event, context):
for record in event['Records']:
process_single_record(record) # Each record = new database call
A better approach:
def handler(event, context):
# Process records in batches of 25
for batch in chunk_records(event['Records'], 25):
process_batch(batch) # One database call for 25 records
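chunk_records isn't a built-in - it's a small helper you'd write yourself. One possible version:

def chunk_records(records, size):
    # Yield successive fixed-size slices of the record list
    for i in range(0, len(records), size):
        yield records[i:i + size]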
Why does this matter? Because each database call or API request has overhead. By batching, you're reducing the number of calls, which means less compute time and lower costs. Just remember to handle errors appropriately - you don't want one bad record to fail the entire batch.
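For SQS and stream sources, partial batch responses are the standard way to do that: enable ReportBatchItemFailures on the event source mapping, then report only the records that failed. A sketch, with process_single_record standing in for your own logic:

def handler(event, context):
    failures = []
    for record in event['Records']:
        try:
            process_single_record(record)
        except Exception:
            # Report only this record; the rest of the batch succeeds
            failures.append({'itemIdentifier': record['messageId']})
    return {'batchItemFailures': failures}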
Lambda now supports filtering at the event source mapping level, reducing unnecessary invocations and costs. This works with services like SQS, DynamoDB Streams, and Kinesis. This is one of the most overlooked ways to save on Lambda costs.
Here's how event filtering makes a difference:
Without filtering, you might write code like this:
def handler(event, context):
    for record in event['Records']:
        # Your function gets charged even when it skips a record
        if record['temperature'] < 30:
            continue
        # Only care about high temperatures
        send_temperature_alert(record)
A better way using event filtering:
aws lambda create-event-source-mapping \
--function-name temperature-evaluator \
--batch-size 100 \
--starting-position LATEST \
--event-source-arn arn:aws:kinesis:us-east-1:123456789012:stream/temperature-telemetry \
--filter-criteria '{"Filters": [{"Pattern": "{\"data\": {\"temperature\": [{\"numeric\": [\">=\", 30]}]}}"}]}'
Now your function only runs (and costs money) when it actually needs to do something. This is particularly powerful for high-volume streams where only a fraction of events matter - think IoT telemetry, clickstream data, or DynamoDB change records.
For message queues, you can filter on the JSON message body:

import json

def handler(event, context):
    for record in event['Records']:
        # Manual filtering in code - you still pay for every invocation
        if json.loads(record['body']).get('priority') != 'HIGH':
            continue
        process_priority_message(record)
Instead, set up the filter:
aws lambda create-event-source-mapping \
--function-name ProcessHighPriorityMessages \
--batch-size 10 \
--maximum-batching-window-in-seconds 5 \
--event-source-arn "arn:aws:sqs:us-east-1:123456789012:MyQueue" \
--filter-criteria '{
"Filters": [
{
"Pattern": "{\"body\": {\"priority\": [\"HIGH\"]}}"
}
]
}'
Now, here's something interesting: Graviton2 processors. AWS's Arm-based processors aren't just marketing hype - they can cut your Lambda costs by 20%. The best part? For interpreted languages like Python, Node.js, and Ruby, it's often as simple as changing one configuration:
aws lambda update-function-code \
--function-name my-function \
--architectures arm64
But remember to test thoroughly - some dependencies might need arm64-compatible versions.
Let's talk about Provisioned Concurrency - a powerful feature that's often misused. Think of it as reserving warmed-up instances of your function. Sounds great, right? But it's not always the answer.
Here's when you should consider it:
# Function with heavy initialization - a strong Provisioned Concurrency candidate
# These module-level steps run on every cold start and can take 5-10 seconds
model = load_large_ml_model()      # loading an ML model
initialize_dependencies()          # loading custom libraries
db_pool = setup_database_pool()    # warming up database connections

def handler(event, context):
    return model.predict(event['data'])
For this kind of function, cold starts are painful. But before jumping to Provisioned Concurrency, ask yourself: is traffic predictable enough to schedule capacity? Could a smaller package or a faster runtime shrink the cold start instead? Remember, you pay for provisioned instances whether they serve traffic or not.
Here's how to adjust it manually around predictable peaks. In practice, though, rather than running these commands by hand, use Application Auto Scaling to adjust Provisioned Concurrency automatically - see the sketch after the example.
# Before peak hours
aws lambda put-provisioned-concurrency-config \
--function-name my-function \
--qualifier my-alias \
--provisioned-concurrent-executions 10
# After peak hours
aws lambda put-provisioned-concurrency-config \
--function-name my-function \
--qualifier my-alias \
--provisioned-concurrent-executions 2
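Here's what the automated route can look like. A minimal sketch using boto3's Application Auto Scaling client - the function name, alias, and capacity numbers are illustrative:

import boto3

aas = boto3.client('application-autoscaling')

# Register the alias as a scalable target (2-10 provisioned instances)
aas.register_scalable_target(
    ServiceNamespace='lambda',
    ResourceId='function:my-function:my-alias',
    ScalableDimension='lambda:function:ProvisionedConcurrency',
    MinCapacity=2,
    MaxCapacity=10,
)

# Track 70% utilization - capacity scales up before peaks, down after
aas.put_scaling_policy(
    PolicyName='pc-utilization',
    ServiceNamespace='lambda',
    ResourceId='function:my-function:my-alias',
    ScalableDimension='lambda:function:ProvisionedConcurrency',
    PolicyType='TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 0.7,
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'LambdaProvisionedConcurrencyUtilization'
        },
    },
)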
When it comes to Lambda functions, the size and organization of your dependencies directly impact both cold start times and overall performance. Here's how to optimize them:
Instead of packaging all dependencies with each function:
# Without layers - each function packages dependencies separately
├── function1/
│ ├── pandas/
│ ├── numpy/
│ └── handler.py
├── function2/
│ ├── pandas/
│ ├── numpy/
│ └── handler.py
Create shared layers:
# Create layer
aws lambda publish-layer-version \
--layer-name common-dependencies \
--description "Common data processing libraries" \
--zip-file fileb://dependencies.zip \
--compatible-runtimes python3.12 python3.13
# Attach to function
aws lambda update-function-configuration \
--function-name MyFunction \
--layers arn:aws:lambda:region:account:layer:common-dependencies:1
# Bad: Importing inside handler
def handler(event, context):
import pandas as pd # Cold start penalty
return process_data(event)
# Good: Import at module level
import pandas as pd
def handler(event, context):
return process_data(event)
You know what's funny about Lambda costs? It's rarely the compute time that breaks the bank. Let's talk about those hidden costs that can turn your serverless dream into a billing nightmare.
Ever had a developer who loves debug logs? The type who logs everything "just in case"? CloudWatch charges for every byte ingested and stored, so don't log everything - log what matters, at an appropriate level. Easier said than done, but it adds up fast.
Pro tip: Set CloudWatch log retention periods. Nobody needs debug logs from six months ago!
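Setting retention is a one-line fix. A sketch with boto3 - the log group name and the 14-day window are illustrative:

import boto3

logs = boto3.client('logs')

# Lambda log groups follow the /aws/lambda/<function-name> convention
logs.put_retention_policy(
    logGroupName='/aws/lambda/my-function',
    retentionInDays=14,
)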
Here's a classic mistake: processing data in the wrong region. Imagine your S3 bucket is in us-east-1, but your Lambda is in us-west-2. Every byte transferred is adding to your bill! Keep your resources in the same region when possible. When you can't, consider using S3 cross-region replication instead of pulling data on-demand.
Let's talk about how Lambda functions can trigger unexpected costs in other AWS services. It's like pulling a thread on a sweater - one small action can unravel into bigger costs.
API Gateway Calls
# Expensive: calling Lambda through API Gateway for internal services
import requests

def handler(event, context):
    # Don't do this for internal service communication
    response = requests.get('https://api-gateway-url/prod/internal-service')
    return response.json()
Instead, use direct Lambda invocation or EventBridge for service-to-service communication. API Gateway is great for external APIs, but it adds unnecessary costs for internal calls.
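When the caller doesn't need the result, an asynchronous direct invocation sidesteps both the API Gateway charge and paying for the caller to wait. A sketch - 'internal-service' is a hypothetical function name:

import json
import boto3

lambda_client = boto3.client('lambda')

def handler(event, context):
    # Fire-and-forget: InvocationType='Event' queues the call and
    # returns immediately, so this function isn't billed for waiting
    lambda_client.invoke(
        FunctionName='internal-service',
        InvocationType='Event',
        Payload=json.dumps({'order_id': event['order_id']}),
    )
    return {'status': 'queued'}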
Remember: The best Lambda function is often the one that does less. Keep it focused, keep it simple.
Instead of trying to optimize everything at once, focus on high-impact changes first:
These are optimizations that give significant benefits with minimal risk. Common examples include right-sizing memory, adding event-source filters, moving initialization outside the handler, switching to arm64, and setting CloudWatch log retention.
These are larger changes that require more planning but offer substantial benefits - for example, breaking a synchronous chain into event-driven processing:
# Before: Synchronous chain of operations
def handler(event, context):
# Process order synchronously
order = validate_order(event['order'])
payment = process_payment(order)
inventory = update_inventory(order)
notification = send_notification(order)
# Wait for all operations to complete
return {
'order': order,
'payment': payment,
'inventory': inventory,
'notification': notification
}
# After: Event-driven processing
import json
import boto3

eventbridge = boto3.client('events')

def handler(event, context):
    # Only validate and start the process
    order = validate_order(event['order'])
    # Publish an event for asynchronous processing
    eventbridge.put_events(
        Entries=[{
            'Source': 'order-service',
            'DetailType': 'OrderValidated',
            'Detail': json.dumps({
                'orderId': order['id'],
                'customerId': order['customerId'],
                'items': order['items']
            })
        }]
    )
    return {
        'orderId': order['id'],
        'status': 'processing'
    }
Each downstream service (payment, inventory, notification) can then process the order independently, reducing the main function's duration and cost.
Think of CloudWatch metrics as your Lambda function's vital signs. Just like a doctor monitors your heart rate and blood pressure, you need to keep an eye on certain key indicators of your function's health and cost efficiency.
Here's what you should watch closely: invocation counts, average and p99 duration, error and throttle rates, concurrent executions, and memory used versus memory allocated (the REPORT line in each function's logs shows both).
Watching for patterns is like being a detective. You're looking for clues that indicate wasteful spending: durations creeping up after a deployment, retry storms from failing downstream calls, scheduled functions nobody remembers creating, and memory allocations far above actual usage.
Don't wait for the bill to know there's a problem. Set up alerts that act as an early warning system: duration and error-rate alarms per function, throttle alerts, and a billing alarm on overall Lambda spend.
The key is to catch issues before they become expensive problems. For example, if a function normally processes images in 2 seconds, set an alert for anything taking over 5 seconds - it might indicate a problem that's costing you money.
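That kind of duration alert takes a few lines with boto3 - the function name and thresholds here are illustrative:

import boto3

cloudwatch = boto3.client('cloudwatch')

# Alarm when average duration exceeds 5 seconds over a 5-minute window
cloudwatch.put_metric_alarm(
    AlarmName='image-processor-slow',
    Namespace='AWS/Lambda',
    MetricName='Duration',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'image-processor'}],
    Statistic='Average',
    Period=300,
    EvaluationPeriods=1,
    Threshold=5000,  # Duration is reported in milliseconds
    ComparisonOperator='GreaterThanThreshold',
)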
Make it a habit to review your Lambda costs weekly. Look for functions whose cost trend doesn't match their traffic, new functions that appeared without review, invocation spikes, and anything still running that no longer serves a purpose.
Think of it like reviewing your monthly credit card statement - regular checks help you catch unnecessary spending before it gets out of hand.
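You can pull the numbers for that review straight from Cost Explorer. A sketch with boto3 - the date range is illustrative:

import boto3

ce = boto3.client('ce')

# Daily Lambda spend for one week
response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2025-01-01', 'End': '2025-01-08'},
    Granularity='DAILY',
    Metrics=['UnblendedCost'],
    Filter={'Dimensions': {'Key': 'SERVICE', 'Values': ['AWS Lambda']}},
)
for day in response['ResultsByTime']:
    print(day['TimePeriod']['Start'], day['Total']['UnblendedCost']['Amount'])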
Let's talk about making cost optimization part of your team's DNA without adding bureaucracy. It's about finding that sweet spot between control and flexibility.
Start with design reviews. Before any new Lambda function goes live, have a quick chat about expected invocation volume, memory and timeout settings, what triggers it, and which downstream services it touches.
In addition, set some ground rules that everyone can follow: tag every function with its team and environment, set log retention from day one, and require a rough cost estimate for anything high-volume.
CloudYali makes tracking these practices simple with its tag standardization dashboard and custom cost reports - you can easily see which functions need attention and how costs break down across teams and environments. Plus, it helps identify easy optimization opportunities so teams can focus on high-impact improvements first.
Implement proper error handling to prevent wasteful retries.
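Why it matters: for asynchronous invocations, Lambda retries failed executions (twice by default), so re-raising on input that can never succeed just bills you again for the same failure. A sketch - process_order is a hypothetical helper:

import json

def handler(event, context):
    try:
        order = json.loads(event['body'])
    except (KeyError, json.JSONDecodeError):
        # Malformed input will never succeed on retry - swallow it
        # (or route it to a dead-letter queue) instead of re-raising
        print('Dropping malformed event')
        return {'status': 'rejected'}
    return process_order(order)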
Avoid direct Lambda chaining:
# Don't do this
import json
import boto3

lambda_client = boto3.client('lambda')

def handler(event, context):
    result = process_data(event)
    # Synchronous direct invocation - you pay for this function's
    # duration while it waits for the next one to finish
    lambda_client.invoke(
        FunctionName='next-function',
        Payload=json.dumps(result)
    )
# Do this instead - use event-driven patterns
import json
import boto3

eventbridge = boto3.client('events')

def handler(event, context):
    result = process_data(event)
    # Publish an event and return immediately - no waiting
    eventbridge.put_events(
        Entries=[{
            'Source': 'my-service',
            'DetailType': 'DataProcessed',
            'Detail': json.dumps(result)
        }]
    )
Optimizing AWS Lambda costs doesn't have to be complicated. Start with the basics - right-sizing, efficient coding, and proper monitoring - and build from there. Remember, it's not about minimizing every cost, but about finding the right balance between performance and spending.
The landscape of serverless computing keeps evolving, and in 2025, having a good handle on your Lambda costs is more important than ever.
Keep this guide handy as you optimize your Lambda functions. And if you're looking for a way to make cost optimization easier, give CloudYali a try - it'll help you spot optimization opportunities and keep your Lambda costs in check.
Remember: The best cost optimization strategy is the one your team will actually use. Start small, measure the impact, and keep improving. Your AWS bill will thank you!