Configure your Lambda functions like a champ and let your code sail smoothly to Production

* Latest update: March 25th, 2017 - Added examples on how to use Lambda Environment Variables

First off, I have to say I am a big fan of AWS Lambda. AWS Lambda is a service that allows you to upload code to the cloud and run it without you maintaining any servers at all (a.k.a. ‘serverless’ architecture). You pay for the number of executions of your code as well as consumed resources, such as memory and compute time per execution.

AWS Lambda has allowed me to quickly build many application components. I just write my code, test it and hook it up to a trigger, such as an HTTP endpoint created using API Gateway. S3 or CloudWatch Events are also great ways to trigger Lambda functions. When I have to write new application code, AWS Lambda is becoming my preferred option.

That being said, I’ve experienced some challenges when moving my code across stages (or environments, whatever you want to call them). You know the drill: you write your code in a development stage; once it looks good, you move it to a place where you can test it properly; and when it’s ready for prime time, you move it to your Production environment.

The problem

How can I keep configuration and application logic separate when I move my AWS Lambda function code across stages?

Example:

Imagine a Lambda function that fetches a file from S3, does some processing and, when it’s done, publishes a message to an SNS topic. In this scenario I should have separate S3 buckets and SNS topics in my DEV, TEST and PROD stages (I shouldn’t place test files in Production buckets or send test messages to Production topics, right?). Also, when I move my Lambda function from DEV to TEST and from TEST to PROD, I should not change any code in it; I should only change configuration.

Our hypothetical app would look like this:

[Image: Lambda config scenario]

The solutions

In the server world this problem is often tackled using environment variables or configuration files stored in the local file system. But in the AWS Lambda serverless world, we only have functions and not a typical server environment. We have, thankfully, function versions and aliases…

Function versions and aliases in AWS Lambda

A version in AWS Lambda is essentially a snapshot of your application code. Once you write your function and test it, you can publish a version. When you publish a version, you cannot make changes to that particular version, and it is assigned a sequential number (1, 2, 3, etc.). AWS Lambda automatically names the working copy of your code $LATEST.

[Image: Publish new version]

Over time you will have multiple versions (or snapshots) of your Lambda function. This brings us to aliases. An alias is a pointer to a particular version of your function. You name your function aliases and you control which version they point to. For example, you can have an alias named DEV that points to version $LATEST, a TEST alias that points to version 4 and a PROD alias that points to version 3. Typically your PROD alias would point to an older version than TEST and TEST to an older version than DEV.

[Image: Lambda versions and aliases]
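
For reference, here’s a rough sketch of how this alias layout could be created with the AWS CLI (the version numbers are illustrative, and helloStagedWorld is the function name used later in this post):

aws lambda publish-version --function-name helloStagedWorld
aws lambda create-alias --function-name helloStagedWorld --name DEV --function-version '$LATEST'
aws lambda create-alias --function-name helloStagedWorld --name TEST --function-version 4
aws lambda create-alias --function-name helloStagedWorld --name PROD --function-version 3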

When you invoke a Lambda function, you can specify which alias to run. This means you choose to run either the DEV, TEST or PROD version of your code. Also, during execution your code can determine which alias is running by looking at the invoked function ARN, which is available in the context object that Lambda passes to your function code.

For example, the invoked function ARN for my DEV alias would be:

arn:aws:lambda:us-east-1:123456789012:function:helloStagedWorld:DEV

… and it can be accessed using the context object:

context.invokedFunctionArn
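
For example, here is a minimal Node.js sketch that extracts the alias from the ARN (assuming we default to $LATEST when the function is invoked without an alias):

// A qualified ARN has 8 colon-separated parts; the 8th one is the alias.
exports.handler = (event, context, callback) => {
    const arnParts = context.invokedFunctionArn.split(':');
    const stage = arnParts.length > 7 ? arnParts[7] : '$LATEST';
    console.log('Running in stage:', stage);
    callback(null, stage);
};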

So now I have versions, aliases and a way to access them in my function code. Since my function code knows which stage it is running in, the only thing I need is somewhere outside my function where I can put configuration values that vary according to the stage.

In this post I explore four ways of storing configuration and their price and performance:

  1. Use Lambda Environment Variables
  2. Store config files in an S3 bucket
  3. Store config files in a DynamoDB table
  4. Place my config files in the same package as my function code

Option 1) Use Lambda Environment Variables

AWS Lambda has a feature called Environment Variables. Environment Variables are key/value pairs that you assign to your function. You can set Environment Variables using the AWS Management Console, the CLI, the SDKs or CloudFormation templates.
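
For example, a variable can be set from the CLI like this (a sketch; the variable name and value are placeholders):

aws lambda update-function-configuration --function-name helloStagedWorld --environment 'Variables={DEV_s3bucket=mydevbucket}'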

Environment Variables are bound to a particular version of your function ($LATEST, 1, 2, 3, etc.); once a version is published, its variables are immutable.

[Image: Environment Variables]

When I move this version through my different stages (DEV -> TEST -> PROD), the values of these environment variables remain the same. Because variables are bound to the version, my immutable variable s3bucket=mylatestbucket will be carried over to DEV, TEST and PROD, which is not desirable. In other words, I don’t want to end up with my PROD stage having s3bucket=mylatestbucket; I want PROD to have s3bucket=myprodbucket.

That’s why I have to create one environment variable per stage and have a way to link the variable to that particular stage. Something like this:

[Image: Multiple Environment Variables]

This can be a problem if you have many variables and many stages. The ideal situation would be if AWS Lambda allowed us to link Environment Variables to Stages, not Versions.

Pros:

  • This is AWS’s built-in method, which means you can configure Environment Variables using CloudFormation, SAM (Serverless Application Model), the AWS console, CLI or SDK.
  • Low latency. Your function will access config values without having to fetch them from an external source.
  • You can encrypt variables using KMS.

Cons:

  • Environment Variables are linked to function versions, not aliases. This means each key/value pair is carried over across multiple stages, and you end up with Production stages having access to variables that are meant for other stages (such as Development or Test), or Test stages having access to Production values.
  • You have to define your environment variables for all stages upfront, which means you have to configure Production values while developing your function. This opens the door to human error during development that can affect Production operations.
  • While it’s true that you can use encryption helpers to encrypt the values that are displayed in the Lambda console, any developer of the function can remove a variable from the configuration and affect Production operations by mistake.
  • Overall, this solution is not as clean as it should be. You have to include stages in the variable names and write code to get the stage and variable name accordingly. It would have been much cleaner for Lambda to allow having variables linked to stages, not versions. This way you would just go to a particular stage (environment) and update variables accordingly, which is what environment variables are supposed to be about in the first place.

TIPS:

  • Enable encryption using KMS and encryption helpers in the console. This way you can restrict access to Production values to only certain IAM users. This is what the Lambda console looks like when I use encryption helpers to protect PROD values.

[Image: Encryption helper]

Here is some sample code:

See Lambda function code in GitHub
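
Here is also a minimal sketch of the lookup. The stage-prefixed variable names (DEV_s3bucket, etc.) are assumptions for illustration; the linked GitHub code may differ:

exports.handler = (event, context, callback) => {
    const arnParts = context.invokedFunctionArn.split(':');
    const stage = arnParts.length > 7 ? arnParts[7] : '$LATEST';

    // Environment variable names can't contain '$', so strip it for $LATEST.
    const prefix = stage.replace('$', '');
    const config = {
        s3bucket: process.env[prefix + '_s3bucket'],
        snstopic: process.env[prefix + '_snstopic']
    };
    callback(null, config);
};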

Option 2) Store configuration files in S3

For this solution we create a bucket that stores the application’s configuration and create one folder per stage: $LATEST, DEV, TEST, PROD.

[Image: S3 config bucket]

Why a $LATEST stage? You would typically point to DEV resources in the $LATEST stage, but it’s good to have the flexibility to point to other resources when you’re building a function. In some orgs, DEV resources are shared by developers and you don’t want to break other people’s work by tinkering with shared DEV configurations.

The next step is to create one config file per stage. For example, I create a file called env-config.json in the $LATEST folder of my config bucket, with the following entries:

{
    "s3bucket": "mylatestbucket",
    "snstopic": "mylatesttopic"
}

I make sure this config file has the same name in all stages (in this example, env-config.json). The values in it will change according to which stage we are accessing (DEV, TEST, PROD).
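
For example, the DEV copy could be uploaded like this (my-config-bucket is a placeholder name; each stage folder gets its own edited copy of env-config.json):

aws s3 cp env-config.json s3://my-config-bucket/DEV/env-config.json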

We then implement code that accesses the invoked function alias (using context.invokedFunctionArn) and, based on the alias value, determines the location of the config file to fetch from S3.

See Lambda function code in GitHub
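
Here is a minimal Node.js sketch of that logic (my-config-bucket is again a placeholder; the linked GitHub code may differ):

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
    const arnParts = context.invokedFunctionArn.split(':');
    const stage = arnParts.length > 7 ? arnParts[7] : '$LATEST';

    // Config files live in one folder per stage, e.g. DEV/env-config.json
    s3.getObject({ Bucket: 'my-config-bucket', Key: stage + '/env-config.json' }, (err, data) => {
        if (err) return callback(err);
        callback(null, JSON.parse(data.Body.toString('utf8')));
    });
};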

Pros:

  • Works well if you have many Lambda functions that use the same configuration. This way you can update a single config file and all your functions will be updated.
  • If you need to make config updates, they will be applied to separate files in each stage. Therefore you minimize the risk of breaking your PROD config due to a change in DEV or TEST.

Cons:

  • Every single Lambda function execution makes an extra API call to S3, which adds a bit of latency to your Lambda execution. See the performance results below.

TIPS:

  • Enable versioning in S3 for this bucket. This way it will be easier to roll back any config changes.
  • Set up bucket and IAM policies for each config file. For example, you can give your DEV alias permissions to access only the DEV config folder, the TEST alias to access only the TEST folder, and so on (see the policy sketch after this list). You can also control which users in your org have access to DEV, TEST and PROD configuration files.
  • Consider using KMS for S3 server-side encryption. This will give you additional confidence that any secrets are only accessed by the right entities.
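
A minimal IAM policy sketch for the DEV case (my-config-bucket is a placeholder):

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-config-bucket/DEV/*"
    }]
}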

Option 3) Store configuration values in DynamoDB

This is similar to Option 2, but instead of storing config values in S3 we store them in a DynamoDB table.

First we create a DynamoDB table. I am creating a table named lambda-config with a hash key named stage. I am assigning 5 read capacity units and 1 write capacity unit (since I won’t be writing to this table often). Using the AWS CLI I can create my table using the following command:

aws dynamodb create-table --table-name lambda-config --attribute-definitions AttributeName=stage,AttributeType=S --key-schema AttributeName=stage,KeyType=HASH --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=1

Now I will insert my config values into this table.

For $LATEST:

aws dynamodb put-item --table-name lambda-config --item '{"stage": {"S": "mykey_$LATEST"},"s3bucket": {"S": "myDEVS3bucket"},"snstopic": {"S": "myDEVsnstopic"}}'

For DEV:

aws dynamodb put-item --table-name lambda-config --item '{"stage": {"S": "mykey_DEV"},"s3bucket": {"S": "myDEVS3bucket"},"snstopic": {"S": "myDEVsnstopic"}}'

TEST:

aws dynamodb put-item --table-name lambda-config --item '{"stage": {"S": "mykey_TEST"},"s3bucket": {"S": "myTESTS3bucket"},"snstopic": {"S": "myTESTsnstopic"}}'

PROD:

aws dynamodb put-item --table-name lambda-config --item '{"stage": {"S": "mykey_PROD"},"s3bucket": {"S": "myPRODS3bucket"},"snstopic": {"S": "myPRODsnstopic"}}'

I should see something like this in the DynamoDB console:

[Image: DynamoDB config table items]

See Lambda function code in GitHub
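
Here is a minimal Node.js sketch of the lookup (the mykey_ prefix follows the put-item examples above; the linked GitHub code may differ):

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

exports.handler = (event, context, callback) => {
    const arnParts = context.invokedFunctionArn.split(':');
    const stage = arnParts.length > 7 ? arnParts[7] : '$LATEST';

    // Fetch the config item for this stage, e.g. key 'mykey_DEV'
    dynamodb.get({ TableName: 'lambda-config', Key: { stage: 'mykey_' + stage } }, (err, data) => {
        if (err) return callback(err);
        callback(null, data.Item); // { stage, s3bucket, snstopic }
    });
};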

Pros:

  • Similarly to the S3-backed method, updating a single config record in DynamoDB can update the configuration for multiple functions at once.
  • Better performance than S3-backed configurations (see the performance section below).

Cons:

  • Similarly to the S3-backed configuration, every single Lambda function execution makes an extra API call to DynamoDB, which adds a bit of latency to the function execution (see the performance section below).
  • There is a cost associated with having a DynamoDB table, but it’s not a considerable one. You would pay about $1 per month for a table with 5 read capacity units. If you needed, let’s say, 100 read capacity units, you would pay about $5/month.

TIPS:

  • You can assign fine-grained IAM permissions to each item in your config tables. For example, you can configure an IAM group to have read-only access to certain items in a table, or to a whole table, and a different IAM group to have write access. This way you control who can read or write items in your DEV, TEST and PROD configurations (see the policy sketch after this list).
  • You can use the AWS Key Management Service (KMS) to perform envelope encryption, so you can store your configuration records encrypted in your DynamoDB table. With KMS you can generate one data encryption key per record and store the encrypted data key that KMS generates for you in the corresponding DynamoDB record. Using KMS and IAM you can control which functions and aliases have access to which master keys. For more on envelope encryption, read this.
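
For instance, a fine-grained policy that only allows reading the DEV item might look like this sketch (the account ID and key value mirror the examples above):

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "dynamodb:GetItem",
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/lambda-config",
        "Condition": {
            "ForAllValues:StringEquals": { "dynamodb:LeadingKeys": ["mykey_DEV"] }
        }
    }]
}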

Option 4) Package configuration files with function code

For this solution we create a deployment package and store a configuration file in it. I created the following folder structure:

[Image: Packaged config folder structure]

Under config, I placed a JSON configuration file (env-config.json) with the following entries:

{
    "$LATEST": {
        "s3bucket": "mylatestbucket",
        "snstopic": "mylatesttopic"
    },
    "DEV": {
        "s3bucket": "mydevbucket",
        "snstopic": "mydevtopic"
    },
    "TEST": {
        "s3bucket": "mytestbucket",
        "snstopic": "myteststagetopic"
    },
    "PROD": {
        "s3bucket": "myprodbucket",
        "snstopic": "myprodtopic"
    }
}

The main difference is that here a single file contains entries for all stages, as opposed to one separate file per stage stored in S3. If you need to store API keys or credentials for an external service, this is definitely NOT a good solution: you would expose your PROD API keys in a file that any developer can access in the DEV stage (and you should never expose credentials anywhere in a code repository!).

See Lambda function code in GitHub
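
Here is a minimal Node.js sketch of reading the packaged file (assuming env-config.json sits in a config folder inside the deployment package; the linked GitHub code may differ):

// Loaded once per container, so subsequent invocations pay no file-read cost.
const config = require('./config/env-config.json');

exports.handler = (event, context, callback) => {
    const arnParts = context.invokedFunctionArn.split(':');
    const stage = arnParts.length > 7 ? arnParts[7] : '$LATEST';
    callback(null, config[stage]); // { s3bucket, snstopic }
};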

Pros:

  • Faster execution, since your function doesn’t have to call S3 or DynamoDB. If function completion time is critical to your application, you can cut about 70ms of execution time compared to the DynamoDB solution, or about 100ms compared to the S3-backed configuration approach. See the performance comparison below.
  • Cheaper to execute, since you don’t incur S3 or DynamoDB charges. However, the cost savings are not that significant (see the price comparison below).
  • If you are the only developer or it does not represent a security risk to expose PROD configs to all your developers, then it might be easier to maintain a single configuration file.

Cons:

  • If you have multiple functions that access the same resources, you will have to update all functions. For example, you have 10 functions that publish to the same TEST SNS topic and one day you have to use a different TEST topic. You’ll end up updating and deploying 10 function packages. Even if your main function code is not affected, you still need to redeploy your function package if you need to make a config change.
  • You will expose PROD config values in the same file that is accessed by DEV. If you store API keys in your config files, then I would highly recommend NOT using this method.

Performance Comparison

I ran a test for 10 minutes, sending 1 request per second. I used the same code included in this post. The only logic implemented in the functions is to grab configuration values and nothing else. The S3-backed configuration took an average of ~110ms, the DynamoDB-backed configuration took an average of ~70ms and the packaged configuration approach took an average of ~2ms. All resources (Lambda functions, S3 buckets, DynamoDB tables) were provisioned in the us-east-1 region. The metric shown below is the duration of each Lambda function execution.

[Image: Performance comparison]

Price Comparison

Using Matthew Fuller’s Lambda calculator, I did a bit of number crunching:

S3-backed configuration: 1 million executions, 128MB memory each

  • 200ms billable execution time per execution (an average of 110ms rounds up to 200ms billable time): $0.62
  • S3 charges: 1 million GET requests at $0.004 per 10,000 requests = $0.40.
  • Data transfer = $0 (since we are transferring from S3 to Lambda within the same region).
  • Total = $1.02 per 1 million requests

DynamoDB-backed configuration: 1 million executions, 128MB memory each

  • 100ms billable execution time per execution (average of 70ms rounds up to 100ms billable time): $0.41
  • DynamoDB charges (5 read, 1 write capacity units, eventually consistent): $0.97 per month
    • If I needed 100 read, 1 write, eventually consistent, I would pay $5.32/month

Packaged configuration, Environment Variables: 1 million executions with 128MB memory each

  • 100ms billable execution time per execution (an average of 3ms rounds up to 100ms billable time)
  • Total = $0.41 per 1 million requests.
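
For reference, the $0.41 figure breaks down as follows, using Lambda’s published prices of $0.20 per million requests and $0.00001667 per GB-second: 1 million invocations × 0.1s × 0.125GB = 12,500 GB-seconds, or about $0.21 of compute, plus $0.20 of request charges, for a total of about $0.41.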

With AWS Lambda you get 1 million free executions per month, even after your first year of AWS usage (the Lambda free tier does not expire after 12 months).

To summarize

You can use AWS Lambda versions and aliases to fetch configuration values stored outside your Lambda function code. Or you can include a config file inside a function package. It costs ~100ms to fetch config values from S3 and ~70ms to fetch config values from DynamoDB (within the same AWS region). If you use S3 or DynamoDB to store configuration values you can implement fine-grained IAM permissions to restrict which users in your account can read or write configuration values in each development stage.

Common advantages of separating code and configuration files:

  • Once we define the values in our config files for DEV, TEST and PROD, they rarely need to change, which lets us focus exclusively on application development.
  • The function logic doesn’t change whether we are running in DEV, TEST or PROD. This means we can move our code across stages and the only thing we need to change is versions/aliases. This makes our application code more reliable.
  • Code is easier to roll back if something goes wrong after a deployment. In such a case, you would only have to point your alias to the previous version to be back in business.
  • Deployment automation is much easier when code and configs are independent from each other.

Ernesto Marquez


I am the Project Director at Concurrency Labs Ltd, ex-Amazon (AWS), Certified AWS Solutions Architect and I want to help you run AWS optimally, so your applications reliably generate revenue for your business.

Running an optimal AWS infrastructure is complicated - that's why I follow a methodology that makes it simpler to run applications that will support your business growth.
