Creating AWS Lambda zip files with Pex

Update: See also my new tool that simplifies this.

So, you want to deploy a Python script to AWS Lambda and you have a few dependencies with native code. How do you build the .zip deployment package for it?

Let’s say that you have your script in lambda_function.py and your dependencies listed in requirements.txt. If you want to follow along at home, I’ve prepared a GitHub repo with an example script.

AWS Lambda’s documentation suggests a pip invocation like this to install the dependencies into a directory called package:

pip install \
  --platform manylinux2014_x86_64 \
  --target=package \
  --implementation cp \
  --python-version 3.12 \
  --only-binary=:all: --upgrade \
  -r requirements.txt

You can then create a .zip file like this:

cd package
zip -r ../package.zip .
cd ..
zip package.zip lambda_function.py

And that’s it: package.zip is your deployment artifact.
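
If you would rather not depend on the zip binary, the same two steps can be sketched with Python’s standard zipfile module. This is a minimal sketch, not anything from the AWS docs: the function name build_zip is made up for illustration, and it assumes the dependencies are already installed in the package directory.

```python
import zipfile
from pathlib import Path


def build_zip(package_dir, handler, out_path):
    """Zip the contents of package_dir plus the handler file at the archive root."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Rough equivalent of `cd package && zip -r ../package.zip .`
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(package_dir).as_posix())
        # Rough equivalent of `zip package.zip lambda_function.py`
        zf.write(handler, Path(handler).name)
    return out_path


# Example call (needs the files to exist):
# build_zip("package", "lambda_function.py", "package.zip")
```

The important detail is that both the dependencies and lambda_function.py end up at the root of the archive, not under a package/ prefix.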

However, the pip invocation above does not always result in a correct deployment package due to a shortcoming in pip. If your local environment does not match the AWS Lambda platform, the result may be wrong.
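
One way a mismatch can bite, as far as I understand pip’s behavior, is through environment markers: conditions like sys_platform == "linux" in a dependency’s metadata appear to be evaluated against the interpreter that runs pip, not against the --platform you pass. The sketch below prints a few of those values for your local interpreter so you can compare them with what Lambda would report. The function name is made up for illustration, and it uses only the standard library.

```python
import platform
import sys


def local_marker_values():
    """Return a few PEP 508 marker values as seen by the running interpreter."""
    return {
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_system": platform.system(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
    }


for name, value in local_marker_values().items():
    print(f"{name} = {value!r}")
```

On a Mac, for example, sys_platform will not be linux, so a dependency that is required only on Linux could be skipped.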

Hopefully the issue is fixed some day. Meanwhile, a common solution is to run the same command inside a Docker container. That works, but Docker on macOS is annoyingly slow. Wouldn’t it be great to have a correct solution without Docker?

Turns out pex offers one.

Let’s do it with pex

Pex is a tool for generating Python Executable files. It allows you to take a Python program and all its dependencies and wrap them into a single .pex file that can be executed with python. The idea is similar to the uberjars that are used to deploy Java and Clojure programs.

Wrapping a Python program and its dependencies into a single .zip file is exactly what we want for AWS Lambda, and pex’s new pex3 variant can do that. Unlike pip, it accepts “complete platform information” that lets it choose the right wheels.

You can install pex with pipx:

pipx install pex

Note that this installs two binaries, pex and pex3. They have different features and command-line interfaces. We will use pex3.

Getting the complete platform information

You can get the complete platform information for your local environment like this:

pex3 interpreter inspect --markers --tags

The result is a large JSON blob containing environment information such as Python version and the list of platform tags compatible with your environment.
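
To get a feel for what is in the blob, here is a small sketch that pulls out a few headline values. The key names marker_environment and compatible_tags are my assumption about the output format, so check them against your own output; the function name is made up for illustration.

```python
import json


def summarize_platform(blob):
    """Pull a few headline facts out of a complete-platform JSON document."""
    data = json.loads(blob)
    # Assumed key names; adjust if your pex3 output differs.
    markers = data.get("marker_environment", {})
    tags = data.get("compatible_tags", [])
    return {
        "python_full_version": markers.get("python_full_version"),
        "platform_machine": markers.get("platform_machine"),
        "tag_count": len(tags),
        "first_tag": tags[0] if tags else None,
    }
```

You could feed it the saved output of the command above, for example summarize_platform(open("local_platform.json").read()), where local_platform.json is whatever file you redirected the output to.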

However, we do not want the platform information for your laptop. We need it for your AWS Lambda environment. Huon Wilson offers a solution on the issue tracker of the Pants build system: upload the following code to AWS Lambda and run it.

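import subprocess

def lambda_handler(event, context):
    # Install pex into /tmp (the writable directory on Lambda), then run
    # pex3 there. Its JSON output goes to stdout and ends up in the logs.
    subprocess.run(
        """
        pip install --target=/tmp/subdir pex
        PYTHONPATH=/tmp/subdir /tmp/subdir/bin/pex3 interpreter inspect --markers --tags
        """,
        shell=True
    )
    # The response body is a placeholder; the useful output is in the logs.
    return {
        'statusCode': 200,
        'body': "{}",
    }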

Grab the result from the logs and store it in a file called complete_platform.json.
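
If the log output is noisy, you can fish the JSON out with a few lines of Python. This is a minimal sketch with a made-up function name. It assumes the JSON object arrives intact in the log text you copied, and it returns the first non-empty JSON object it finds.

```python
import json


def extract_json_object(log_text):
    """Return the first non-empty JSON object embedded in log_text."""
    decoder = json.JSONDecoder()
    start = log_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(log_text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and obj:
            return obj
        start = log_text.find("{", start + 1)
    raise ValueError("no JSON object found in log text")
```

After that, json.dump(extract_json_object(log_text), open("complete_platform.json", "w")) writes the file that the next step expects.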

It’s crude but effective. I’ve run the code on AWS Lambda for Python 3.12 on x86_64. You can see the result on GitHub.

I don’t know how often AWS changes their Python environment in a way that would require you to generate a new file. My guess is not very often.

Building the deployment package

pex3 venv create will build a Lambda-compatible zip file for you if you use --layout flat-zipped:

pex3 venv create \
  --layout flat-zipped \
  --dir package \
  --complete-platform complete_platform.json \
  --no-build \
  -r requirements.txt
zip package.zip lambda_function.py

And that’s it! Upload package.zip to AWS Lambda and try it out.
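
Before uploading, it can be worth a quick sanity check that the archive looks the way Lambda expects. The sketch below is only an illustration with a made-up function name: it checks that the handler file sits at the archive root and that no member of the archive is corrupt.

```python
import zipfile


def check_package(zip_path, handler="lambda_function.py"):
    """Return a list of problems found in a deployment zip (empty if none)."""
    problems = []
    with zipfile.ZipFile(zip_path) as zf:
        # The handler module must be at the top level of the archive.
        if handler not in zf.namelist():
            problems.append(f"{handler} is not at the archive root")
        # testzip() returns the name of the first corrupt member, or None.
        bad = zf.testzip()
        if bad is not None:
            problems.append(f"corrupt member: {bad}")
    return problems
```

An empty list from check_package("package.zip") means the basics are in place. It does not prove that the wheels inside match the Lambda platform.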

What if there are no pre-built wheels?

You might encounter a dependency with native code and no pre-built wheels. Unfortunately, Pex won’t magically set up a cross-compiling environment for you. In that case, Docker might really be the easiest way to get a consistent build environment.

An exercise for the reader

Wouldn’t it be cool if there was a simple tool that built AWS Lambda deployment packages quickly and correctly? Pip and pex and Docker get the job done, but it feels complicated.

Considering how popular both Python and AWS Lambda are, I’m surprised that there does not seem to be a popular tool that would just do it.

poetry bundle lambda, anyone?