In the Deployments tutorial, we looked at serving a flow, which enables scheduling and creating flow runs via the Prefect API.
With our Python script in hand, we can build a Docker image for our script, allowing us to serve our flow in various remote environments. We'll use Kubernetes in this guide, but you can use any Docker-compatible infrastructure.
In this guide we'll:
Write a Dockerfile to build an image that stores our Prefect flow code.
Build a Docker image for our flow.
Deploy and run our Docker image on a Kubernetes cluster.
Look at the Prefect-maintained Docker images and discuss options for their use.
Note that in this guide we'll create a Dockerfile from scratch. Alternatively, Prefect makes it convenient to build a Docker image as part of deployment creation. You can even include environment variables and specify additional Python packages to install at runtime.
If creating a deployment with a prefect.yaml file, the build step makes it easy to customize your Docker image and push it to the registry of your choice. See an example here.
Deployment creation with a Python script that includes flow.deploy similarly allows you to customize your Docker image with keyword arguments as shown below.
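For example, here's a minimal sketch of a deployment script using flow.deploy. The deployment name, work pool name, and image registry/tag are placeholders, and the extra-package environment variable is just one way to install additional dependencies at runtime:

from prefect import flow

@flow(log_prints=True)
def get_repo_info():
    print("Fetching repo info...")

if __name__ == "__main__":
    get_repo_info.deploy(
        name="my-docker-deployment",  # placeholder deployment name
        work_pool_name="my-docker-pool",  # assumes a Docker-type work pool already exists
        image="my-registry.com/prefect-docker-guide-image:latest",  # placeholder registry and tag
        push=True,  # push the built image to your registry
        job_variables={"env": {"EXTRA_PIP_PACKAGES": "httpx"}},  # install extra packages at runtime
    )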
The next file we'll add to the prefect-docker-guide directory is a requirements.txt. We'll include all dependencies required for our prefect-docker-guide-flow.py script in the Docker image we'll build.
# ensure you run this line from the top level of the `prefect-docker-guide` directory
touch requirements.txt
Here's what we'll put in our requirements.txt file:
requirements.txt
prefect>=2.12.0
httpx
Next, we'll create a Dockerfile that we'll use to build a Docker image that also stores our flow code.
touch Dockerfile
We'll add the following content to our Dockerfile:
Dockerfile
# We're using the latest version of Prefect with Python 3.10
FROM prefecthq/prefect:2-python3.10

# Add our requirements.txt file to the image and install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt --trusted-host pypi.python.org --no-cache-dir

# Add our flow code to the image
COPY flows /opt/prefect/flows

# Run our flow script when the container starts
CMD ["python", "flows/prefect-docker-guide-flow.py"]
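With the Dockerfile in place, we can build the image. The tag below matches the image name referenced later in this guide's Kubernetes manifest:

# ensure you run this line from the top level of the `prefect-docker-guide` directory
docker build -t prefect-docker-guide-image .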
Before we run the image, note that our container will need an API URL and network access to communicate with the Prefect API.
For this guide, we'll assume the Prefect API is running on the same machine that we'll run our container on and the Prefect API was started with prefect server start. If you're running a different setup, check out the Hosting a Prefect server guide for information on how to connect to your Prefect API instance.
To ensure that our flow container can communicate with the Prefect API, we'll set our PREFECT_API_URL to http://host.docker.internal:4200/api. If you're running Linux, you'll need to set your PREFECT_API_URL to http://localhost:4200/api and use the --network="host" option instead.
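With the networking details sorted out, we can run the container. Here's a sketch of the command; adjust the URL and flags for your setup as described above:

docker run -e PREFECT_API_URL=http://host.docker.internal:4200/api prefect-docker-guide-image

After running the above command, the container should start up and serve the flow within the container!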
To ensure the process serving our flow is always running, we'll create a Kubernetes deployment. If our flow's container ever crashes, Kubernetes will automatically restart it, ensuring that we won't miss any scheduled runs.
First, we'll create a deployment-manifest.yaml file in our prefect-docker-guide directory:
touch deployment-manifest.yaml
And we'll add the following content to our deployment-manifest.yaml file:
deployment-manifest.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prefect-docker-guide
spec:
  replicas: 1
  selector:
    matchLabels:
      flow: get-repo-info
  template:
    metadata:
      labels:
        flow: get-repo-info
    spec:
      containers:
      - name: flow-container
        image: prefect-docker-guide-image:latest
        env:
        - name: PREFECT_API_URL
          value: YOUR_PREFECT_API_URL
        - name: PREFECT_API_KEY
          value: YOUR_API_KEY
        # Never pull the image because we're using a local image
        imagePullPolicy: Never
Keep your API key secret
In the above manifest we are passing in the Prefect API URL and API key as environment variables. This approach is simple, but it is not secure. If you are deploying your flow to a remote cluster, you should use a Kubernetes secret to store your API key.
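For example, after creating a secret with kubectl create secret generic prefect-api-key --from-literal=key=YOUR_API_KEY, the API key environment variable in the manifest could reference it like this (a sketch; the secret and key names are placeholders):

env:
- name: PREFECT_API_KEY
  valueFrom:
    secretKeyRef:
      name: prefect-api-key  # placeholder secret name
      key: key  # placeholder key name within the secret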
If you're connecting to a self-hosted Prefect server instead, you can omit the PREFECT_API_KEY environment variable and point PREFECT_API_URL at your server:
deployment-manifest.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prefect-docker-guide
spec:
  replicas: 1
  selector:
    matchLabels:
      flow: get-repo-info
  template:
    metadata:
      labels:
        flow: get-repo-info
    spec:
      containers:
      - name: flow-container
        image: prefect-docker-guide-image:latest
        env:
        - name: PREFECT_API_URL
          value: http://host.docker.internal:4200/api
        # Never pull the image because we're using a local image
        imagePullPolicy: Never
Linux users
If you're running Linux, you'll need to set your PREFECT_API_URL to use the IP address of your machine instead of host.docker.internal.
This manifest defines how our image will run when deployed to our Kubernetes cluster. Note that we'll be running a single replica of our flow container. If you need multiple replicas to keep up with an active schedule, or because your flow is resource-intensive, you can increase the replicas value.
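With the manifest ready, we can apply it to the cluster and tail the flow container's logs using the label from the manifest (assuming kubectl is configured to point at your cluster):

kubectl apply -f deployment-manifest.yaml

kubectl logs -l flow=get-repo-info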
When a release is published, images are built for all of Prefect's supported Python versions.
These images are tagged to identify the combination of Prefect and Python versions contained.
Additionally, we have "convenience" tags which are updated with each release to facilitate automatic updates.
For example, when release 2.11.5 is published:
Images with the release packaged are built for each supported Python version (3.8, 3.9, 3.10, 3.11) with both standard Python and Conda.
These images are tagged with the full description, e.g. prefect:2.11.5-python3.10 and prefect:2.11.5-python3.10-conda.
For users that want more specific pins, these images are also tagged with the SHA of the git commit of the release, e.g. sha-88a7ff17a3435ec33c95c0323b8f05d7b9f3f6d2-python3.10.
For users that want to be on the latest 2.11.x release, receiving patch updates, we update a tag without the patch version to this release, e.g. prefect:2.11-python3.10.
For users that want to be on the latest 2.x.y release, receiving minor version updates, we update a tag without the minor or patch version to this release, e.g. prefect:2-python3.10.
Finally, for users who want the latest 2.x.y release without specifying a Python version, we update 2-latest to the image for our highest supported Python version, which in this case would be equivalent to prefect:2.11.5-python3.11.
Choose image versions carefully
It's a good practice to use Docker images with specific Prefect versions in production.
Use care when employing images that automatically update to new versions (such as prefecthq/prefect:2-python3.11 or prefecthq/prefect:2-latest).
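For example, pinning your Dockerfile's base image to an exact Prefect and Python version (using the release from the example above) might look like:

FROM prefecthq/prefect:2.11.5-python3.10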
If your flow relies on dependencies not found in the default prefecthq/prefect images, you may want to build your own image. You can either base it off of one of the provided prefecthq/prefect images, or build it from scratch.
See the Work pool deployment guide for discussion of how Prefect can help you build custom images with dependencies specified in a requirements.txt file.
By default, Prefect work pools that use containers refer to the 2-latest image.
You can specify another image at work pool creation.
The work pool image choice can be overridden in individual deployments.
The options described above have different complexity (and performance) characteristics. For choosing a strategy, we provide the following recommendations:
If your flow only makes use of tasks defined in the same file as the flow, or tasks that are part of prefect itself, then you can rely on the default provided prefecthq/prefect image.
If your flow requires a few extra dependencies found on PyPI, you can use the default prefecthq/prefect image and set prefect.deployments.steps.pip_install_requirements in the pull step to install these dependencies at runtime (see the example after this list).
If the installation process requires compiling code or other expensive operations, you may be better off building a custom image instead.
If your flow (or flows) require extra dependencies or shared libraries, we recommend building a shared custom image with all the extra dependencies and shared task definitions you need. Your flows can then all rely on the same image, but have their source stored externally. This option can ease development, as the shared image only needs to be rebuilt when dependencies change, not when the flow source changes.
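As a sketch of the pull-step approach mentioned above, the pull section of a prefect.yaml might look like the following. The repository URL and the step id are placeholders:

pull:
- prefect.deployments.steps.git_clone:
    id: clone-step  # hypothetical id, referenced by the next step
    repository: https://github.com/org/repo.git  # placeholder repository
- prefect.deployments.steps.pip_install_requirements:
    directory: "{{ clone-step.directory }}"  # install inside the cloned project
    requirements_file: requirements.txt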
To learn more about deploying flows, check out the Deployments concept doc!
For advanced infrastructure requirements, such as executing each flow run within its own dedicated Docker container, learn more in the Work pool deployment guide.