Run flows on Kubernetes#

This guide explains how to run flows on Kubernetes. While much of it applies to any Kubernetes cluster, the examples cover Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Azure Kubernetes Service (AKS). Prefect is tested against Kubernetes 1.26.0 and newer minor versions.

Prerequisites#

  1. A Prefect Cloud account

  2. A cloud provider (AWS, GCP, or Azure) account

  3. Python and Prefect installed

  4. Helm installed

  5. The Kubernetes CLI (kubectl) installed

  6. Admin access for Prefect Cloud and your cloud provider (you can downgrade it after this setup)

Create a cluster#

If you already have a cluster, skip ahead to the next section.

One easy way to get set up with a cluster in EKS is with [`eksctl`](https://eksctl.io/).
Node pools can be backed by either EC2 instances or Fargate.
Choosing Fargate leaves less infrastructure for you to manage.
The following command takes around 15 minutes and must not be interrupted:

```bash
# Replace the cluster name with your own value
eksctl create cluster --fargate --name <CLUSTER-NAME>

# Authenticate to the cluster.
aws eks update-kubeconfig --name <CLUSTER-NAME>
```
On GCP, you can get a GKE cluster up and running with a few commands using the 
[`gcloud` CLI](https://cloud.google.com/sdk/docs/install).
This builds a bare-bones cluster that is accessible over the open 
internet; it should **not** be used in a production environment.
To deploy the cluster, your project must have a VPC network configured.

First, authenticate to GCP by setting the following configuration options:

```bash
# Authenticate to gcloud
gcloud auth login

# Specify the project & zone to deploy the cluster to
# Replace the project name with your GCP project name
gcloud config set project <GCP-PROJECT-NAME>
gcloud config set compute/zone <AVAILABILITY-ZONE>
```

Next, deploy the cluster. This command takes ~15 minutes to complete.
Once the cluster has been created, authenticate to the cluster.

```bash
# Create cluster
# Replace the cluster name with your own value
gcloud container clusters create <CLUSTER-NAME> --num-nodes=1 \
--machine-type=n1-standard-2

# Authenticate to the cluster
gcloud container clusters get-credentials <CLUSTER-NAME> --zone <AVAILABILITY-ZONE>
```
**GCP potential errors**

`ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=Service account "000000000000-compute@developer.gserviceaccount.com" is disabled.`
  • Enable the default service account in the IAM console, or specify a different service account with the appropriate permissions.

`creation failed: Constraint constraints/compute.vmExternalIpAccess violated for project 000000000000. Add instance projects/<GCP-PROJECT-NAME>/zones/us-east1-b/instances/gke-gke-guide-1-default-pool-c369c84d-wcfl to the constraint to use external IP with it.`
  • An organization policy blocks the creation of external (public) IPs. Override this policy (if you have the appropriate permissions) on the Organization Policies page within IAM.

Create an AKS cluster using the 
[Azure CLI](https://learn.microsoft.com/en-us/cli/azure/get-started-with-azure-cli), 
or use the Cloud Shell directly from the Azure portal [shell.azure.com](https://shell.azure.com).

First, authenticate to Azure if not already done.

```bash
az login
```

Next, deploy the cluster - this command takes ~4 minutes to complete. 
Once the cluster is created, authenticate to the cluster.

```bash
# Create a resource group at the desired location, e.g. westus
az group create --name <RESOURCE-GROUP-NAME> --location <LOCATION>

# Create a Kubernetes cluster with the default Kubernetes version, default SKU
# load balancer (Standard), and default VM set type (VirtualMachineScaleSets)
az aks create --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME>

# Configure kubectl to connect to your Kubernetes cluster
az aks get-credentials --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME>

# Verify the connection by listing the cluster nodes
kubectl get nodes
```

Create a container registry#

Besides a cluster, the other critical resource is a container registry. A registry is not strictly required, but in most cases you’ll want to use custom images and/or have more control over where images are stored. If you already have a registry, skip ahead to the next section.

Create a registry using the AWS CLI and authenticate the docker daemon to 
that registry:

```bash
# Replace the image name with your own value
aws ecr create-repository --repository-name <IMAGE-NAME>

# Login to ECR
# Replace the region and account ID with your own values
aws ecr get-login-password --region <REGION> | docker login \
  --username AWS --password-stdin <AWS_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com
```
Create a registry using the gcloud CLI and authenticate the docker daemon to that registry:
```bash
# Create artifact registry repository to host your custom image
# Replace the repository name with your own value; it can be the
# same name as your image
gcloud artifacts repositories create <REPOSITORY-NAME> \
--repository-format=docker --location=us

# Authenticate to artifact registry
gcloud auth configure-docker us-docker.pkg.dev
```
Create a registry using the Azure CLI and authenticate the docker daemon to that registry:
```bash
# Name must be a lower-case alphanumeric
# Tier SKU can easily be updated later, e.g. az acr update --name <REPOSITORY-NAME> --sku Standard
az acr create --resource-group <RESOURCE-GROUP-NAME> \
  --name <REPOSITORY-NAME> \
  --sku Basic

# Attach ACR to AKS cluster
# You need Owner, Account Administrator, or Co-Administrator role on your Azure subscription as per Azure docs
az aks update --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME> --attach-acr <REPOSITORY-NAME>

# You can verify AKS can now reach ACR
az aks check-acr --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME> --acr <REPOSITORY-NAME>.azurecr.io
```

Create a Kubernetes work pool#

Work pools allow you to manage deployment infrastructure. This section shows you how to configure the default values for your Kubernetes base job template. These values can be overridden by individual deployments.

Switch to the Prefect Cloud UI to create a new Kubernetes work pool. (Alternatively, you can create a work pool with the Prefect CLI; see the example after these steps.)

  1. Click on the Work Pools tab on the left sidebar

  2. Click the + button at the top of the page

  3. Select Kubernetes as the work pool type

  4. Click Next to configure the work pool settings

  5. Set the namespace field to `prefect`

If you set a different namespace, use your selected namespace instead of `prefect` in all commands below.

You may come back to this page to configure the work pool options at any time.
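For reference, a roughly equivalent Prefect CLI command looks like this (the pool name is up to you; this sketch assumes you call it `kubernetes` to match the rest of the guide):

```bash
# Create a Kubernetes-type work pool named "kubernetes"
prefect work-pool create kubernetes --type kubernetes
```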

Configure work pool options#

Here are some popular configuration options.

Environment Variables

Add environment variables to set when starting a flow run. If you are using a Prefect-maintained image and haven't overridden the image's entrypoint, you can specify Python packages to install at runtime with {"EXTRA_PIP_PACKAGES":"my_package"}. For example, {"EXTRA_PIP_PACKAGES":"pandas==1.2.3"} installs pandas version 1.2.3. Alternatively, you can specify package installation in a custom Dockerfile, which allows you to take advantage of image caching. As shown below, Prefect can help create a Dockerfile that bakes in your flow code and the packages specified in a requirements.txt file.
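For example, the work pool's environment variables field could be set to the following JSON (the package pin comes from the example above):

```json
{
  "EXTRA_PIP_PACKAGES": "pandas==1.2.3"
}
```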

Namespace

Set the Kubernetes namespace to create jobs within, such as prefect. Defaults to default.

Image

Specify the Docker container image for created jobs. If not set, the latest Prefect 3 image is used (for example, prefecthq/prefect:3-latest). You can override this on each deployment through job_variables.

Image Pull Policy

Select from the dropdown options to specify when to pull the image. When using the IfNotPresent policy, make sure to use unique image tags, or
old images may get cached on your nodes.

Finished Job TTL

Number of seconds before finished jobs are automatically cleaned up by the Kubernetes controller. Set to 60 so completed flow runs are cleaned up after a minute.

Pod Watch Timeout Seconds

Number of seconds for pod creation to complete before timing out. Consider setting to 300, especially if using a serverless type node pool, as these tend to have longer startup times.

Kubernetes cluster config

Specify a KubernetesClusterConfig block to configure the Kubernetes cluster for job creation. In most cases, leave the cluster config blank since the worker should already have appropriate access and permissions. We recommend using this setting when deploying a worker to a cluster that differs from the one executing the flow runs.

**Advanced Settings**
Modify the default base job template to add other fields or delete existing 
fields.

Select the **Advanced** tab and edit the JSON representation of the base job template.

For example, to set a CPU request, add the following section under `variables`:

```json
"cpu_request": {
  "title": "CPU Request",
  "description": "The CPU allocation to request for this pod.",
  "default": "default",
  "type": "string"
},
```

Next add the following to the first `containers` item under `job_configuration`:

```json
...
"containers": [
  {
    ...,
    "resources": {
      "requests": {
        "cpu": "{{ cpu_request }}"
      }
    }
  }
],
...
```

Running deployments with this work pool will request the specified CPU.
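An individual deployment can then override the new variable through its job variables. For example, in prefect.yaml (the deployment name and CPU value here are illustrative):

```yaml
deployments:
- name: "high-cpu"
  entrypoint: "flows/hello.py:hello"
  work_pool:
    name: "kubernetes"
    job_variables:
      cpu_request: "500m"
```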

After configuring the work pool settings, move to the next screen.

Give the work pool a name and save.

Your new Kubernetes work pool should appear in the list of work pools.

Create a Prefect Cloud API key#

If you already have a Prefect Cloud API key, you can skip these steps.

To create a Prefect Cloud API key:

  1. Log in to the Prefect Cloud UI.

  2. Click on your profile avatar picture in the top right corner.

  3. Click on your name to go to your profile settings.

  4. In the left sidebar, click on API Keys.

  5. Click the + button to create a new API key.

  6. Securely store the API key, ideally using a password manager.

Deploy a worker using Helm#

After you create a cluster and work pool, the next step is to deploy a worker. The worker sets up the necessary Kubernetes infrastructure to run your flows. The recommended method for deploying a worker is with the Prefect Helm Chart.

Add the Prefect Helm repository#

Add the Prefect Helm repository to your Helm client:

```bash
helm repo add prefect https://prefecthq.github.io/prefect-helm
helm repo update
```

Create a namespace#

Create a new namespace in your Kubernetes cluster to deploy the Prefect worker:

```bash
kubectl create namespace prefect
```

Create a Kubernetes secret for the Prefect API key#

```bash
kubectl create secret generic prefect-api-key \
  --namespace=prefect --from-literal=key=your-prefect-cloud-api-key
```

Configure Helm chart values#

Create a values.yaml file to customize the Prefect worker configuration. Add the following contents to the file:

```yaml
worker:
  cloudApiConfig:
    accountId: <target account ID>
    workspaceId: <target workspace ID>
  config:
    workPool: <target work pool name>
```

These settings ensure that the worker connects to the proper account, workspace, and work pool.

View your Account ID and Workspace ID in your browser URL when logged into Prefect Cloud. For example: <https://app.prefect.cloud/account/abc-my-account-id-is-here/workspaces/123-my-workspace-id-is-here>.

Create a Helm release#

Install the Prefect worker using the Helm chart with your custom values.yaml file:

```bash
helm install prefect-worker prefect/prefect-worker \
  --namespace=prefect \
  -f values.yaml
```

Verify deployment#

Check the status of your Prefect worker deployment:

```bash
kubectl get pods -n prefect
```
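You should see the worker pod in a Running state. To inspect the worker's startup logs, you can tail the deployment created by the chart (this assumes the Helm release above produced a deployment named prefect-worker):

```bash
# Tail the worker logs to confirm it connected to your work pool
kubectl logs -n prefect deploy/prefect-worker
```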

Define a flow#

Start simple with a flow that just logs a message. In a directory named flows, create a file named hello.py with the following contents:

```python
from prefect import flow, tags
from prefect.logging import get_run_logger

@flow
def hello(name: str = "Marvin"):
    logger = get_run_logger()
    logger.info(f"Hello, {name}!")

if __name__ == "__main__":
    with tags("local"):
        hello()
```

Run the flow locally with `python hello.py` to verify that it works. The `tags` context manager tags the flow run as local. This step is not required, but it does add some helpful metadata.

Define a Prefect deployment#

Prefect has two recommended options for creating a deployment with dynamic infrastructure. You can define a deployment in a Python script using the flow.deploy mechanics or in a prefect.yaml definition file. The prefect.yaml file currently allows for more customization in terms of push and pull steps.

To learn about the Python deployment creation method with flow.deploy see Workers.
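As a minimal sketch, a flow.deploy version of this guide's deployment could look like the following (the image name is a placeholder for your registry):

```python
from prefect import flow

@flow
def hello(name: str = "Marvin"):
    print(f"Hello, {name}!")

if __name__ == "__main__":
    # Builds and pushes the image, then registers the deployment
    hello.deploy(
        name="default",
        work_pool_name="kubernetes",
        image="<YOUR-REGISTRY>/<IMAGE-NAME>:latest",
    )
```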

The prefect.yaml file is used by the prefect deploy command to deploy your flows. As a part of that process it also builds and pushes your image. Create a new file named prefect.yaml with the following contents:

```yaml
# Generic metadata about this project
name: flows
prefect-version: 3.0.0

# build section allows you to manage and build docker images
build:
- prefect_docker.deployments.steps.build_docker_image:
    id: build-image
    requires: prefect-docker>=0.4.0
    image_name: "{{ $PREFECT_IMAGE_NAME }}"
    tag: latest
    dockerfile: auto
    platform: "linux/amd64"

# push section allows you to manage if and how this project is uploaded to remote locations
push:
- prefect_docker.deployments.steps.push_docker_image:
    requires: prefect-docker>=0.4.0
    image_name: "{{ build-image.image_name }}"
    tag: "{{ build-image.tag }}"

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /opt/prefect/flows

# the definitions section allows you to define reusable components for your deployments
definitions:
  tags: &common_tags
    - "eks"
  work_pool: &common_work_pool
    name: "kubernetes"
    job_variables:
      image: "{{ build-image.image }}"

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: "default"
  tags: *common_tags
  schedule: null
  entrypoint: "flows/hello.py:hello"
  work_pool: *common_work_pool

- name: "arthur"
  tags: *common_tags
  schedule: null
  entrypoint: "flows/hello.py:hello"
  parameters:
    name: "Arthur"
  work_pool: *common_work_pool
```

We define two deployments of the hello flow: default and arthur. By specifying dockerfile: auto, Prefect automatically creates a Dockerfile that installs any requirements.txt and copies the current directory into the image.

You can pass a custom Dockerfile instead with dockerfile: Dockerfile or dockerfile: path/to/Dockerfile. The image is built specifically for the linux/amd64 platform; this specification is often necessary when images are built on Macs with M series chips but run on cloud provider instances.
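If you do supply your own Dockerfile, a minimal sketch could look like this (the base image tag and paths are assumptions chosen to match the pull step above, not requirements):

```dockerfile
FROM prefecthq/prefect:3-latest

# Install dependencies first so this layer is cached across rebuilds
COPY flows/requirements.txt /opt/prefect/requirements.txt
RUN pip install -r /opt/prefect/requirements.txt

# Copy the flow code to the directory referenced by the pull step
COPY flows /opt/prefect/flows
```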

**Deployment specific build, push, and pull**
You can override the build, push, and pull steps for each deployment.
This allows for more custom behavior, such as specifying a different image for each 
deployment.

Define your requirements in a requirements.txt file:

```
prefect>=3.0.0
prefect-docker>=0.4.0
prefect-kubernetes>=0.3.1
```

The directory should now look something like this:

```
.
├── prefect.yaml
└── flows
    ├── requirements.txt
    └── hello.py
```

Tag images with a Git SHA#

If your code is stored in a GitHub repository, it's good practice to tag your images with the Git SHA of the code used to build them. Do this in the prefect.yaml file with a few minor modifications, since it's not yet an option with the Python deployment creation method.

Use the run_shell_script command to grab the SHA and pass it to the tag parameter of build_docker_image:

```yaml
build:
- prefect.deployments.steps.run_shell_script:
    id: get-commit-hash
    script: git rev-parse --short HEAD
    stream_output: false
- prefect_docker.deployments.steps.build_docker_image:
    id: build-image
    requires: prefect-docker>=0.4.0
    image_name: "{{ $PREFECT_IMAGE_NAME }}"
    tag: "{{ get-commit-hash.stdout }}"
    dockerfile: auto
    platform: "linux/amd64"
```

Set the SHA as a tag for easy identification in the UI:

```yaml
definitions:
  tags: &common_tags
    - "eks"
    - "{{ get-commit-hash.stdout }}"
  work_pool: &common_work_pool
    name: "kubernetes"
    job_variables:
      image: "{{ build-image.image }}"
```

Authenticate to Prefect#

Before deploying the flows to Prefect, you need to authenticate through the Prefect CLI. You also need to ensure that all of your flow’s dependencies are present at deploy time.

This example uses a virtual environment to ensure consistency across environments.

```bash
# Create a virtualenv & activate it
virtualenv prefect-demo
source prefect-demo/bin/activate

# Install dependencies of your flow
prefect-demo/bin/pip install -r requirements.txt

# Authenticate to Prefect & select the appropriate
# workspace to deploy your flows to
prefect-demo/bin/prefect cloud login
```

Deploy the flows#

You're ready to deploy your flows and build your images. The image name determines which registry it ends up in. You have configured your prefect.yaml file to get the image name from the PREFECT_IMAGE_NAME environment variable, so set that first:

```bash
# For AWS ECR
export PREFECT_IMAGE_NAME=<AWS_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<IMAGE-NAME>

# For GCP Artifact Registry
export PREFECT_IMAGE_NAME=us-docker.pkg.dev/<GCP-PROJECT-NAME>/<REPOSITORY-NAME>/<IMAGE-NAME>

# For Azure Container Registry
export PREFECT_IMAGE_NAME=<REPOSITORY-NAME>.azurecr.io/<IMAGE-NAME>
```

To deploy your flows, ensure your Docker daemon is running, then deploy all of them with prefect deploy --all or deploy them individually by name:
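```bash
# Deploy every deployment defined in prefect.yaml
prefect deploy --all

# Or deploy deployments individually by name
prefect deploy -n hello/default
prefect deploy -n hello/arthur
```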

Run the flows#

Once the deployments are successfully created, you can run them from the UI or the CLI:

```bash
prefect deployment run hello/default
prefect deployment run hello/arthur
```

You can now check the status of your two deployments in the UI.