Learn about workers#
Workers are lightweight polling services that retrieve scheduled runs from a work pool and execute them.
Workers each have a type corresponding to the execution environment to submit flow runs to. Workers can only poll work pools that match their type. As a result, when deployments are assigned to a work pool, you know in which execution environment scheduled flow runs for that deployment will run.
The following diagram summarizes the architecture of a worker-based work pool deployment:
%%{
init: {
'theme': 'neutral',
'themeVariables': {
'margin': '10px'
}
}
}%%
flowchart TD
subgraph your_infra["Your Execution Environment"]
worker["Worker"]
subgraph flow_run_infra[Infrastructure]
flow_run_a(("Flow Run A"))
end
subgraph flow_run_infra_2[Infrastructure]
flow_run_b(("Flow Run B"))
end
end
subgraph api["Prefect API"]
Deployment --> |assigned to| work_pool
work_pool(["Work Pool"])
end
worker --> |polls| work_pool
worker --> |creates| flow_run_infra
worker --> |creates| flow_run_infra_2
The worker is in charge of provisioning the flow run infrastructure.
Worker types#
Below is a list of available worker types. Most worker types require installation of an additional package.
Worker Type |
Description |
Required Package |
---|---|---|
Executes flow runs in subprocesses |
||
Executes flow runs as Kubernetes jobs |
|
|
Executes flow runs within Docker containers |
|
|
Executes flow runs as ECS tasks |
|
|
Executes flow runs as Google Cloud Run jobs |
|
|
Executes flow runs as Google Cloud Vertex AI jobs |
|
|
Execute flow runs in ACI containers |
|
If you don’t see a worker type that meets your needs, consider developing a new worker type.
Worker options#
Workers poll for work from one or more queues within a work pool. If the worker references a work queue that doesn’t exist, it is created automatically. The worker CLI infers the worker type from the work pool. Alternatively, you can specify the worker type explicitly. If you supply the worker type to the worker CLI, a work pool is created automatically if it doesn’t exist (using default job settings).
Configuration parameters you can specify when starting a worker include:
Option |
Description |
---|---|
|
The name to give to the started worker. If not provided, a unique name will be generated. |
|
The work pool the started worker should poll. |
|
One or more work queue names for the worker to pull from. If not provided, the worker pulls from all work queues in the work pool. |
|
The type of worker to start. If not provided, the worker type is inferred from the work pool. |
|
The amount of time before a flow run’s scheduled start time to begin submission. Default is the value of |
|
Only run worker polling once. By default, the worker runs forever. |
|
The maximum number of flow runs to start simultaneously. |
|
Start a healthcheck server for the worker. |
|
Install policy to use workers from Prefect integration packages. |
You must start a worker within an environment to access or create the required infrastructure to execute flow runs.
The worker will deploy flow runs to the infrastructure corresponding to the worker type. For example, if you start a worker with
type kubernetes
, the worker deploys flow runs to a Kubernetes cluster.
Prefect must be installed in any environment (for example, virtual environment, Docker container) where you intend to run the worker or execute a flow run.
`PREFECT_API_URL` must be set for the environment where your worker is running. You must also have a user or service account
with the `Worker` role, which you can configure by setting the `PREFECT_API_KEY`.
Worker status#
Workers have two statuses: ONLINE
and OFFLINE
. A worker is online if it sends regular heartbeat messages to the Prefect API.
If a worker misses three heartbeats, it is considered offline. By default, a worker is considered offline a maximum of 90 seconds
after it stopped sending heartbeats, but you can configure the threshold with the PREFECT_WORKER_HEARTBEAT_SECONDS
setting.
Worker logs#
Workers send logs to the Prefect Cloud API if you’re connected to Prefect Cloud.
All worker logs are automatically sent to the Prefect Cloud API
Logs are accessible through both the Prefect Cloud UI and API
Each flow run will include a link to its associated worker’s logs
Worker details#
The Worker Details page shows you three key areas of information:
Worker status
Installed Prefect version
Installed Prefect integrations (e.g.,
prefect-aws
,prefect-gcp
)Live worker logs (if worker logging is enabled)
Access a worker’s details by clicking on the worker’s name in the Work Pool list.
Start a worker#
Use the prefect worker start
CLI command to start a worker. You must pass at least the work pool name.
If the work pool does not exist, it will be created if the --type
flag is used.
prefect worker start -p [work pool name]
For example:
prefect worker start -p "my-pool"
Results in output like this:
Discovered worker type 'process' for work pool 'my-pool'.
Worker 'ProcessWorker 65716280-96f8-420b-9300-7e94417f2673' started!
In this case, Prefect automatically discovered the worker type from the work pool.
To create a work pool and start a worker in one command, use the --type
flag:
prefect worker start -p "my-pool" --type "process"
Worker 'ProcessWorker d24f3768-62a9-4141-9480-a056b9539a25' started!
06:57:53.289 | INFO | prefect.worker.process.processworker d24f3768-62a9-4141-9480-a056b9539a25 - Worker pool 'my-pool' created.
In addition, workers can limit the number of flow runs to start simultaneously with the --limit
flag.
For example, to limit a worker to five concurrent flow runs:
prefect worker start --pool "my-pool" --limit 5
Configure prefetch#
By default, the worker submits flow runs 10 seconds before they are scheduled to run. This allows time for the infrastructure to be created so the flow run can start on time.
In some cases, infrastructure takes longer than 10 seconds to start the flow run. You can increase the prefetch time with the
--prefetch-seconds
option or the PREFECT_WORKER_PREFETCH_SECONDS
setting.
If this value is more than the amount of time it takes for the infrastructure to start, the flow run will wait until its scheduled start time.
Polling for work#
Workers poll for work every 15 seconds by default. You can configure this interval in your profile settings
with the
PREFECT_WORKER_QUERY_SECONDS
setting.
Install policy#
The Prefect CLI can install the required package for Prefect-maintained worker types automatically. Configure this behavior
with the --install-policy
option. The following are valid install policies:
Install Policy |
Description |
---|---|
|
Always install the required package. Updates the required package to the most recent version if already installed. |
|
Install the required package if it is not already installed. |
|
Never install the required package. |
|
Prompt the user to choose whether to install the required package. This is the default install policy. |
If |
Additional resources#
See how to daemonize a Prefect worker.
See more information on overriding a work pool’s job variables.