How do I set a custom gunicorn worker timeout when serving an MLflow model with the "mlflow models s

Publish date: 2024-06-16

When serving an MLflow Python model with the "pyfunc" backend (https://github.com/mlflow/mlflow/blob/master/mlflow/pyfunc/backend.py), how can I set a custom gunicorn worker timeout? The default timeout of 60 seconds may be insufficient when serving large models that take a long time to load.

2 Answers

As of MLflow 1.2, you can set a custom gunicorn timeout by specifying the GUNICORN_CMD_ARGS environment variable. The following example serves a model with a worker timeout of 120 seconds:

GUNICORN_CMD_ARGS="--timeout 120" mlflow models serve --model-uri /path/to/model
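If GUNICORN_CMD_ARGS is already set in your environment (for example to change the worker count), assigning it inline as above replaces those options. Here is a minimal sketch of a helper that keeps the existing value and appends the timeout; the function name gunicorn_args_with_timeout and its default of 120 are hypothetical, not part of MLflow or gunicorn.

```shell
# Hypothetical helper (not an MLflow or gunicorn command): print the current
# GUNICORN_CMD_ARGS value with a --timeout option appended to it.
gunicorn_args_with_timeout() {
  # First argument is the timeout in seconds; 120 is an illustrative default.
  timeout_seconds="${1:-120}"
  # Emit the existing options (if any) followed by a space, then the timeout.
  printf '%s' "${GUNICORN_CMD_ARGS:+${GUNICORN_CMD_ARGS} }--timeout ${timeout_seconds}"
}

# Usage (left commented out; /path/to/model is the placeholder used above):
# GUNICORN_CMD_ARGS="$(gunicorn_args_with_timeout 300)" mlflow models serve --model-uri /path/to/model
```

The helper only builds the string, so you can echo its output to inspect it before starting the server.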

MLflow allows setting this option directly from the CLI:

Example: mlflow models serve ... --timeout 180

Official documentation (mlflow models serve --help):

-t, --timeout TEXT Timeout in seconds to serve a request (default: 60).
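Since this answer does not say which MLflow release added the flag, it may be worth checking the help text of your installed version before relying on it. A small sketch, assuming the option appears in the help output as shown above; the function name has_timeout_flag is hypothetical.

```shell
# Hypothetical helper: read help text on stdin and succeed (exit status 0)
# only if a --timeout option is listed somewhere in it.
has_timeout_flag() {
  # The bare "--" stops grep from treating the pattern as one of its own options.
  grep -q -- '--timeout'
}

# Usage (left commented out; requires mlflow to be installed):
# mlflow models serve --help | has_timeout_flag && echo "--timeout is supported"
```

If the check fails, the GUNICORN_CMD_ARGS approach from the other answer remains available.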
