
Welcome to Happy_vLLM

Happy_vLLM is a production-ready REST API built on top of the popular vLLM library, for which it provides an API.

Installation

You can install happy_vLLM using pip:

pip install happy_vllm

Or build it from source:

git clone https://github.com/OSS-Pole-Emploi/happy_vllm.git
cd happy_vllm
pip install -e .
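
You can check that the installation succeeded by inspecting the installed package with pip:

pip show happy_vllm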

Quickstart

Just use the happy-vllm entrypoint (see arguments for a list of all possible arguments):

happy-vllm --model path_to_model --host 127.0.0.1 --port 5000 --model-name my_model

This launches the API, which you can then query directly, for example with

curl 127.0.0.1:5000/v1/info

to get various information about the application, or with

curl 127.0.0.1:5000/v1/completions -H "Content-Type: application/json" -d '{"prompt": "Hey,", "model": "my_model"}'

to generate your first LLM response using happy_vLLM. See endpoints for more details on all the endpoints provided by happy_vLLM.
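
Since /v1/completions follows the OpenAI-style completion schema exposed by vLLM, standard sampling fields should also be accepted (a sketch; the max_tokens and temperature values here are only illustrative):

curl 127.0.0.1:5000/v1/completions -H "Content-Type: application/json" -d '{"prompt": "Hey,", "model": "my_model", "max_tokens": 50, "temperature": 0.7}'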

Deploy with Docker image

A Docker image is available from the GitHub Container Registry:

docker pull ghcr.io/oss-pole-emploi/happy_vllm:latest
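
To serve a model directly from the image, a command along these lines should work (a minimal sketch: it assumes the image wraps the happy-vllm entrypoint and accepts the same arguments as above, and the mounted model path and port mapping are placeholders to adapt to your setup):

docker run --rm -p 5000:5000 -v /path/to/models:/models ghcr.io/oss-pole-emploi/happy_vllm:latest --model /models/my_model --model-name my_model --host 0.0.0.0 --port 5000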

See deploying_with_docker for more details on how to serve happy_vLLM with Docker.

Swagger

You can reach the Swagger UI at the /docs endpoint (by default at 127.0.0.1:5000/docs). It lists all the endpoints and provides examples of how to use them.
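
Since the Swagger UI is served at /docs, the application is presumably built on FastAPI, in which case the raw OpenAPI schema should also be available as JSON:

curl 127.0.0.1:5000/openapi.json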