Distil labs inference playground
You can use the distil labs inference playground to test your trained model. The playground provides a hosted deployment endpoint that supports OpenAI-compatible inference.
The inference playground deployments are not intended for production use. Once you’re ready for production, contact us at contact@distillabs.ai and we’ll set you up.
Activating a deployment
Create a deployment for your trained model using the API:
The response includes all the information you need to query your model:
The deployment_status field indicates the current state:
building- Deployment is being provisionedactive- Ready to accept requestsinactive- Deployment has been deactivatedcredits_exhausted- No credits remaining
The client_script field contains example Python code you can use to query your model. It is important that use the exact prompt format shown in this script when querying your model (see Querying your model below).
After your deployment is set up, you can also retrieve information about it (the format will be the same as shown above).
Querying your model
The easiest way to query your model is to use the client script included in the deployment response. This script has the correct prompt format and API key already embedded.
First, extract the client script from your deployment and save it to a file (you will need jq installed):
Then run the script with your question and context. You will need the openai Python package available locally.
It’s important to use the correct system prompt and message formatting when querying your SLM. SLMs are specialized and expect exactly the same format as seen during training. Using a different system prompt or formatting will result in poor performance.
Deactivating a deployment
When you’re done testing, deactivate your deployment to conserve credits:
Using the web dashboard
You can also manage deployments through the web interface:
- Open your model from the distil labs dashboard
- Click “Deploy model” in the left navigation bar
- On the “Deploy on distil labs” tab, click the “Deploy Model” button

After clicking, the deployment process might take a few minutes. Once ready, you will see:
- the deployment endpoint URL,
- the API key,
- an example Python script to make requests against the endpoint.

Credits
Inference playground deployments require credits. When you run out of credits, you won’t be able to create new deployments and your existing deployments will be deactivated. All users get $30 of free starting credits - reach out to us at contact@distillabs.ai when you need more.
