Multi-turn tool calling data preparation
Before training can start, you need to upload all the necessary ingredients for the training job.
For this example, we will focus on multi-turn tool calling on a file system navigation dataset, where, given a conversation history of user requests and assistant tool calls, the model must respond with the appropriate next tool call. Unlike single-turn tool calling, where each input is independent, multi-turn tool calling maintains context across the conversation. Each assistant turn produces exactly one function call. To train a model for this purpose, we will need the following:
Model compatibility
Tool calling tasks have specific model compatibility requirements:
Student models: Only GPT-OSS, Qwen3, and Llama 3-family models are supported for the tool-calling-closed-book and multi-turn-tool-calling-closed-book tasks.
Teacher models for multi-turn: Multi-turn tool calling (multi-turn-tool-calling-closed-book) requires one of the following teacher models:
- Qwen3-235B-A22B-Instruct-2507
- Llama-3.1-405B-Instruct
- openai.gpt-oss-120b
Job description
Describes the work you expect the model to perform; you can think of it as an LLM prompt that would help the model solve your task. In practice, for a multi-turn tool calling problem we expect two components: task_description, which describes the main task, and tools, a list of JSON Schemas describing the available tools and their parameters (the tool schemas should follow the OpenAI function-calling format).
The expected format is a JSON file:
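A minimal sketch of such a file for the file system task, assuming hypothetical ls and cat tools; only the task_description and tools keys come from the format above, and everything else is illustrative:

```json
{
  "task_description": "You are a file system assistant. Given the conversation so far, respond with the single tool call that fulfills the user's latest request.",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "ls",
        "description": "List the files in a directory.",
        "parameters": {
          "type": "object",
          "properties": {
            "path": {"type": "string", "description": "Directory to list."}
          },
          "required": ["path"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "cat",
        "description": "Print the contents of a file.",
        "parameters": {
          "type": "object",
          "properties": {
            "file_name": {"type": "string", "description": "File to read."}
          },
          "required": ["file_name"]
        }
      }
    }
  ]
}
```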
Train/test data
We need a training dataset to fine-tune the model for your specific task and a test dataset to evaluate its performance after fine-tuning. The bigger and more diverse the datasets, the better, but for the training stage we need only a few dozen examples (we'll generate many more based on the examples you provide).
The expected format is CSV or JSONL with the following columns:

- question: the conversation history, encoded as a JSON array of turns (see Conversation format below)
- answer: the expected next tool call, encoded as a JSON string
Conversation format
The question field contains a JSON array of conversation turns. Each turn is an object with:
- role: Either "user" or "assistant"
- content: The text content of the message (empty string for assistant turns with tool calls)
- tool_calls: (assistant only) An array containing exactly one tool call made by the assistant
This format allows the model to understand the full context of the conversation before generating the next tool call.
JSONL format
The question field contains a stringified JSON array representing the conversation. The answer field should contain a string representing a JSON object, not an actual JSON object (note the escaped double quotes).
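For instance, a single JSONL record might look like this (a sketch with the conversation shortened to one turn for readability; both fields are strings containing escaped JSON):

```json
{"question": "[{\"role\": \"user\", \"content\": \"Open config.txt for me.\"}]", "answer": "{\"name\": \"cat\", \"parameters\": {\"file_name\": \"config.txt\"}}"}
```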
Understanding the conversation structure

Here’s an expanded view of what a single conversation looks like:
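This is a sketch for the file system task; the ls tool is illustrative, and we assume each tool_calls entry mirrors the name/parameters shape used by the answer field:

```json
[
  {
    "role": "user",
    "content": "What's in the current directory?"
  },
  {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {"name": "ls", "parameters": {"path": "."}}
    ]
  },
  {
    "role": "user",
    "content": "Open config.txt for me."
  }
]
```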
The model receives this conversation history and should output: {"name": "cat", "parameters": {"file_name": "config.txt"}}
Key differences from single-turn tool calling
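- The question field holds an entire conversation history (a JSON array of turns) rather than a single standalone question.
- Earlier turns provide context: the model must read prior user requests and tool calls to choose the next call, whereas single-turn inputs are independent of each other.
- Each assistant turn still produces exactly one tool call, so the answer field keeps the same single-call JSON format.
- Only the teacher models listed above are supported for the multi-turn task.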
Configuration file
The configuration file specifies the task type and training parameters.
The expected format is YAML:
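A minimal sketch, assuming the multi-turn-tool-calling-closed-book task type named above; the other keys and file names are illustrative, not the definitive schema:

```yaml
# Sketch only: the task type comes from this guide; other keys and paths are assumptions.
task: multi-turn-tool-calling-closed-book
student_model: Qwen3-8B                        # assumed: any supported GPT-OSS/Qwen3/Llama 3 student
teacher_model: Qwen3-235B-A22B-Instruct-2507   # must be one of the supported multi-turn teachers
job_description_file: job_description.json     # the JSON file described above
train_file: train.jsonl
test_file: test.jsonl
```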
For additional configuration options, see Configuration file →
