# Language model settings
The large language model (LLM) is the brain of your agent. It determines how well your agent understands context, follows instructions, and generates responses.
## Choosing a provider and model
Navigate to the LLM tab in your agent’s configuration to select:
### Providers
| Provider | Models | Best for |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini | General purpose, strong instruction following |
| Anthropic | Claude Sonnet, Claude Haiku | Nuanced conversations, safety-conscious responses |
| Google | Gemini models | Multi-modal capabilities |
### Selecting a model
Choose a model based on your needs:
- Higher-capability models (GPT-4o, Claude Sonnet) - Better at complex reasoning, nuanced conversations, and following detailed instructions, at slightly higher latency.
- Faster models (GPT-4o-mini, Claude Haiku) - Lower latency and cost, good for simpler use cases.
## Model parameters
Fine-tune your model’s behavior with these parameters:
### Temperature
Controls randomness in responses.
- 0.0 - Deterministic: the model always picks the most likely next token. Best for repeatable, factual answers
- 0.5 - Balanced (recommended for most use cases)
- 1.0 - More creative and varied responses
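Under the hood, temperature rescales the model's token probabilities before sampling. A minimal Python sketch of the mechanism, using illustrative logit values not tied to any particular provider:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities, scaled by temperature."""
    if temperature == 0:
        # Temperature 0 degenerates to greedy decoding:
        # all probability mass goes to the single most likely token.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    # Dividing logits by the temperature sharpens (T < 1) or
    # flattens (T > 1) the distribution before the softmax.
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # toy scores for three candidate tokens
greedy = softmax_with_temperature(logits, 0.0)  # all mass on the top token
low = softmax_with_temperature(logits, 0.5)     # sharply peaked
high = softmax_with_temperature(logits, 1.0)    # flatter, more varied
```

At lower temperatures the top token dominates, which is why low settings feel consistent and high settings feel varied.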
### Max tokens
The maximum length, in tokens, of the model's response. Higher values allow longer responses but can increase latency; a response that reaches the limit is truncated.
### Top P
Controls diversity of word choices via nucleus sampling: the model samples only from the smallest set of tokens whose combined probability reaches the Top P value. Lower values make responses more focused; higher values make them more diverse. Default is 1.0.
### Frequency penalty
Reduces repetition by penalizing a token in proportion to how many times it has already appeared. Values range from 0.0 (no penalty) to 2.0 (strong penalty). Default is 0.0.
### Presence penalty
Encourages the model to introduce new topics by applying a flat penalty to any token that has already appeared at least once. Values range from 0.0 (no penalty) to 2.0 (strong penalty). Default is 0.0.
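Both penalties are commonly applied as adjustments to token logits: the frequency penalty grows with each repeat of a token, while the presence penalty is a one-time cost for any token that has appeared at all. A sketch of that common formulation (the exact formula can vary by provider):

```python
def apply_penalties(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Adjust logits for tokens that have already appeared in the output.

    counts maps token index -> number of prior appearances.
    """
    adjusted = []
    for i, logit in enumerate(logits):
        count = counts.get(i, 0)
        # Frequency penalty scales with the repeat count; presence penalty
        # is a flat cost once a token has appeared at all.
        penalty = frequency_penalty * count + presence_penalty * (1 if count else 0)
        adjusted.append(logit - penalty)
    return adjusted

# Token 0 has already appeared three times; token 1 is new.
adjusted = apply_penalties([2.0, 2.0], {0: 3},
                           frequency_penalty=0.5, presence_penalty=0.5)
```

The repeated token's score drops while the unseen token's is untouched, nudging the model toward fresh words.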
For most voice agents, start with the default parameters. Only adjust if you notice specific issues like repetitive responses or responses that are too long.
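The defaults above can be collected into a single settings object for reference. The field names here are hypothetical, and the `max_tokens` value is only illustrative; match both to the fields in your platform's LLM tab:

```python
# Illustrative starting configuration; field names are hypothetical.
llm_settings = {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "temperature": 0.5,        # balanced randomness, recommended default
    "max_tokens": 256,         # illustrative cap; tune for your use case
    "top_p": 1.0,              # default: no nucleus truncation
    "frequency_penalty": 0.0,  # default: no repetition penalty
    "presence_penalty": 0.0,   # default: no new-topic bias
}
```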