Language model settings

The large language model (LLM) is the brain of your agent. It determines how well your agent understands context, follows instructions, and generates responses.

Choosing a provider and model

Navigate to the LLM tab in your agent’s configuration to select:

Providers

Provider  | Models                      | Best for
OpenAI    | GPT-4o, GPT-4o-mini         | General purpose, strong instruction following
Anthropic | Claude Sonnet, Claude Haiku | Nuanced conversations, safety-conscious responses
Google    | Gemini models               | Multi-modal capabilities

Selecting a model

Choose a model based on your needs:
  • Higher capability models (GPT-4o, Claude Sonnet) - Better at complex reasoning, nuanced conversations, and following detailed instructions. Slightly higher latency.
  • Faster models (GPT-4o-mini, Claude Haiku) - Lower latency and cost, good for simpler use cases.

Model parameters

Fine-tune your model’s behavior with these parameters:

Temperature

Controls randomness in responses.
  • 0.0 - Near-deterministic; the model almost always picks the most likely response
  • 0.5 - Balanced (recommended for most use cases)
  • 1.0 - More creative and varied responses
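
The effect of temperature can be sketched with a plain softmax over token scores. This is a toy illustration of the sampling math, not any provider's actual implementation:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into sampling probabilities.

    Lower temperature sharpens the distribution toward the top token;
    higher temperature flattens it, allowing more varied choices.
    """
    if temperature == 0:
        # Greedy decoding: all probability mass on the most likely token.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

print(softmax_with_temperature(logits, 0.0))  # all mass on the first token
print(softmax_with_temperature(logits, 0.5))  # sharply peaked on the top token
print(softmax_with_temperature(logits, 1.0))  # flatter, more varied sampling
```

At 0.5 the top token dominates; at 1.0 the alternatives get a meaningful share, which is where the "more creative and varied" behavior comes from.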

Max tokens

The maximum length of the model’s response. Higher values allow longer responses but may increase latency.
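
Conceptually, the cap works like a loop that stops at whichever comes first: the model's natural stopping point or the token budget. A toy loop (with a hypothetical `step_fn` standing in for the model, not a real decoder):

```python
def generate(step_fn, max_tokens, stop_token="<eos>"):
    """Toy generation loop: emit tokens until the model signals a stop
    or the max_tokens cap is hit, whichever comes first."""
    out = []
    for _ in range(max_tokens):
        tok = step_fn(out)
        if tok == stop_token:
            break
        out.append(tok)
    return out

# A fake "model" that wants to say five words, then stop on its own.
words = ["hello", "there", "how", "are", "you", "<eos>"]
step = lambda so_far: words[len(so_far)]

print(generate(step, max_tokens=10))  # full response: all five words
print(generate(step, max_tokens=3))   # cut off mid-thought at 3 tokens
```

This is why too low a cap can truncate responses mid-sentence, while a generous cap only matters when the model actually has that much to say.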

Top P

Controls diversity through nucleus sampling: the model considers only the smallest set of tokens whose cumulative probability reaches Top P. Lower values make responses more focused; higher values make them more diverse. Default is 1.0.
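
A minimal sketch of nucleus (Top P) filtering, using made-up token probabilities rather than real model output:

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize. probs: list of (token, prob)."""
    ranked = sorted(probs, key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    total = sum(p for _, p in kept)
    return [(tok, p / total) for tok, p in kept]

# Toy candidate tokens with their probabilities.
probs = [("the", 0.5), ("a", 0.3), ("one", 0.15), ("this", 0.05)]

print(top_p_filter(probs, 0.8))  # keeps only "the" and "a"
print(top_p_filter(probs, 1.0))  # keeps everything (the default)
```

At 0.8, the unlikely tail ("one", "this") is cut before sampling, which is what makes lower values more focused.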

Frequency penalty

Reduces repetition of specific words. Values from 0.0 (no penalty) to 2.0 (strong penalty). Default is 0.0.

Presence penalty

Encourages the model to talk about new topics. Values from 0.0 (no penalty) to 2.0 (strong penalty). Default is 0.0.
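
Both penalties can be understood as subtractions applied to a token's score before sampling, in the additive form OpenAI describes for its API: the frequency penalty grows with each repetition of a token, while the presence penalty is a flat, one-time nudge once a token has appeared at all. A sketch under that assumption:

```python
def penalized_logit(logit, count, frequency_penalty=0.0, presence_penalty=0.0):
    """Adjust a token's score based on how often it has already appeared
    in the response so far (count)."""
    penalty = frequency_penalty * count
    if count > 0:
        penalty += presence_penalty  # applied once, regardless of repeats
    return logit - penalty

# A token already used 3 times is pushed down hard by frequency penalty...
print(penalized_logit(2.0, count=3, frequency_penalty=0.5))  # 2.0 - 1.5 = 0.5
# ...while presence penalty subtracts the same amount whether it appeared
# once or many times, steering the model toward tokens it hasn't used yet.
print(penalized_logit(2.0, count=3, presence_penalty=0.5))   # 2.0 - 0.5 = 1.5
```

This is why the frequency penalty targets word-level repetition and the presence penalty encourages new topics: one scales with repeats, the other only cares whether a token has appeared.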

For most voice agents, start with the default parameters. Only adjust if you notice specific issues, such as repetitive responses or responses that are too long.