This guide covers how LLM agents select, configure, and swap models for different purposes—including the main agent, sub-agents, specialized tools, and background processing tasks.
┌─────────────────────────────────────────────────────────────────────────────┐
│ MODEL USAGE IN AGENT SYSTEMS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ MAIN AGENT │ │
│ │ Primary model for user interaction and task execution │ │
│ │ Model: User-configured (e.g., gpt-5-codex, claude-sonnet) │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────┼───────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌───────────────────┐ ┌─────────────────────┐ │
│ │ SUB-AGENTS │ │ BACKGROUND TASKS │ │ SPECIALIZED TOOLS │ │
│ │ │ │ │ │ │ │
│ │ • Review │ │ • Compaction │ │ • Code analysis │ │
│ │ • Analysis │ │ • Summarization │ │ • Search ranking │ │
│ │ • Testing │ │ • Auto-compact │ │ • Validation │ │
│ │ │ │ │ │ │ │
│ │ Model: │ │ Model: │ │ Model: │ │
│ │ Configurable│ │ Same as main OR │ │ Often smaller/ │ │
│ │ per-agent │ │ dedicated compact │ │ specialized │ │
│ └─────────────┘ └───────────────────┘ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Codex groups models into "families" that share characteristics:
```rust
pub struct ModelFamily {
    /// Full model slug (e.g., "gpt-5.1-codex-2025-01-01")
    pub slug: String,
    /// Family name (e.g., "gpt-5.1-codex")
    pub family: String,
    /// Whether the model needs special apply_patch instructions
    pub needs_special_apply_patch_instructions: bool,
    /// Whether the model supports reasoning summaries
    pub supports_reasoning_summaries: bool,
    /// Default reasoning effort (low/medium/high)
    pub default_reasoning_effort: Option<ReasoningEffort>,
    /// Whether the model supports parallel tool calls
    pub supports_parallel_tool_calls: bool,
    /// Type of apply_patch tool to use
    pub apply_patch_tool_type: Option<ApplyPatchToolType>,
    /// Base system prompt for this family
    pub base_instructions: String,
    /// Experimental tools available to this family
    pub experimental_supported_tools: Vec<String>,
    /// Effective context window (percentage of total)
    pub effective_context_window_percent: i64,
    /// Preferred shell tool type
    pub shell_type: ConfigShellToolType,
    /// Default truncation policy for tool outputs
    pub truncation_policy: TruncationPolicy,
}
```
Models are matched to families by prefix:
```rust
pub fn find_family_for_model(slug: &str) -> ModelFamily {
    if slug.starts_with("o3") {
        model_family!(
            slug, "o3",
            supports_reasoning_summaries: true,
            needs_special_apply_patch_instructions: true,
        )
    } else if slug.starts_with("gpt-5.1-codex-max") {
        model_family!(
            slug, slug,
            supports_reasoning_summaries: true,
            base_instructions: GPT_5_1_CODEX_MAX_INSTRUCTIONS.to_string(),
            apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
            shell_type: ConfigShellToolType::ShellCommand,
            supports_parallel_tool_calls: true,
            truncation_policy: TruncationPolicy::Tokens(10_000),
        )
    } else if slug.starts_with("gpt-5-codex") {
        model_family!(
            slug, slug,
            supports_reasoning_summaries: true,
            base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
            apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
            // ...
        )
    } else {
        // Default family for unknown models
        derive_default_model_family(slug)
    }
}
```
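In isolation, this prefix dispatch can be sketched as a plain `if`/`else` chain. The sketch below is a simplified stand-in for `find_family_for_model`: `FamilyProps` and `family_for_slug` are hypothetical names, and only two capability flags are modeled.

```rust
// Hypothetical, simplified sketch of prefix-based family matching.
struct FamilyProps {
    family: &'static str,
    supports_reasoning_summaries: bool,
}

fn family_for_slug(slug: &str) -> FamilyProps {
    // More specific prefixes should generally be checked before shorter ones,
    // so a versioned slug resolves to the most specific family.
    if slug.starts_with("gpt-5.1-codex-max") {
        FamilyProps { family: "gpt-5.1-codex-max", supports_reasoning_summaries: true }
    } else if slug.starts_with("gpt-5-codex") {
        FamilyProps { family: "gpt-5-codex", supports_reasoning_summaries: true }
    } else if slug.starts_with("o3") {
        FamilyProps { family: "o3", supports_reasoning_summaries: true }
    } else {
        // Unknown slugs fall back to conservative defaults.
        FamilyProps { family: "default", supports_reasoning_summaries: false }
    }
}

fn main() {
    assert_eq!(family_for_slug("gpt-5-codex-2025-01-01").family, "gpt-5-codex");
    assert_eq!(family_for_slug("o3-mini").family, "o3");
    assert_eq!(family_for_slug("mystery-model").family, "default");
}
```

Matching by prefix rather than exact slug means dated snapshots (e.g., `gpt-5-codex-2025-01-01`) automatically inherit their family's configuration.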
Different model families get different system prompts:
```rust
// Each family can have its own prompt
const BASE_INSTRUCTIONS: &str = include_str!("../../prompt.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../../gpt_5_1_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../../gpt-5.1-codex-max_prompt.md");
```
Model properties can be updated from a remote API:
```rust
impl ModelFamily {
    pub fn with_remote_overrides(mut self, remote_models: Vec<ModelInfo>) -> Self {
        for model in remote_models {
            if model.slug == self.slug {
                // Override with server-provided values
                self.default_reasoning_effort = Some(model.default_reasoning_level);
                self.shell_type = model.shell_type;
            }
        }
        self
    }
}
```
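The override merge can be exercised in isolation. The sketch below uses simplified stand-ins for `ModelInfo` and `ModelFamily` (here `Family`), with `String` fields in place of the real enums:

```rust
// Hypothetical, minimal sketch of merging server-provided overrides into
// locally derived family defaults. Types are simplified stand-ins.
struct ModelInfo {
    slug: String,
    default_reasoning_level: String,
}

struct Family {
    slug: String,
    default_reasoning_effort: Option<String>,
}

impl Family {
    // Entries whose slug matches overwrite the local default; later
    // matching entries win, mirroring a simple for-loop merge.
    fn with_remote_overrides(mut self, remote: &[ModelInfo]) -> Self {
        for m in remote {
            if m.slug == self.slug {
                self.default_reasoning_effort = Some(m.default_reasoning_level.clone());
            }
        }
        self
    }
}

fn main() {
    let fam = Family { slug: "gpt-5-codex".into(), default_reasoning_effort: None };
    let remote = vec![ModelInfo {
        slug: "gpt-5-codex".into(),
        default_reasoning_level: "high".into(),
    }];
    let fam = fam.with_remote_overrides(&remote);
    assert_eq!(fam.default_reasoning_effort.as_deref(), Some("high"));
}
```

This design lets the server tune model behavior (reasoning effort, shell tool choice) without shipping a client update.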
Claude Code allows each sub-agent to specify its own model:
```markdown
---
name: code-reviewer
description: Reviews code for bugs and security issues
model: sonnet  # Can be: inherit, sonnet, opus, haiku
color: red
tools: [Glob, Grep, Read]
---
You are an expert code reviewer...
```
| Value | Behavior |
|---|---|
| `inherit` | Use the same model as the parent agent |
| `sonnet` | Use Claude Sonnet (balanced) |
| `opus` | Use Claude Opus (most capable) |
| `haiku` | Use Claude Haiku (fast, efficient) |
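Resolving the `model` field against the parent can be sketched as a small lookup. The alias strings follow the table above; `resolve_model` is a hypothetical helper, not Claude Code's actual implementation:

```rust
// Hypothetical resolution of a sub-agent's `model` field against the parent.
fn resolve_model<'a>(declared: &'a str, parent_model: &'a str) -> &'a str {
    match declared {
        // "inherit" (the default) reuses whatever the parent agent runs on.
        "inherit" => parent_model,
        // Aliases name a family rather than a pinned version.
        "sonnet" | "opus" | "haiku" => declared,
        // Anything else could be treated as an explicit model slug.
        other => other,
    }
}

fn main() {
    assert_eq!(resolve_model("inherit", "claude-sonnet"), "claude-sonnet");
    assert_eq!(resolve_model("opus", "claude-sonnet"), "opus");
}
```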
## Model Selection Guidelines
### Use `inherit` (default)
- Most agents should inherit
- Ensures consistency with user's choice
- Reduces complexity
### Use `haiku`
- Simple, fast tasks
- Validation checks
- Quick lookups
- Cost-sensitive operations
### Use `sonnet`
- Balanced performance
- Code analysis
- Documentation generation
- Most development tasks
### Use `opus`
- Complex reasoning
- Architecture decisions
- Security analysis
- Multi-file refactoring
```yaml
# Simple validation - use haiku for speed
---
name: syntax-checker
model: haiku
tools: [Read]
---

# Complex analysis - use opus for capability
---
name: code-reviewer
model: opus
tools: [Glob, Grep, Read]
---

# Standard task - inherit the user's choice
---
name: test-generator
model: inherit
tools: [Read, Write, Bash]
---
```
Codex swaps models for specific tasks like review:
```rust
async fn start_review_conversation(
    session: Arc<SessionTaskContext>,
    ctx: Arc<TurnContext>,
    input: Vec<UserInput>,
    cancellation_token: CancellationToken,
) -> Option<Receiver<Event>> {
    let config = ctx.client.config();

    // Create a modified config for the review sub-agent
    let mut sub_agent_config = config.as_ref().clone();

    // Restrict to a read-only sandbox
    sub_agent_config.sandbox_policy = SandboxPolicy::new_read_only_policy();

    // Clear outer user instructions (the reviewer has its own rubric)
    sub_agent_config.user_instructions = None;

    // Don't load project docs
    sub_agent_config.project_doc_max_bytes = 0;

    // Disable certain features for review
    sub_agent_config.features
        .disable(Feature::WebSearchRequest)
        .disable(Feature::ViewImageTool);

    // Set review-specific instructions
    sub_agent_config.base_instructions = Some(REVIEW_PROMPT.to_string());

    // Launch the sub-conversation with the modified config
    run_codex_conversation_one_shot(
        sub_agent_config,
        session.auth_manager(),
        session.models_manager(),
        input,
        session.clone_session(),
        ctx.clone(),
        cancellation_token,
        None,
    ).await
}
```
When spawning a sub-agent, Codex constructs a new model family:
```rust
pub async fn run_codex_conversation_one_shot(
    config: Config,
    auth_manager: Arc<AuthManager>,
    models_manager: Arc<ModelsManager>,
    input: Vec<UserInput>,
    // ...
) -> Result<CodexIO, CodexErr> {
    // config.model determines which model family is used
    let model_family = models_manager
        .construct_model_family(&config.model, &config)
        .await;

    // The sub-agent gets its own client with this model family
    let client = Client::new(
        provider.clone(),
        config.model.clone(),
        model_family,
        config.reasoning_effort,
        // ...
    );
    // ...
}
```
Codex uses the same model for compaction by default:
```rust
pub async fn run_compact_task(
    sess: Arc<Session>,
    turn_context: Arc<TurnContext>,
    input: Vec<UserInput>,
) -> CodexResult<()> {
    // Uses the same client (and model) as the main session
    let prompt = Prompt {
        input: turn_context.history.get_history_for_prompt(),
        tools: vec![], // No tools for compaction
        parallel_tool_calls: false,
        base_instructions_override: turn_context.base_instructions.clone(),
        output_schema: None,
    };

    // Stream through the same client
    let stream = turn_context.client.clone().stream(&prompt).await?;
    // ...
}
```
For some auth modes, Codex uses a dedicated compaction endpoint:
```rust
pub fn should_use_remote_compact_task(session: &Session) -> bool {
    session.services.auth_manager.auth()
        .is_some_and(|auth| auth.mode == AuthMode::ChatGPT)
        && session.enabled(Feature::RemoteCompaction)
}

async fn run_remote_compact_task_inner_impl(
    sess: &Arc<Session>,
    turn_context: &Arc<TurnContext>,
) -> CodexResult<()> {
    // ... build the compaction prompt from session history ...

    // Uses a dedicated compact endpoint that may use a different model
    let new_history = turn_context.client
        .compact_conversation_history(&prompt)
        .await?;
    sess.replace_history(new_history).await;
    // ...
}
```
Different model families expose different tools:
```rust
pub struct ToolsConfig<'a> {
    model_family: &'a ModelFamily,
    experimental_enabled: bool,
    mcp_tools: Vec<ToolSpec>,
    features: Features,
}

impl<'a> ToolsConfig<'a> {
    pub fn create_tools(&self) -> Vec<ToolSpec> {
        let mut tools = vec![];

        // The shell tool type depends on the model family
        let shell_tool = match self.model_family.shell_type {
            ConfigShellToolType::Default => create_shell_tool(),
            ConfigShellToolType::Local => create_local_shell_tool(),
            ConfigShellToolType::ShellCommand => create_shell_command_tool(),
            ConfigShellToolType::UnifiedExec => create_unified_exec_tool(),
        };
        tools.push(shell_tool);

        // Experimental tools based on the model family
        for tool_name in &self.model_family.experimental_supported_tools {
            if let Some(tool) = create_experimental_tool(tool_name) {
                tools.push(tool);
            }
        }

        // The apply_patch tool type depends on the model
        if let Some(patch_type) = &self.model_family.apply_patch_tool_type {
            tools.push(create_apply_patch_tool(patch_type));
        }

        tools
    }
}
```
Tool output truncation varies by model:
```rust
// Different models have different truncation policies
pub truncation_policy: TruncationPolicy,

// GPT-5 Codex: token-based truncation
model_family!(
    "gpt-5-codex",
    truncation_policy: TruncationPolicy::Tokens(10_000),
)

// GPT-4o: byte-based truncation
model_family!(
    "gpt-4o",
    truncation_policy: TruncationPolicy::Bytes(10_000),
)
```
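Applying such a policy might look like the sketch below. This is a hypothetical illustration, not Codex's actual truncation code: token counting is approximated by a whitespace split, and the byte branch backs up to a UTF-8 character boundary before cutting.

```rust
// Hypothetical sketch of applying a per-family truncation policy.
enum TruncationPolicy {
    Tokens(usize),
    Bytes(usize),
}

fn truncate_output(output: &str, policy: &TruncationPolicy) -> String {
    match policy {
        TruncationPolicy::Tokens(max) => {
            // Crude whitespace "tokenizer", for illustration only.
            let tokens: Vec<&str> = output.split_whitespace().collect();
            if tokens.len() <= *max {
                output.to_string()
            } else {
                format!("{} …[truncated]", tokens[..*max].join(" "))
            }
        }
        TruncationPolicy::Bytes(max) => {
            if output.len() <= *max {
                output.to_string()
            } else {
                // Back up to a char boundary so we never split a UTF-8 sequence.
                let mut end = *max;
                while !output.is_char_boundary(end) {
                    end -= 1;
                }
                format!("{}…[truncated]", &output[..end])
            }
        }
    }
}

fn main() {
    let long = "word ".repeat(100);
    let t = truncate_output(&long, &TruncationPolicy::Tokens(10));
    assert!(t.ends_with("[truncated]"));
    assert_eq!(truncate_output("short", &TruncationPolicy::Bytes(10_000)), "short");
}
```

Token-based limits track what actually consumes the model's context window, while byte-based limits are cheaper to enforce for models without a readily available tokenizer.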
Codex maintains a manager for available models:
```rust
pub struct ModelsManager {
    /// Available model presets (UI display)
    pub available_models: RwLock<Vec<ModelPreset>>,
    /// Remote model info (server-provided)
    pub remote_models: RwLock<Vec<ModelInfo>>,
    /// Auth manager for API access
    pub auth_manager: Arc<AuthManager>,
}

impl ModelsManager {
    /// Refresh available models from the server
    pub async fn refresh_available_models(
        &self,
        provider: &ModelProviderInfo,
    ) -> CoreResult<Vec<ModelInfo>> {
        let client = ModelsClient::new(transport, api_provider, api_auth);
        let response = client.list_models(version, headers).await?;

        // Update cached models
        *self.remote_models.write().await = response.models.clone();
        *self.available_models.write().await = self.build_available_models().await;

        Ok(response.models)
    }

    /// Construct a model family with remote overrides
    pub async fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
        find_family_for_model(model)
            .with_config_overrides(config)
            .with_remote_overrides(self.remote_models.read().await.clone())
    }
}
```
Users can switch models mid-session:
```rust
// The TUI handles the /model command
async fn handle_model_switch(&mut self, new_model: String) {
    // Update the config
    self.config.model = new_model.clone();

    // Construct the new model family
    let model_family = self.models_manager
        .construct_model_family(&new_model, &self.config)
        .await;

    // Create a new client with the updated model
    self.turn_context = self.create_turn_context(model_family);

    // Notify the user
    self.show_notification(format!("Switched to {}", new_model));
}
```
Most sub-agents should inherit the parent's model:
```yaml
# Claude Code agent
model: inherit
```

```rust
// Codex: use the same config
let sub_config = parent_config.clone();
```
Use specific models for specialized tasks:
```yaml
# Complex code review needs the best model
model: opus

# Quick validation can use a fast model
model: haiku
```
Sub-agents can have restricted capabilities:
```rust
// The review sub-agent has limited features
sub_agent_config.sandbox_policy = SandboxPolicy::new_read_only_policy();
sub_agent_config.features
    .disable(Feature::WebSearchRequest)
    .disable(Feature::ViewImageTool);
```
Expose different tools based on model capabilities:
```rust
// Only models that support parallel calls get the full tool set
if model_family.supports_parallel_tool_calls {
    tools.extend(parallel_capable_tools());
}

// Experimental tools only for certain families
for tool in &model_family.experimental_supported_tools {
    tools.push(create_tool(tool));
}
```
```yaml
# Prefer inherit unless you have a specific reason
model: inherit
```
| Task Type | Recommended Model |
|---|---|
| Simple validation | haiku |
| Code analysis | sonnet |
| Architecture decisions | opus |
| User-facing tasks | inherit |
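The table above can be encoded as a small dispatch helper. The task-type strings and `recommended_model` function below are illustrative names, not part of either tool's API:

```rust
// Hypothetical helper encoding the task-to-model table above.
fn recommended_model(task: &str, parent_model: &str) -> String {
    match task {
        "simple-validation" => "haiku".to_string(),
        "code-analysis" => "sonnet".to_string(),
        "architecture" => "opus".to_string(),
        // User-facing work inherits the parent's model.
        _ => parent_model.to_string(),
    }
}

fn main() {
    assert_eq!(recommended_model("architecture", "claude-sonnet"), "opus");
    assert_eq!(recommended_model("user-facing", "claude-sonnet"), "claude-sonnet");
}
```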
```rust
// For background tasks, consider using smaller models
if is_background_task {
    config.model = "fast-model".to_string();
}
```
```rust
// The main conversation should use the user's chosen model;
// only override it for specific sub-tasks.
let main_model = user_config.model;
let review_model = if needs_deep_analysis {
    "opus"
} else {
    main_model.as_str()
};
```
```rust
// Gracefully fall back if the preferred model is unavailable
let model = if available_models.contains(&preferred) {
    preferred
} else {
    default_model
};
```
```rust
// Different models may need different instructions
let instructions = if model_family.needs_special_apply_patch_instructions {
    format!("{}\n{}", base, APPLY_PATCH_INSTRUCTIONS)
} else {
    base.to_string()
};
```
Model selection in LLM agents involves:

- Grouping models into families that share capabilities, prompts, and tool configurations
- Configuring sub-agents with task-appropriate models (or letting them inherit the parent's)
- Routing background work like compaction to the same model or a dedicated endpoint
- Adapting tools, instructions, and truncation policies to each family
- Supporting runtime switching and graceful fallback when a model is unavailable

The key insight is that different parts of an agent system benefit from different models: the main agent uses the user's choice, sub-agents use task-appropriate models, and background tasks use efficient models.