This guide explains how LLM agents gather, structure, manage, and optimize context for effective model interactions. Context is the lifeblood of an agent—the model can only reason about what it sees.
┌─────────────────────────────────────────────────────────────────────────────┐
│ CONTEXT PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 1. STATIC CONTEXT │ │
│ │ (loaded once at session start) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ System Prompt│ │ AGENTS.md │ │ Skills/Capabilities │ │ │
│ │ │ (behavior) │ │ (project) │ │ (available helpers) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 2. ENVIRONMENT CONTEXT │ │
│ │ (injected per-turn) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ CWD, Shell │ │ Permissions │ │ Sandbox Mode │ │ │
│ │ │ │ │ (approval) │ │ (read-only, write, etc.) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 3. CONVERSATION HISTORY │ │
│ │ (accumulated over session) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ User Messages│ │ Assistant │ │ Tool Calls & Outputs │ │ │
│ │ │ │ │ Responses │ │ │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 4. CONTEXT MANAGEMENT │ │
│ │ (keep within context window) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ Truncation │ │ Compaction │ │ Token Tracking │ │ │
│ │ │ │ │ (summarize) │ │ │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
The system prompt establishes the agent's identity and capabilities. It's loaded once at session start:
// From Codex's model_family.rs
pub struct ModelFamily {
/// Base system prompt for this model family
pub base_instructions: Arc<str>,
/// Whether this model needs special tool instructions
pub needs_special_apply_patch_instructions: bool,
/// Default truncation policy for tool outputs
pub truncation_policy: TruncationPolicy,
}
// The prompt is loaded from markdown files
pub const BASE_PROMPT: &str = include_str!("../prompt.md");
The system prompt typically includes the agent's identity, behavioral guidelines, and instructions for how to use its tools.
Codex discovers project-specific instructions by walking up the directory tree:
/// Discovery algorithm for AGENTS.md files
pub fn discover_project_doc_paths(config: &Config) -> std::io::Result<Vec<PathBuf>> {
let mut dir = config.cwd.clone();
// Build chain from cwd upwards and detect git root
let mut chain: Vec<PathBuf> = vec![dir.clone()];
let mut git_root: Option<PathBuf> = None;
while let Some(parent) = dir.parent() {
// Check for .git marker
let git_marker = dir.join(".git");
if git_marker.exists() {
git_root = Some(dir.clone());
break;
}
chain.push(parent.to_path_buf());
dir = parent.to_path_buf();
}
// Search from git root down to cwd
let search_dirs = if let Some(root) = git_root {
chain.iter()
.rev()
.skip_while(|p| *p != &root)
.cloned()
.collect()
} else {
vec![config.cwd.clone()]
};
// Look for AGENTS.md or fallbacks in each directory
let mut found = Vec::new();
let candidates = ["AGENTS.override.md", "AGENTS.md"];
for dir in search_dirs {
for name in &candidates {
let path = dir.join(name);
if path.is_file() {
found.push(path);
break; // Only one per directory
}
}
}
Ok(found)
}
Discovery rules:
- Codex walks up from the cwd until it finds the repository root (a .git marker)
- AGENTS.md files are loaded from the root down to the cwd
- AGENTS.override.md takes precedence over AGENTS.md in each directory

Codex supports "skills" - reusable instruction bundles discovered at startup:
pub fn render_skills_section(skills: &[SkillMetadata]) -> Option<String> {
if skills.is_empty() {
return None;
}
let mut lines = Vec::new();
lines.push("## Skills".to_string());
lines.push("These skills are discovered at startup from ~/.codex/skills...".to_string());
for skill in skills {
lines.push(format!(
"- {}: {} (file: {})",
skill.name, skill.description, skill.path.display()
));
}
// Add usage rules
lines.push(SKILL_USAGE_RULES.to_string());
Some(lines.join("\n"))
}
Skills are referenced but not inlined - the model is told where to find them and loads details on-demand to keep context lean.
All static context is merged together:
pub async fn get_user_instructions(config: &Config) -> Option<String> {
// Load skills (names + descriptions, not full content)
let skills_section = if config.features.enabled(Feature::Skills) {
let skills = load_skills(config);
render_skills_section(&skills.skills)
} else {
None
};
// Load project docs (AGENTS.md files); None when none are found.
// Note: using `?` here would abort the whole function and drop skills
// and base instructions whenever no project docs exist.
let project_docs = read_project_docs(config).await;
// Merge project docs with skills
let combined = merge_project_docs_with_skills(project_docs, skills_section);
// Combine with any base user instructions
let mut parts = Vec::new();
if let Some(instructions) = config.user_instructions.clone() {
parts.push(instructions);
}
if let Some(project_doc) = combined {
if !parts.is_empty() {
parts.push("\n\n--- project-doc ---\n\n".to_string());
}
parts.push(project_doc);
}
if parts.is_empty() { None } else { Some(parts.concat()) }
}
Each turn includes environment information so the model knows its current state:
pub struct EnvironmentContext {
pub cwd: Option<PathBuf>, // Current working directory
pub approval_policy: Option<AskForApproval>, // Permission level
pub sandbox_mode: Option<SandboxMode>, // Sandbox restrictions
pub network_access: Option<NetworkAccess>, // Network availability
pub writable_roots: Option<Vec<PathBuf>>, // Allowed write paths
pub shell: Shell, // Shell type (bash, zsh, etc.)
}
impl EnvironmentContext {
/// Serialize to XML for model consumption
pub fn serialize_to_xml(self) -> String {
let mut lines = vec!["<environment_context>".to_string()];
if let Some(cwd) = self.cwd {
lines.push(format!(" <cwd>{}</cwd>", cwd.display()));
}
if let Some(policy) = self.approval_policy {
lines.push(format!(" <approval_policy>{}</approval_policy>", policy));
}
if let Some(mode) = self.sandbox_mode {
lines.push(format!(" <sandbox_mode>{}</sandbox_mode>", mode));
}
// ... more fields
lines.push("</environment_context>".to_string());
lines.join("\n")
}
}
Example environment context sent to model:
<environment_context>
<cwd>/home/user/my-project</cwd>
<approval_policy>on-request</approval_policy>
<sandbox_mode>workspace-write</sandbox_mode>
<network_access>restricted</network_access>
<writable_roots>
<root>/home/user/my-project</root>
<root>/tmp</root>
</writable_roots>
<shell>bash</shell>
</environment_context>
Project-specific instructions are formatted specially:
impl From<UserInstructions> for ResponseItem {
fn from(ui: UserInstructions) -> Self {
ResponseItem::Message {
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: format!(
"# AGENTS.md instructions for {directory}\n\n<INSTRUCTIONS>\n{contents}\n</INSTRUCTIONS>",
directory = ui.directory,
contents = ui.text
),
}],
}
}
}
The context manager tracks the full conversation:
pub struct ContextManager {
/// Items ordered from oldest to newest
items: Vec<ResponseItem>,
/// Token usage information
token_info: Option<TokenUsageInfo>,
}
impl ContextManager {
/// Record new items into history
pub fn record_items<I>(&mut self, items: I, policy: TruncationPolicy)
where
I: IntoIterator<Item = ResponseItem>,
{
for item in items {
// Skip non-API items
if !is_api_message(&item) {
continue;
}
// Process (potentially truncate) the item
let processed = self.process_item(&item, policy);
self.items.push(processed);
}
}
/// Get history prepared for sending to model
pub fn get_history_for_prompt(&mut self) -> Vec<ResponseItem> {
// Normalize: ensure all tool calls have outputs
self.normalize_history();
// Remove internal items (like GhostSnapshots)
let mut history = self.contents();
Self::remove_ghost_snapshots(&mut history);
history
}
}
fn is_api_message(message: &ResponseItem) -> bool {
match message {
// User and assistant messages
ResponseItem::Message { role, .. } => role != "system",
// Tool interactions
ResponseItem::FunctionCall { .. } => true,
ResponseItem::FunctionCallOutput { .. } => true,
ResponseItem::CustomToolCall { .. } => true,
ResponseItem::CustomToolCallOutput { .. } => true,
ResponseItem::LocalShellCall { .. } => true,
// Model reasoning
ResponseItem::Reasoning { .. } => true,
// Web search results
ResponseItem::WebSearchCall { .. } => true,
// Compaction summaries
ResponseItem::CompactionSummary { .. } => true,
// Internal items excluded
ResponseItem::GhostSnapshot { .. } => false,
ResponseItem::Other => false,
}
}
The history must maintain invariants - every tool call needs a corresponding output:
fn normalize_history(&mut self) {
// Ensure all function/tool calls have corresponding outputs
normalize::ensure_call_outputs_present(&mut self.items);
// Remove orphaned outputs without corresponding calls
normalize::remove_orphan_outputs(&mut self.items);
}
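The steps above can be sketched against a simplified, hypothetical item type (the real `ResponseItem` is richer, and the actual `ensure_call_outputs_present` lives in Codex's `normalize` module):

```rust
// Simplified stand-in for ResponseItem, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Call { call_id: String },
    Output { call_id: String },
}

/// Ensure every tool call is followed by some output for the same call_id,
/// inserting a synthetic output where one is missing.
fn ensure_call_outputs_present(items: &mut Vec<Item>) {
    let mut i = 0;
    while i < items.len() {
        // Compute whether items[i] is an unanswered call before mutating
        let pending = match &items[i] {
            Item::Call { call_id } => {
                let answered = items[i + 1..].iter().any(|item| {
                    matches!(item, Item::Output { call_id: c } if c == call_id)
                });
                if answered { None } else { Some(call_id.clone()) }
            }
            _ => None,
        };
        if let Some(call_id) = pending {
            // Insert a synthetic output right after the unanswered call
            items.insert(i + 1, Item::Output { call_id });
        }
        i += 1;
    }
}
```

A synthetic output for an unanswered call keeps the API contract intact even when a turn was interrupted mid-tool-call.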
Tool outputs can be large. Truncation prevents context overflow:
#[derive(Debug, Clone, Copy)]
pub enum TruncationPolicy {
Bytes(usize), // Truncate by byte count
Tokens(usize), // Truncate by token estimate
}
impl TruncationPolicy {
/// Approximate token budget
pub fn token_budget(&self) -> usize {
match self {
TruncationPolicy::Bytes(bytes) => bytes / 4, // ~4 bytes per token
TruncationPolicy::Tokens(tokens) => *tokens,
}
}
/// Approximate byte budget (the inverse heuristic)
pub fn byte_budget(&self) -> usize {
match self {
TruncationPolicy::Bytes(bytes) => *bytes,
TruncationPolicy::Tokens(tokens) => tokens * 4, // ~4 bytes per token
}
}
}
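The `approx_token_count` helper used throughout these excerpts is not shown; a minimal sketch consistent with the ~4 bytes/token heuristic would be:

```rust
/// Rough token estimate: ~4 bytes per token, rounded up.
/// A sketch only; the real approx_token_count in Codex is not shown here.
fn approx_token_count(s: &str) -> usize {
    s.len().div_ceil(4)
}
```

Byte-based estimates are deliberately cheap: they avoid running a tokenizer on every item while staying close enough to real counts to budget truncation and compaction.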
Codex preserves both the beginning and end of output:
fn truncate_with_byte_estimate(s: &str, policy: TruncationPolicy) -> String {
let max_bytes = policy.byte_budget();
if s.len() <= max_bytes {
return s.to_string();
}
// Split budget: half for beginning, half for end
let (left_budget, right_budget) = (max_bytes / 2, max_bytes - max_bytes / 2);
// Split string on UTF-8 boundaries
let (removed_chars, left, right) = split_string(s, left_budget, right_budget);
// Create truncation marker
let marker = format!("…{removed_chars} chars truncated…");
// Assemble: beginning + marker + end
format!("{left}{marker}{right}")
}
Example truncated output:
Total output lines: 5000
drwxr-xr-x 5 user user 160 Jan 1 12:00 .
drwxr-xr-x 3 user user 96 Jan 1 11:00 ..
-rw-r--r-- 1 user user 234 Jan 1 12:00 package.json
…4850 chars truncated…
-rw-r--r-- 1 user user 1234 Jan 1 12:00 README.md
-rw-r--r-- 1 user user 567 Jan 1 12:00 tsconfig.json
Tool outputs are truncated when recorded:
fn process_item(&self, item: &ResponseItem, policy: TruncationPolicy) -> ResponseItem {
match item {
ResponseItem::FunctionCallOutput { call_id, output } => {
// Truncate the content
let truncated = truncate_text(&output.content, policy);
// Also truncate any structured content items
let truncated_items = output.content_items.as_ref().map(|items| {
truncate_function_output_items_with_policy(items, policy)
});
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: truncated,
content_items: truncated_items,
success: output.success,
},
}
}
// Other items pass through unchanged
_ => item.clone(),
}
}
When the context window fills up, Codex can compact the conversation into a summary:
pub async fn run_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
input: Vec<UserInput>,
) {
// Get current history
let mut history = sess.clone_history().await;
// Iteratively remove oldest items if context is too large
let mut truncated_count = 0;
loop {
let turn_input = history.get_history_for_prompt();
// Build the summarization request from the remaining history (simplified)
let prompt = Prompt { input: turn_input, ..Default::default() };
match drain_to_completed(&sess, &turn_context, &prompt).await {
Ok(()) => break,
Err(CodexErr::ContextWindowExceeded) => {
// Remove oldest item and retry
history.remove_first_item();
truncated_count += 1;
}
Err(e) => {
// Handle other errors...
}
}
}
// Extract user messages for preservation
let history_snapshot = sess.clone_history().await.get_history();
let user_messages = collect_user_messages(&history_snapshot);
// Get summary from model
let summary_text = get_last_assistant_message_from_turn(&history_snapshot)
.unwrap_or_default();
// Build compacted history
let initial_context = sess.build_initial_context(&turn_context);
let new_history = build_compacted_history(
initial_context,
&user_messages,
&summary_text
);
// Replace history with compacted version
sess.replace_history(new_history).await;
}
fn build_compacted_history(
mut history: Vec<ResponseItem>,
user_messages: &[String],
summary_text: &str,
) -> Vec<ResponseItem> {
// Budget for preserved user messages
let max_tokens = 20_000;
let mut remaining = max_tokens;
let mut selected_messages = Vec::new();
// Keep recent user messages (working backwards)
for message in user_messages.iter().rev() {
if remaining == 0 { break; }
let tokens = approx_token_count(message);
if tokens <= remaining {
selected_messages.push(message.clone());
remaining -= tokens;
} else {
// Truncate and include partial
let truncated = truncate_text(message, TruncationPolicy::Tokens(remaining));
selected_messages.push(truncated);
break;
}
}
selected_messages.reverse();
// Add preserved user messages to history
for message in &selected_messages {
history.push(ResponseItem::Message {
role: "user".to_string(),
content: vec![ContentItem::InputText { text: message.clone() }],
});
}
// Add summary as the final message
history.push(ResponseItem::Message {
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: format!("{SUMMARY_PREFIX}\n{summary_text}")
}],
});
history
}
Codex tracks token usage to anticipate when compaction is needed:
impl ContextManager {
/// Estimate token count using byte-based heuristics
pub fn estimate_token_count(&self, turn_context: &TurnContext) -> Option<i64> {
let model_family = turn_context.client.get_model_family();
// Base tokens from system prompt
let base_tokens = approx_token_count(&model_family.base_instructions) as i64;
// Sum tokens from all items
let items_tokens = self.items.iter().fold(0i64, |acc, item| {
acc + match item {
// Skip internal items
ResponseItem::GhostSnapshot { .. } => 0,
// Estimate reasoning (encrypted content)
ResponseItem::Reasoning { encrypted_content: Some(content), .. } => {
estimate_reasoning_length(content.len())
}
// Serialize and estimate other items
item => {
let serialized = serde_json::to_string(item).unwrap_or_default();
approx_token_count(&serialized) as i64
}
}
});
Some(base_tokens.saturating_add(items_tokens))
}
/// Update with actual token usage from API response
pub fn update_token_info(&mut self, usage: &TokenUsage, context_window: Option<i64>) {
self.token_info = TokenUsageInfo::new_or_append(
&self.token_info,
&Some(usage.clone()),
context_window,
);
}
}
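A caller can combine the estimate with the known context window to decide when compaction is due. This is a sketch under an assumed 80% threshold; the real trigger logic is not shown:

```rust
/// Decide whether to compact, given an estimated token count and the model's
/// context window. The 80% threshold is an assumption for illustration;
/// the point is to leave headroom for the next turn's output.
fn should_compact(estimated_tokens: i64, context_window: i64) -> bool {
    // estimated / window >= 0.8, kept in integer arithmetic
    estimated_tokens * 5 >= context_window * 4
}
```

Triggering on an estimate (rather than waiting for a hard `ContextWindowExceeded` error) lets the agent compact proactively instead of failing a turn first.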
Beyond initial context, agents gather information dynamically through tool calls:
// Model calls: read_file(path: "src/main.rs", offset: 0, limit: 100)
// Returns file content as tool output, which becomes part of context
async fn handle_read_file(params: ReadFileParams) -> Result<ToolOutput> {
let content = tokio::fs::read_to_string(&params.path).await?;
// Apply line limits, clamped to the file's length
let lines: Vec<&str> = content.lines().collect();
let start = params.offset.unwrap_or(0).min(lines.len());
let end = (start + params.limit.unwrap_or(lines.len())).min(lines.len());
let selected: String = lines[start..end]
.iter()
.enumerate()
.map(|(i, line)| format!("{:6}|{}", start + i + 1, line))
.collect::<Vec<_>>()
.join("\n");
Ok(ToolOutput::Function {
content: selected,
success: Some(true),
..Default::default()
})
}
// Model calls: shell(command: ["ls", "-la"])
// Returns command output, which becomes part of context
async fn handle_shell(params: ShellParams) -> Result<ToolOutput> {
let started = std::time::Instant::now();
let output = Command::new(&params.command[0])
.args(&params.command[1..])
.current_dir(&params.workdir.unwrap_or_default())
.output()
.await?;
let elapsed = started.elapsed();
let formatted = format_exec_output_for_model(&ExecToolCallOutput {
stdout: String::from_utf8_lossy(&output.stdout).into(),
stderr: String::from_utf8_lossy(&output.stderr).into(),
exit_code: output.status.code().unwrap_or(-1),
duration: elapsed,
});
Ok(ToolOutput::Function {
content: formatted,
success: Some(output.status.success()),
..Default::default()
})
}
// Model calls: grep_files(pattern: "TODO", path: "src/")
// Returns matches, which become part of context
async fn handle_grep(params: GrepParams) -> Result<ToolOutput> {
let matches = ripgrep_search(&params.pattern, &params.path)?;
// Format matches with file:line:content
let output = matches.iter()
.map(|m| format!("{}:{}:{}", m.file, m.line, m.content))
.collect::<Vec<_>>()
.join("\n");
Ok(ToolOutput::Function {
content: output,
success: Some(true),
..Default::default()
})
}
Claude Code uses a different approach with plugins that provide contextual capabilities:
Plugins in Claude Code can define sub-agents with their own system prompts, restricted tool access, and model selection:
---
name: code-reviewer
description: Reviews code for bugs and security issues
tools: [Glob, Grep, Read] # Limited tool access
model: sonnet
---
You are an expert code reviewer...
[This becomes the sub-agent's system prompt - its context]
Claude Code supports @file syntax in commands to inject file content:
---
description: Analyze the specified file
---
Review the following file for issues:
@$ARGUMENTS
Focus on:
1. Code quality
2. Security concerns
When a user runs /review src/app.ts, the @src/app.ts expands to include the file's content in the context.
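A toy version of that expansion, written in Rust for illustration (a hypothetical helper: it splits on whitespace, so paths containing spaces are out of scope, and Claude Code's real parser is more capable):

```rust
use std::fs;

/// Replace each whitespace-delimited `@path` token with the file's contents.
/// Hypothetical sketch; real @file expansion also handles globs and quoting.
fn expand_file_refs(prompt: &str) -> String {
    prompt
        .split_whitespace()
        .map(|tok| match tok.strip_prefix('@') {
            // Inline the file if it can be read; otherwise keep the token
            Some(path) => fs::read_to_string(path).unwrap_or_else(|_| tok.to_string()),
            None => tok.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

The effect is the same as the command example above: by the time the model sees the prompt, the referenced file's content is part of its context.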
Commands can include shell output:
---
description: Review recent changes
---
Files changed since last commit:
!`git diff --name-only HEAD~1`
Review each file...
Both systems assemble context in a consistent priority order:
Priority 1: System prompt (always present)
Priority 2: Project docs (session-level)
Priority 3: Environment context (turn-level)
Priority 4: Conversation history (accumulated)
Priority 5: Tool outputs (on-demand)
Both systems support injecting context mid-conversation as reminders or guidance.
Hooks can inject systemMessage content that gets added to the context whenever they fire:
{
"hookSpecificOutput": {
"permissionDecision": "allow"
},
"systemMessage": "Remember to check for SQL injection when handling user input"
}
Key hook events for injecting reminders:
| Event | When It Fires | Common Use |
|---|---|---|
| SessionStart | Beginning of session | Load project context, set behavioral mode |
| PreToolUse | Before every tool call | Security warnings, validation reminders |
| PostToolUse | After tool completes | Feedback, quality checks |
| Stop | Agent wants to stop | Completion validation |
| UserPromptSubmit | User sends message | Context enrichment |
Example: Security Reminder Hook
Claude Code includes a security reminder system that warns once per session about risky patterns:
# From security_reminder_hook.py
SECURITY_PATTERNS = [
{
"ruleName": "child_process_exec",
"substrings": ["child_process.exec", "exec("],
"reminder": """⚠️ Security Warning: Using child_process.exec()
can lead to command injection vulnerabilities.
Use execFileNoThrow() instead for safety.""",
},
{
"ruleName": "eval_injection",
"substrings": ["eval("],
"reminder": "⚠️ Security Warning: eval() executes arbitrary code...",
},
]
def main():
# file_path, content, and session_id come from the hook's JSON on stdin
# Track shown warnings per session to avoid repetition
shown_warnings = load_state(session_id)
# Check if any security pattern matches the file being written
rule_name, reminder = check_patterns(file_path, content)
warning_key = f"{rule_name}:{file_path}"
if rule_name and warning_key not in shown_warnings:
# Show warning once per session
shown_warnings.add(warning_key)
save_state(session_id, shown_warnings)
# Output to stderr - gets added to Claude's context
print(reminder, file=sys.stderr)
sys.exit(2) # Block and show warning
Session-Scoped Reminders: the set of already-shown warnings is persisted per session in ~/.claude/security_warnings_state_{session_id}.json.

Plugins can inject behavioral instructions at session start:
#!/usr/bin/env bash
# learning-output-style/hooks-handlers/session-start.sh
cat << 'EOF'
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "You are in 'learning' output style mode...
## Learning Mode Philosophy
Instead of implementing everything yourself, identify opportunities
where the user can write 5-10 lines of meaningful code...
## When to Request User Contributions
- Business logic with multiple valid approaches
- Error handling strategies
- Algorithm implementation choices..."
}
}
EOF
This additionalContext becomes part of the model's initial context for every session.
Codex supports a developer role message that supplements the system prompt:
// Configuration
pub struct Config {
/// Developer instructions override injected as a separate message
pub developer_instructions: Option<String>,
}
// Injection into context
pub fn build_initial_context(&self, turn_context: &TurnContext) -> Vec<ResponseItem> {
let mut items = Vec::new();
// Developer instructions come first
if let Some(developer_instructions) = turn_context.developer_instructions.as_deref() {
items.push(DeveloperInstructions::new(developer_instructions).into());
}
// Then user instructions (AGENTS.md)
if let Some(user_instructions) = turn_context.user_instructions.as_deref() {
items.push(UserInstructions {
text: user_instructions,
directory: turn_context.cwd.to_string_lossy().into_owned(),
}.into());
}
// Then environment context
items.push(EnvironmentContext::new(...).into());
items
}
The developer message uses a special "developer" role:
impl From<DeveloperInstructions> for ResponseItem {
fn from(di: DeveloperInstructions) -> Self {
ResponseItem::Message {
role: "developer".to_string(), // Special role
content: vec![ContentItem::InputText { text: di.text }],
}
}
}
When context is compacted, initial context is re-injected:
async fn run_compact_task(...) {
// ... summarization happens ...
// Re-build initial context (developer instructions, AGENTS.md, environment)
let initial_context = sess.build_initial_context(turn_context.as_ref());
// Build new history with initial context + preserved user messages + summary
let new_history = build_compacted_history(
initial_context, // <-- Re-injected!
&user_messages,
&summary_text
);
sess.replace_history(new_history).await;
}
This ensures the model never loses critical instructions even after long conversations.
1. Session-Scoped Deduplication Don't spam the same reminder repeatedly:
if warning_key not in shown_warnings:
shown_warnings.add(warning_key)
save_state(session_id, shown_warnings)
show_reminder()
2. Context-Appropriate Triggers Use the right hook event:
- SessionStart: One-time behavioral setup
- PreToolUse: Warnings before risky operations
- PostToolUse: Feedback on results
- Stop: Validation before completing

3. Concise Messages Keep reminders brief - they consume context tokens:
# Good: Focused warning
"⚠️ eval() is a security risk. Consider JSON.parse() instead."
# Bad: Essay-length explanation
"⚠️ The eval() function evaluates arbitrary JavaScript code, which
poses significant security risks including... [500 more words]"
4. Structured Output Use the expected JSON format for hooks:
{
"hookSpecificOutput": {...},
"systemMessage": "Brief reminder text"
}
This architecture ensures the model always has the context it needs while managing the inherent limitations of context windows.