Building Tool Systems for LLM Agents

This collection of guides covers the design and implementation of tool-calling systems for LLM agents, drawing on an analysis of two production systems: Claude Code (Anthropic) and Codex (OpenAI).

Guides

01 - Overview Comparison

High-level comparison of Claude Code and Codex approaches to tool calling. Covers architectural philosophy, key differentiators, and recommendations for choosing between approaches.

Key topics:

Plugin-based vs registry-based architecture
Runtime vs compile-time extensibility
Hooks vs sandbox+approval safety models
Design pattern recommendations

02 - Tool Architecture

Deep dive into building tool systems: schemas, registries, handlers, routers, and output formatting.

Key topics:

JSON Schema for tool parameters
Tool registry pattern
Handler trait implementation
Tool routing and dispatch
Configuration-driven registration
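
As a rough illustration of how the registry, schema, and dispatch topics fit together, here is a minimal sketch in Python. It is not taken from Claude Code or Codex: ToolSpec, ToolRegistry, and the echo tool are hypothetical names, and the validation step only checks required fields rather than applying full JSON Schema validation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical names throughout: ToolSpec, ToolRegistry, and "echo" are
# illustrative only and do not come from Claude Code or Codex.


@dataclass
class ToolSpec:
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON Schema object describing the arguments
    handler: Callable[[Dict[str, Any]], str]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        if spec.name in self._tools:
            raise ValueError(f"duplicate tool: {spec.name}")
        self._tools[spec.name] = spec

    def schemas(self) -> List[Dict[str, Any]]:
        """Return the tool definitions that would be sent to the model."""
        return [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in self._tools.values()
        ]

    def dispatch(self, name: str, args: Dict[str, Any]) -> str:
        """Route a tool call to its handler after a required-field check."""
        spec = self._tools.get(name)
        if spec is None:
            return f"error: unknown tool {name!r}"
        missing = [k for k in spec.parameters.get("required", []) if k not in args]
        if missing:
            return f"error: missing arguments {missing}"
        return spec.handler(args)


registry = ToolRegistry()
registry.register(ToolSpec(
    name="echo",
    description="Return the given text unchanged.",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
    handler=lambda args: args["text"],
))
```

Returning errors as strings rather than raising lets the failure be fed back to the model as a tool result, which is one common design choice; a real system would likely use a structured result type instead.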

03 - MCP Integration

Guide to integrating Model Context Protocol (MCP) for external tool access and resources.

Key topics:

MCP protocol overview
Plugin-scoped servers (Claude Code)
Native protocol support (Codex)
Schema conversion and sanitization
Connection management

04 - Sandboxing and Safety

Comprehensive guide to safety mechanisms: hooks, sandboxing, approval policies, and command classification.

Key topics:

Defense-in-depth strategy
Hook-based validation (Claude Code)
OS-level sandboxing (Codex)
Approval workflows
Command safety classification
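
To make command safety classification and approval workflows more concrete, the sketch below sorts a shell command into allow, ask, or deny. The command lists, the Verdict enum, and the classify function are assumptions chosen for illustration; neither system's actual rules are reproduced here, and a production classifier would need far more careful parsing.

```python
import shlex
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"  # run without prompting
    ASK = "ask"      # require user approval first
    DENY = "deny"    # never run


# Illustrative lists only; these are assumptions, not either system's policy.
SAFE_COMMANDS = {"ls", "cat", "pwd", "echo", "grep"}
DANGEROUS_COMMANDS = {"rm", "dd", "mkfs", "shutdown"}
SHELL_METACHARS = set(";|&><`$")


def classify(command: str) -> Verdict:
    """Classify a shell command conservatively."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: cannot reason about it, so ask.
        return Verdict.ASK
    if not tokens:
        return Verdict.DENY
    program = tokens[0]
    if program in DANGEROUS_COMMANDS:
        return Verdict.DENY
    # Pipes, redirects, and substitutions can hide a second command, so any
    # metacharacter anywhere in the raw string escalates to approval.
    if any(ch in SHELL_METACHARS for ch in command):
        return Verdict.ASK
    if program in SAFE_COMMANDS:
        return Verdict.ALLOW
    return Verdict.ASK
```

The default for anything unrecognized is ASK rather than ALLOW, so the allowlist has to be extended deliberately. Classification of this kind is only one layer and would sit alongside hooks or an OS-level sandbox in a defense-in-depth design.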

05 - Parallel Execution

Strategies for safe parallel tool execution, synchronization, and race condition prevention.

Key topics:

Read/write lock pattern
Tool parallel compatibility
Cancellation handling
Race condition prevention
Performance optimization
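
The read/write lock pattern can be sketched with asyncio as follows. This is an assumption-laden illustration, not the Codex implementation (which is written in Rust): AsyncRWLock, run_tool, and the tool names are hypothetical, and this simple version does not protect writers from starvation.

```python
import asyncio
from contextlib import asynccontextmanager


class AsyncRWLock:
    """Hypothetical lock: many concurrent readers, or exactly one writer."""

    def __init__(self) -> None:
        self._cond = asyncio.Condition()
        self._readers = 0
        self._writing = False

    @asynccontextmanager
    async def read(self):
        async with self._cond:
            await self._cond.wait_for(lambda: not self._writing)
            self._readers += 1
        try:
            yield
        finally:
            async with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @asynccontextmanager
    async def write(self):
        async with self._cond:
            await self._cond.wait_for(
                lambda: not self._writing and self._readers == 0
            )
            self._writing = True
        try:
            yield
        finally:
            async with self._cond:
                self._writing = False
                self._cond.notify_all()


async def run_tool(lock, name, parallel_safe, log):
    # Parallel-safe (read-only) tools share the lock; mutating tools take it
    # exclusively.
    guard = lock.read() if parallel_safe else lock.write()
    async with guard:
        log.append(f"start {name}")
        await asyncio.sleep(0.01)  # stand-in for real tool work
        log.append(f"end {name}")


async def main():
    lock = AsyncRWLock()  # created inside the running event loop
    log = []
    await asyncio.gather(
        run_tool(lock, "read_a", True, log),
        run_tool(lock, "read_b", True, log),
        run_tool(lock, "write_c", False, log),
    )
    return log


log = asyncio.run(main())
```

In this run the two read-only tools overlap, while the writing tool waits until both have finished. Marking each tool as parallel-safe or not is what the "tool parallel compatibility" topic refers to.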

06 - Tool Instructions and Parsing

Deep dive into how LLMs are instructed about tools and how their responses are parsed.

Key topics:

System prompts and tool schemas
API request construction
Response parsing and routing
Argument extraction
Sub-agent triggering (Claude Code)
Complete request/response cycle
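
As a small sketch of response parsing and argument extraction, the function below walks a model response and separates plain text from tool calls. The response shape (a "content" list of blocks with "type", "id", "name", and a JSON-encoded "arguments" string) is an assumption made for this example and does not match any specific vendor API exactly.

```python
import json
from typing import Any, Dict, List, Tuple


def extract_tool_calls(response: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Split a response into its text and a list of parsed tool calls.

    The block layout is hypothetical; adapt the field names to the API in use.
    """
    text_parts: List[str] = []
    calls: List[Dict[str, Any]] = []
    for block in response.get("content", []):
        kind = block.get("type")
        if kind == "text":
            text_parts.append(block.get("text", ""))
        elif kind == "tool_call":
            args: Dict[str, Any] = {}
            error = None
            try:
                parsed = json.loads(block.get("arguments", "{}"))
                if isinstance(parsed, dict):
                    args = parsed
                else:
                    error = "arguments must be a JSON object"
            except json.JSONDecodeError as exc:
                # Keep the call so the error can be reported back to the model.
                error = f"invalid JSON arguments: {exc.msg}"
            calls.append({
                "id": block.get("id"),
                "name": block.get("name"),
                "args": args,
                "error": error,
            })
    return "".join(text_parts), calls


example = {
    "content": [
        {"type": "text", "text": "Reading the file."},
        {"type": "tool_call", "id": "c1", "name": "read_file",
         "arguments": "{\"path\": \"README.md\"}"},
        {"type": "tool_call", "id": "c2", "name": "read_file",
         "arguments": "{not json"},
    ]
}
text, calls = extract_tool_calls(example)
```

Malformed arguments are recorded rather than raised, so the router can return an error result for that call ID and let the model retry.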

07 - Context Gathering

Comprehensive guide to how agents gather, structure, manage, and optimize context.

Key topics:

Static context (system prompts, AGENTS.md, skills)
Environment context (CWD, permissions, sandbox)
Conversation history management
Tool output truncation strategies
Context compaction (summarization)
Token tracking and overflow handling
Dynamic context via tool calls
Reminders and mid-conversation injections
Session-scoped security warnings
Re-injection after compaction

08 - Model Selection and Swapping

Guide to selecting and swapping models for agents, sub-agents, and background tasks.

Key topics:

Model families and their characteristics
Sub-agent model configuration (inherit, sonnet, opus, haiku)
Background task model selection (compaction, summarization)
Tool configuration by model capability
Dynamic model management and switching
Model-aware prompting and instructions
Best practices for model selection

09 - Context Management

Comprehensive guide to managing context within LLM context windows.

Key topics:

Understanding context windows and effective limits
Token usage tracking and estimation
Truncation strategies (per-item, head/tail preservation)
Eviction (oldest-first removal)
Compaction (summarization) and auto-compaction triggers
History normalization (maintaining invariants)
Context window exceeded handling
Image handling in context
Best practices for context management
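
Two of the strategies listed above, head/tail truncation and oldest-first eviction, can be sketched in a few lines. The function names, the marker text, and the four-characters-per-token estimate are all assumptions for illustration; a real system would use the model's tokenizer and operate on structured messages rather than plain strings.

```python
from typing import Callable, List


def truncate_middle(text: str, max_chars: int) -> str:
    """Keep the head and tail of a long tool output and drop the middle.

    The result is slightly longer than max_chars because of the marker.
    """
    if len(text) <= max_chars:
        return text
    head_len = max_chars // 2
    tail_len = max_chars - head_len
    omitted = len(text) - max_chars
    tail = text[-tail_len:] if tail_len else ""
    return f"{text[:head_len]}\n[... {omitted} chars truncated ...]\n{tail}"


def estimate_tokens(message: str) -> int:
    # Rough heuristic (about 4 characters per token); an assumption, not a
    # real tokenizer.
    return max(1, len(message) // 4)


def evict_oldest(
    history: List[str],
    budget: int,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> List[str]:
    """Drop the oldest messages after the first until the budget is met.

    Index 0 is treated as the system prompt and is never evicted.
    """
    kept = list(history)
    while len(kept) > 1 and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(1)
    return kept


truncated = truncate_middle("abcdefghij", 4)
history = ["sys", "a" * 40, "b" * 40, "cccc"]
trimmed = evict_oldest(history, budget=12)
```

Preserving both ends of a tool output tends to keep the command echo at the top and the error or summary at the bottom, which is usually where the useful signal is.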

Quick Reference

Claude Code Approach

Language: TypeScript/JavaScript
Extensibility: Runtime (Markdown plugins)
Safety: Hook system (PreToolUse, PostToolUse, Stop)
MCP: Plugin-scoped servers
Parallelism: Implicit (model-driven)

Codex Approach

Language: Rust
Extensibility: Compile-time (trait implementation)
Safety: Sandbox + approval policies
MCP: Native protocol with schema conversion
Parallelism: Explicit (read/write locks)

Getting Started

If you're building an LLM agent tool system:

Start with the overview (01) to understand the tradeoffs
Read tool architecture (02) for core implementation patterns
Add MCP support (03) if you need external tool integration
Implement safety mechanisms (04) before production use
Consider parallelism (05) for performance optimization

Source Repositories

These guides are based on analysis of:

Claude Code: ./claude-code/ - Plugin system, hooks, commands, agents
Codex: ./codex/ - Rust implementation, sandboxing, tool handlers

Contributing

Found an error or want to add more patterns? The guides are in this repository's guides/ directory.