Tool Calling in LLM Agents: A Comparative Analysis

Overview

This guide compares the tool calling approaches of two production LLM coding agents: Claude Code (Anthropic's CLI coding assistant) and Codex (OpenAI's agent CLI). Both let an AI agent interact with the user's environment through tools, but they differ significantly in architecture, implementation language, and design philosophy.

High-Level Comparison

| Aspect                  | Claude Code               | Codex                        |
|-------------------------|---------------------------|------------------------------|
| Implementation Language | TypeScript/JavaScript     | Rust                         |
| Tool Architecture       | Plugin-based, extensible  | Registry-based, compiled     |
| Extensibility Model     | Markdown files + hooks    | Compile-time handlers        |
| Safety Model            | Hooks + permission modes  | Sandbox + approval policies  |
| MCP Integration         | Plugin-scoped servers     | Native protocol support      |
| Parallel Execution      | Implicit (Claude handles) | Explicit read/write locks    |
| Custom Tools            | Markdown commands/agents  | Handler trait implementation |

Architectural Philosophy

Claude Code: Plugin-First Architecture

Claude Code takes a declarative, extensible approach where tools and capabilities are defined through:

  1. Markdown-based configuration: Commands, agents, and skills are Markdown files with YAML frontmatter
  2. Hook system: Python/bash scripts intercept tool calls at various lifecycle points
  3. MCP servers: External tools connected via Model Context Protocol
  4. Permission allowlists: YAML frontmatter specifies which tools a command can use

The philosophy here is composability over control—users and plugin authors can extend the system without modifying core code.

---
description: Review code for security issues
allowed-tools: Read, Grep, Bash(git:*)
model: sonnet
---

Review this code for security vulnerabilities including:
- SQL injection
- XSS attacks
- Authentication bypass
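
As a rough illustration of how a loader might separate the frontmatter from the prompt in a command file like the one above, the following sketch handles only flat `key: value` fields. The `CommandFile` type and `parse_command_file` function are hypothetical names, not part of Claude Code, and a real loader would presumably use a full YAML parser.

```rust
use std::collections::BTreeMap;

/// Frontmatter fields plus the Markdown prompt body of a command file.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct CommandFile {
    pub fields: BTreeMap<String, String>,
    pub body: String,
}

/// Splits a command file into flat `key: value` frontmatter and the prompt body.
/// Returns `None` when the opening or closing `---` delimiter is missing.
pub fn parse_command_file(text: &str) -> Option<CommandFile> {
    let mut lines = text.lines();
    // The file must open with a `---` delimiter line.
    if lines.next()?.trim() != "---" {
        return None;
    }
    let mut fields = BTreeMap::new();
    let mut closed = false;
    for line in lines.by_ref() {
        if line.trim() == "---" {
            closed = true;
            break;
        }
        // Split on the first colon only, so values such as `Bash(git:*)` survive.
        if let Some((key, value)) = line.split_once(':') {
            fields.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    if !closed {
        return None;
    }
    // Everything after the closing delimiter is the prompt body.
    let body = lines.collect::<Vec<_>>().join("\n").trim().to_string();
    Some(CommandFile { fields, body })
}
```

Splitting on the first colon keeps the `allowed-tools` value intact as a single string; breaking that string into individual tool entries is a separate step.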

Codex: Registry-Based Architecture

Codex uses a trait-based, compiled approach where tools are:

  1. Rust structs implementing ToolHandler: Each tool is a handler registered by name
  2. JSON Schema definitions: Tool parameters are defined as typed schemas
  3. Orchestrator pattern: A central orchestrator manages approval, sandboxing, and retry logic
  4. Parallel execution guards: Explicit read/write locks control concurrent tool access

The philosophy here is safety through types—compile-time guarantees ensure tools are correctly wired and parameterized.

#[async_trait]
impl ToolHandler for ShellHandler {
    fn kind(&self) -> ToolKind {
        ToolKind::Function
    }

    async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
        // Parse and execute command...
    }
}

Core Components Comparison

Tool Definition

Claude Code defines tools through multiple mechanisms:

  1. Built-in tools: Core capabilities like Read, Write, Edit, Bash, Grep
  2. Slash commands: Markdown files in .claude/commands/ or plugin commands/
  3. Agents: Autonomous sub-agents defined in plugin agents/ directories
  4. MCP tools: External tools via MCP server connections

Codex defines tools through:

  1. ToolSpec enum: Union type for Function, LocalShell, WebSearch, Freeform tools
  2. ToolRegistryBuilder: Builder pattern for registering handlers and specs
  3. JSON Schema: Each tool has a typed parameter schema
  4. MCP tools: Converted to OpenAI function format at runtime

Tool Schema Example

Codex tool schema definition:

fn create_shell_command_tool() -> ToolSpec {
    let mut properties = BTreeMap::new();
    properties.insert(
        "command".to_string(),
        JsonSchema::String {
            description: Some("The shell script to execute".to_string()),
        },
    );
    properties.insert(
        "workdir".to_string(),
        JsonSchema::String {
            description: Some("The working directory".to_string()),
        },
    );

    ToolSpec::Function(ResponsesApiTool {
        name: "shell_command".to_string(),
        description: "Runs a shell command and returns its output.".to_string(),
        strict: false,
        parameters: JsonSchema::Object {
            properties,
            required: Some(vec!["command".to_string()]),
            additional_properties: Some(false.into()),
        },
    })
}
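
One benefit of a typed parameter schema is that a call can be checked before it reaches the handler. The sketch below uses a simplified stand-in (`Schema`) rather than the Codex `JsonSchema` type, and treats every argument value as a string; `validate_args` and `shell_command_schema` are hypothetical helpers that mirror the `required` and `additional_properties` settings in the definition above.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a typed parameter schema (not the Codex type).
pub enum Schema {
    String { description: Option<String> },
    Object {
        properties: BTreeMap<String, Schema>,
        required: Vec<String>,
        additional_properties: bool,
    },
}

/// Returns the problems found in `args`; an empty list means the call is acceptable.
pub fn validate_args(schema: &Schema, args: &BTreeMap<String, String>) -> Vec<String> {
    let mut problems = Vec::new();
    if let Schema::Object { properties, required, additional_properties } = schema {
        // Every required parameter must be present.
        for name in required {
            if !args.contains_key(name) {
                problems.push(format!("missing required parameter `{name}`"));
            }
        }
        // With additional properties disabled, unknown parameters are rejected.
        if !*additional_properties {
            for name in args.keys() {
                if !properties.contains_key(name) {
                    problems.push(format!("unknown parameter `{name}`"));
                }
            }
        }
    }
    problems
}

/// Mirrors the `shell_command` parameters from the definition above.
pub fn shell_command_schema() -> Schema {
    let mut properties = BTreeMap::new();
    for (name, description) in [
        ("command", "The shell script to execute"),
        ("workdir", "The working directory"),
    ] {
        properties.insert(
            name.to_string(),
            Schema::String { description: Some(description.to_string()) },
        );
    }
    Schema::Object {
        properties,
        required: vec!["command".to_string()],
        additional_properties: false,
    }
}
```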

Claude Code tool declaration (in command):

---
allowed-tools: [
  "Read",
  "Write", 
  "Bash(git:*)",
  "mcp__plugin_asana_asana__asana_create_task"
]
---
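
To make the allowlist idea concrete, here is a minimal sketch of checking a requested tool call against entries like those above. The matching semantics are an assumption made for illustration: `Bash(git:*)` is read as "Bash commands whose first word is `git`", and an entry without parentheses permits any use of that tool. `ToolRule`, `parse_rule`, and `is_allowed` are hypothetical names, and the actual Claude Code matcher may behave differently.

```rust
/// One entry from an `allowed-tools` list, e.g. `Read` or `Bash(git:*)`.
#[derive(Debug, PartialEq)]
pub struct ToolRule {
    pub tool: String,
    /// Optional argument pattern taken from inside the parentheses.
    pub pattern: Option<String>,
}

/// Parses a single allowlist entry into a tool name and optional pattern.
pub fn parse_rule(entry: &str) -> ToolRule {
    let entry = entry.trim();
    match entry.split_once('(') {
        Some((tool, rest)) if rest.ends_with(')') => ToolRule {
            tool: tool.to_string(),
            pattern: Some(rest[..rest.len() - 1].to_string()),
        },
        _ => ToolRule { tool: entry.to_string(), pattern: None },
    }
}

/// Assumed semantics: `prefix:*` permits any argument whose first word is
/// `prefix`; a pattern without `:*` must equal the argument exactly.
pub fn is_allowed(rules: &[ToolRule], tool: &str, argument: &str) -> bool {
    rules.iter().any(|rule| {
        if rule.tool != tool {
            return false;
        }
        match rule.pattern.as_deref() {
            None => true,
            Some(pattern) => match pattern.strip_suffix(":*") {
                Some(prefix) => argument.split_whitespace().next() == Some(prefix),
                None => argument == pattern,
            },
        }
    })
}
```

Under these assumptions, a command with `Read`, `Write`, and `Bash(git:*)` in its allowlist could run `git status` through Bash but not `rm -rf /`.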

Tool Invocation Flow

Claude Code:

User Input → Claude Model → Tool Call Decision
    ↓
PreToolUse Hooks (validate/modify/deny)
    ↓
Tool Execution
    ↓
PostToolUse Hooks (react/log)
    ↓
Response to Model

Codex:

User Input → Model API → ResponseItem (tool call)
    ↓
ToolRouter.build_tool_call() (parse payload)
    ↓
ToolOrchestrator.run() (approval + sandbox selection)
    ↓
ToolRuntime.run() (execute with sandbox)
    ↓
ToolOutput → ResponseInputItem

Key Differentiators

1. Extensibility Model

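Claude Code favors runtime extensibility: commands, agents, and skills are Markdown files with YAML frontmatter, and hooks and MCP servers can be added without modifying core code.

Codex favors compile-time safety: each tool is a Rust struct implementing ToolHandler, registered by name with a typed JSON Schema, so a mis-wired or mis-parameterized tool is caught at build time.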

2. Safety Architecture

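Claude Code uses hooks for safety: PreToolUse hooks can validate, modify, or deny a tool call before it runs, PostToolUse hooks can react to or log the result, and allowed-tools lists restrict which tools a command may use.

Codex uses a sandbox plus approval policies: a central orchestrator handles approval and sandbox selection before the runtime executes the call, and it also manages retry logic.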

3. Parallel Execution

Claude Code handles parallelism implicitly: concurrent tool use is left to Claude rather than exposed to tool authors as explicit locks.

Codex handles parallelism explicitly:

let _guard = if supports_parallel {
    Either::Left(lock.read().await)  // Multiple parallel
} else {
    Either::Right(lock.write().await) // Exclusive
};
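
The same read/write idea can be sketched with the standard library alone. The snippet above awaits an async lock and wraps the two guard types in `Either`; the version below swaps in the blocking `std::sync::RwLock` and a small hypothetical enum (`ToolGuard`) so that it compiles without extra crates. It is an illustration of the locking discipline, not the Codex implementation.

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

/// Holds either a shared or an exclusive guard for the duration of a tool call.
/// Dropping the value releases the lock.
pub enum ToolGuard<'a> {
    Shared(RwLockReadGuard<'a, ()>),
    Exclusive(RwLockWriteGuard<'a, ()>),
}

/// Parallel-safe tools take the read side, so many can run at once.
/// All other tools take the write side and run alone.
pub fn acquire(lock: &RwLock<()>, supports_parallel: bool) -> ToolGuard<'_> {
    if supports_parallel {
        ToolGuard::Shared(lock.read().expect("lock poisoned"))
    } else {
        ToolGuard::Exclusive(lock.write().expect("lock poisoned"))
    }
}
```

While any shared guard is alive, further shared acquisitions succeed and an exclusive one has to wait; while an exclusive guard is alive, every other acquisition waits.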

4. MCP Integration

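Claude Code integrates MCP via plugin configuration: MCP servers are scoped to a plugin, and their tools are referenced in allowed-tools by names such as mcp__plugin_asana_asana__asana_create_task.

Codex integrates MCP via native protocol support: MCP tools are converted to OpenAI function format at runtime and registered alongside the built-in tools.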
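
A minimal sketch of both integration styles follows. The `mcp__<server>__<tool>` naming scheme is inferred from the single example name in this guide and should be treated as an assumption, as should the choice to keep schemas as raw JSON text. `McpTool`, `FunctionTool`, and the three functions are hypothetical; `FunctionTool` only mirrors the fields of the `ResponsesApiTool` value shown earlier.

```rust
/// Minimal view of a tool as advertised by an MCP server.
pub struct McpTool {
    pub name: String,
    pub description: Option<String>,
    /// The tool's input schema, kept as raw JSON text in this sketch.
    pub input_schema: String,
}

/// Function-style tool definition (hypothetical mirror of `ResponsesApiTool`).
#[derive(Debug, PartialEq)]
pub struct FunctionTool {
    pub name: String,
    pub description: String,
    pub strict: bool,
    pub parameters: String,
}

/// Assumed naming scheme: `mcp__<server>__<tool>`.
pub fn qualified_name(server: &str, tool: &str) -> String {
    format!("mcp__{server}__{tool}")
}

/// Splits a qualified name back into `(server, tool)`.
/// Returns `None` for names that do not follow the assumed scheme.
pub fn split_qualified_name(name: &str) -> Option<(&str, &str)> {
    name.strip_prefix("mcp__")?.split_once("__")
}

/// Converts an MCP tool into a function-style definition at runtime.
pub fn to_function_tool(server: &str, tool: &McpTool) -> FunctionTool {
    FunctionTool {
        name: qualified_name(server, &tool.name),
        // A missing description becomes an empty string in this sketch.
        description: tool.description.clone().unwrap_or_default(),
        strict: false,
        parameters: tool.input_schema.clone(),
    }
}
```

Prefixing the server name keeps tools from different servers from colliding in a single flat namespace, which matters once they share a registry with built-in tools.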

Design Patterns for Building Similar Systems

Pattern 1: Tool Registry

Both systems use a registry pattern to map tool names to handlers:

┌─────────────────┐     ┌──────────────────┐
│  Tool Request   │────▶│   Tool Router    │
│  (name, args)   │     │  (name → handler)│
└─────────────────┘     └────────┬─────────┘
                                 │
                        ┌────────▼─────────┐
                        │  Tool Handler    │
                        │  (execute logic) │
                        └──────────────────┘
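
The diagram above can be sketched in a few lines. This is a synchronous, string-in/string-out simplification under stated assumptions: the `ToolHandler` trait here is a reduced, hypothetical version (not the async Codex trait shown earlier), and `Echo` and `ToolRegistry` exist only for the example.

```rust
use std::collections::HashMap;

/// Reduced handler interface: raw arguments in, output or error message out.
pub trait ToolHandler {
    fn handle(&self, args: &str) -> Result<String, String>;
}

/// Trivial handler used to exercise the registry.
pub struct Echo;

impl ToolHandler for Echo {
    fn handle(&self, args: &str) -> Result<String, String> {
        Ok(args.to_string())
    }
}

/// Maps tool names to handlers and routes requests by name.
#[derive(Default)]
pub struct ToolRegistry {
    handlers: HashMap<String, Box<dyn ToolHandler>>,
}

impl ToolRegistry {
    /// Registers `handler` under `name`, replacing any previous entry.
    pub fn register(&mut self, name: &str, handler: Box<dyn ToolHandler>) {
        self.handlers.insert(name.to_string(), handler);
    }

    /// Looks up the handler for `name` and runs it, or reports an unknown tool.
    pub fn dispatch(&self, name: &str, args: &str) -> Result<String, String> {
        match self.handlers.get(name) {
            Some(handler) => handler.handle(args),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```

Returning an error for an unknown name, rather than panicking, lets the failure be reported back to the model as an ordinary tool result.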

Pattern 2: Orchestration Layer

Both systems separate execution from policy:

┌──────────────┐
│ Orchestrator │
├──────────────┤
│ • Approval   │◀── Policy decisions
│ • Sandbox    │◀── Security enforcement  
│ • Retry      │◀── Error recovery
└──────┬───────┘
       │
┌──────▼───────┐
│   Runtime    │
├──────────────┤
│ • Execute    │◀── Actual tool logic
│ • Output     │◀── Result formatting
└──────────────┘
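
A minimal sketch of this split, assuming the policy and the runtime are passed in as closures: the orchestrator decides whether a call may run, picks a sandbox, and retries failures, while the runtime closure does the actual work. `Approval`, `Sandbox`, and `orchestrate` are hypothetical names, and the retry rule (repeat up to `max_attempts` times) is a placeholder rather than either system's real policy.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Approval {
    Approved,
    Denied,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Sandbox {
    Restricted,
    Unrestricted,
}

/// Policy layer: checks approval, selects a sandbox, then delegates to `runtime`,
/// retrying failed attempts up to `max_attempts` times.
pub fn orchestrate<A, S, R>(
    command: &str,
    approve: A,
    select_sandbox: S,
    mut runtime: R,
    max_attempts: u32,
) -> Result<String, String>
where
    A: Fn(&str) -> Approval,
    S: Fn(&str) -> Sandbox,
    R: FnMut(&str, Sandbox) -> Result<String, String>,
{
    // Policy decision comes first; the runtime never sees a denied call.
    if approve(command) == Approval::Denied {
        return Err("denied by policy".to_string());
    }
    let sandbox = select_sandbox(command);
    let mut last_error = String::from("no attempts made");
    for _ in 0..max_attempts {
        match runtime(command, sandbox) {
            Ok(output) => return Ok(output),
            Err(error) => last_error = error,
        }
    }
    Err(last_error)
}
```

Because the runtime only receives a command and a sandbox choice, policies can change without touching execution code, which is the point of the separation.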

Pattern 3: Event-Driven Hooks

Claude Code's hook system provides an extensible interception pattern:

┌────────────────┐     ┌────────────────┐
│  Tool Event    │────▶│  Hook Matcher  │
│  (PreToolUse)  │     │  (tool_name)   │
└────────────────┘     └───────┬────────┘
                               │
                      ┌────────▼────────┐
                      │  Hook Handlers  │
                      │  (parallel exec)│
                      └────────┬────────┘
                               │
                      ┌────────▼────────┐
                      │  Hook Output    │
                      │  (decision/msg) │
                      └─────────────────┘
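
The interception pattern can be sketched as a list of hooks, each with a matcher and a handler. This is a simplified illustration: hooks here are plain Rust functions rather than external Python/bash scripts, they run sequentially rather than in parallel as the diagram indicates, and the `*` wildcard and first-denial-wins rule are assumptions. `Hook`, `HookDecision`, `run_pre_tool_use`, and `block_force_push` are hypothetical names.

```rust
/// Outcome of a pre-execution hook.
#[derive(Debug, PartialEq)]
pub enum HookDecision {
    Allow,
    Deny(String),
}

pub struct Hook {
    /// Tool name this hook applies to; `*` matches every tool (assumed convention).
    pub matcher: String,
    /// Inspects the tool input and decides whether the call may proceed.
    pub handler: fn(&str) -> HookDecision,
}

/// Runs every hook whose matcher fits `tool_name`; the first denial wins.
pub fn run_pre_tool_use(hooks: &[Hook], tool_name: &str, input: &str) -> HookDecision {
    for hook in hooks {
        if hook.matcher == "*" || hook.matcher == tool_name {
            if let HookDecision::Deny(reason) = (hook.handler)(input) {
                return HookDecision::Deny(reason);
            }
        }
    }
    HookDecision::Allow
}

/// Example handler: refuses shell input that contains a force push.
pub fn block_force_push(input: &str) -> HookDecision {
    if input.contains("push --force") {
        HookDecision::Deny("force push blocked".to_string())
    } else {
        HookDecision::Allow
    }
}
```

A hook registered with the matcher `Bash` and this handler would deny `git push --force` but leave `git status`, and any call to a different tool, untouched.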

Recommendations for New Implementations

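Choose Claude Code's Approach When:

  1. Users and plugin authors need to add commands, agents, or tools without modifying or recompiling core code
  2. Safety rules are best expressed as hooks and per-command tool allowlists
  3. External tools arrive mainly through plugin-scoped MCP servers

Choose Codex's Approach When:

  1. Compile-time guarantees that tools are correctly wired and parameterized matter more than runtime flexibility
  2. Every tool call should pass through a central sandbox and approval policy
  3. Concurrent tool access needs explicit read/write control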

Summary

Both systems solve the same fundamental problem—enabling LLMs to safely execute tools in user environments—but make different tradeoffs:

| Concern         | Claude Code                | Codex               |
|-----------------|----------------------------|---------------------|
| Safety          | Hooks + permissions        | Sandbox + approvals |
| Extensibility   | Runtime (Markdown/plugins) | Compile-time (Rust) |
| Parallelism     | Implicit                   | Explicit locks      |
| Tool Definition | Declarative YAML           | Typed schemas       |
| MCP             | Plugin-scoped              | Native conversion   |

Understanding these tradeoffs helps you choose the right architecture for your own LLM agent implementations.