Tool Calling in LLM Agents: A Comparative Analysis

Overview

This guide compares the tool-calling approaches of two production LLM coding agents: Claude Code (Anthropic's CLI coding assistant) and Codex (OpenAI's agent CLI). Both let an AI agent act on the user's environment through tools, but they differ in architecture, implementation language, and design philosophy.

High-Level Comparison

Aspect	Claude Code	Codex
Implementation Language	TypeScript/JavaScript	Rust
Tool Architecture	Plugin-based, extensible	Registry-based, compiled
Extensibility Model	Markdown files + hooks	Compile-time handlers
Safety Model	Hooks + permission modes	Sandbox + approval policies
MCP Integration	Plugin-scoped servers	Native protocol support
Parallel Execution	Implicit (model decides)	Explicit read/write locks
Custom Tools	Markdown commands/agents	Handler trait implementation

Architectural Philosophy

Claude Code: Plugin-First Architecture

Claude Code takes a declarative, extensible approach where tools and capabilities are defined through:

Markdown-based configuration: Commands, agents, and skills are Markdown files with YAML frontmatter
Hook system: Python/bash scripts intercept tool calls at various lifecycle points
MCP servers: External tools connected via Model Context Protocol
Permission allowlists: YAML frontmatter specifies which tools a command can use

The philosophy here is composability over control—users and plugin authors can extend the system without modifying core code.

---
description: Review code for security issues
allowed-tools: Read, Grep, Bash(git:*)
model: sonnet
---

Review this code for security vulnerabilities including:
- SQL injection
- XSS attacks
- Authentication bypass
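
To make the allowlist idea concrete, here is a minimal sketch of how an `allowed-tools` value such as `Read, Grep, Bash(git:*)` could be parsed and checked. It is not Claude Code's implementation: the `ToolPattern` type, both function names, and the reading of `Bash(git:*)` as "Bash commands whose first word is `git`" are assumptions made for illustration.

```rust
/// Hypothetical model of one `allowed-tools` entry.
#[derive(Debug, PartialEq)]
enum ToolPattern {
    /// A bare tool name such as `Read`.
    Any(String),
    /// A tool restricted to a command prefix, such as `Bash(git:*)`.
    Prefixed { tool: String, prefix: String },
}

/// Parse a comma-separated allowlist like `Read, Grep, Bash(git:*)`.
fn parse_allowlist(raw: &str) -> Vec<ToolPattern> {
    raw.split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .map(|entry| match entry.split_once('(') {
            Some((tool, rest)) => {
                // `rest` looks like `git:*)`; strip the closing paren and wildcard.
                let inner = rest.trim_end_matches(')');
                let prefix = inner.trim_end_matches(":*");
                ToolPattern::Prefixed {
                    tool: tool.to_string(),
                    prefix: prefix.to_string(),
                }
            }
            None => ToolPattern::Any(entry.to_string()),
        })
        .collect()
}

/// Return true if `tool` (with optional command text) matches any pattern.
/// Assumption: a prefixed pattern matches on the first word of the command.
fn is_allowed(patterns: &[ToolPattern], tool: &str, command: Option<&str>) -> bool {
    patterns.iter().any(|pattern| match pattern {
        ToolPattern::Any(name) => name.as_str() == tool,
        ToolPattern::Prefixed { tool: name, prefix } => {
            name.as_str() == tool
                && command
                    .and_then(|c| c.split_whitespace().next())
                    .map_or(false, |first| first == prefix.as_str())
        }
    })
}
```

Under these assumptions, a call to `Bash` with the command `git status` passes the check, while `Bash` with `rm -rf /` and any call to `Write` are rejected.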

Codex: Registry-Based Architecture

Codex uses a trait-based, compiled approach built on:

Rust structs implementing ToolHandler: Each tool is a handler registered by name
JSON Schema definitions: Tool parameters are defined as typed schemas
Orchestrator pattern: A central orchestrator manages approval, sandboxing, and retry logic
Parallel execution guards: Explicit read/write locks control concurrent tool access

The philosophy here is safety through types—compile-time guarantees ensure tools are correctly wired and parameterized.

#[async_trait]
impl ToolHandler for ShellHandler {
    fn kind(&self) -> ToolKind {
        ToolKind::Function
    }

    async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
        // Parse and execute command...
    }
}

Core Components Comparison

Tool Definition

Claude Code defines tools through multiple mechanisms:

Built-in tools: Core capabilities like Read, Write, Edit, Bash, Grep
Slash commands: Markdown files in .claude/commands/ or plugin commands/
Agents: Autonomous sub-agents defined in plugin agents/ directories
MCP tools: External tools via MCP server connections

Codex defines tools through:

ToolSpec enum: A tagged union over Function, LocalShell, WebSearch, and Freeform tools
ToolRegistryBuilder: Builder pattern for registering handlers and specs
JSON Schema: Each tool has a typed parameter schema
MCP tools: Converted to OpenAI function format at runtime

Tool Schema Example

Codex tool schema definition:

fn create_shell_command_tool() -> ToolSpec {
    let mut properties = BTreeMap::new();
    properties.insert(
        "command".to_string(),
        JsonSchema::String {
            description: Some("The shell script to execute".to_string()),
        },
    );
    properties.insert(
        "workdir".to_string(),
        JsonSchema::String {
            description: Some("The working directory".to_string()),
        },
    );

    ToolSpec::Function(ResponsesApiTool {
        name: "shell_command".to_string(),
        description: "Runs a shell command and returns its output.".to_string(),
        strict: false,
        parameters: JsonSchema::Object {
            properties,
            required: Some(vec!["command".to_string()]),
            additional_properties: Some(false.into()),
        },
    })
}
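
The `required` and `additional_properties` fields in the schema above imply two checks on incoming arguments. The sketch below shows those checks in isolation. `ObjectSchema`, `SchemaError`, and `validate_args` are hypothetical stand-ins, not Codex types, and arguments are simplified to string key/value pairs.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for an object schema.
struct ObjectSchema {
    properties: BTreeSet<String>,
    required: Vec<String>,
    additional_properties: bool,
}

#[derive(Debug, PartialEq)]
enum SchemaError {
    MissingRequired(String),
    UnknownProperty(String),
}

/// Check that every required key is present and, when additional
/// properties are disallowed, that no undeclared key is present.
fn validate_args(
    schema: &ObjectSchema,
    args: &BTreeMap<String, String>,
) -> Result<(), SchemaError> {
    for name in &schema.required {
        if !args.contains_key(name) {
            return Err(SchemaError::MissingRequired(name.clone()));
        }
    }
    if !schema.additional_properties {
        for key in args.keys() {
            if !schema.properties.contains(key) {
                return Err(SchemaError::UnknownProperty(key.clone()));
            }
        }
    }
    Ok(())
}

/// Mirrors the shape of the `shell_command` schema: `command` is required,
/// `workdir` is optional, and nothing else is accepted.
fn shell_command_schema() -> ObjectSchema {
    ObjectSchema {
        properties: ["command", "workdir"].iter().map(|s| s.to_string()).collect(),
        required: vec!["command".to_string()],
        additional_properties: false,
    }
}
```

With this schema, arguments containing only `workdir` fail with `MissingRequired("command")`, and arguments carrying an undeclared key such as `timeout` fail with `UnknownProperty("timeout")`.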

Claude Code tool declaration (in a command file's YAML frontmatter):

---
allowed-tools: [
  "Read",
  "Write",
  "Bash(git:*)",
  "mcp__plugin_asana_asana__asana_create_task"
]
---

Tool Invocation Flow

Claude Code:

User Input → Claude Model → Tool Call Decision
    ↓
PreToolUse Hooks (validate/modify/deny)
    ↓
Tool Execution
    ↓
PostToolUse Hooks (react/log)
    ↓
Response to Model

Codex:

User Input → Model API → ResponseItem (tool call)
    ↓
ToolRouter.build_tool_call() (parse payload)
    ↓
ToolOrchestrator.run() (approval + sandbox selection)
    ↓
ToolRuntime.run() (execute with sandbox)
    ↓
ToolOutput → ResponseInputItem

Key Differentiators

1. Extensibility Model

Claude Code favors runtime extensibility:

Plugins are discovered at session start
Commands/agents are Markdown files evaluated dynamically
Hooks can intercept and modify tool behavior
No recompilation needed to add capabilities

Codex favors compile-time safety:

Tools registered via typed builder pattern
Handlers implement async traits with explicit error types
Tool schemas are built from typed Rust values, so malformed definitions fail at compile time
New tools require code changes and recompilation

2. Safety Architecture

Claude Code uses hooks for safety:

PreToolUse: Validate/deny before execution
PostToolUse: React to results
Stop: Validate task completion
Prompt-based hooks for LLM-powered decisions
Command hooks for deterministic checks

Codex uses sandbox + approval policies:

macOS Seatbelt, Linux Landlock+seccomp, Windows AppContainer
Approval modes: Never, OnFailure, OnRequest, UnlessTrusted
Sandbox modes: read-only, workspace-write, danger-full-access
Automatic sandbox escalation with re-approval
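
The approval modes listed above can be read as a small decision function. The sketch below is one plausible interpretation, not Codex's actual logic: the meaning assigned to each mode, the `CallContext` struct, and `needs_user_approval` are all assumptions inferred from the mode names.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ApprovalPolicy {
    Never,
    OnFailure,
    OnRequest,
    UnlessTrusted,
}

/// Hypothetical facts about a pending tool call.
#[derive(Clone, Copy, Debug, Default)]
struct CallContext {
    command_is_trusted: bool,
    model_requested_escalation: bool,
    sandboxed_attempt_failed: bool,
}

/// Decide whether to prompt the user before (re)running a command.
/// Assumed semantics, inferred from the mode names:
/// - Never: never prompt.
/// - OnFailure: prompt only after a sandboxed attempt has failed.
/// - OnRequest: prompt only when the model asks for escalation.
/// - UnlessTrusted: prompt for anything not known to be safe.
fn needs_user_approval(policy: ApprovalPolicy, ctx: CallContext) -> bool {
    match policy {
        ApprovalPolicy::Never => false,
        ApprovalPolicy::OnFailure => ctx.sandboxed_attempt_failed,
        ApprovalPolicy::OnRequest => ctx.model_requested_escalation,
        ApprovalPolicy::UnlessTrusted => !ctx.command_is_trusted,
    }
}
```

Keeping the decision a pure function of policy and context makes each mode easy to test on its own, independent of how the sandbox is implemented.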

3. Parallel Execution

Claude Code handles parallelism implicitly:

The Claude model decides which tools to call in parallel
Hooks run in parallel automatically
No explicit synchronization in tool execution

Codex handles parallelism explicitly:

ToolCallRuntime guards tool execution with a read/write lock
Tools marked supports_parallel_tool_calls acquire a shared read lock
Non-parallel tools acquire an exclusive write lock
This prevents race conditions in file operations

let _guard = if supports_parallel {
    // Shared lock: many parallel-safe tool calls may hold it at once.
    Either::Left(lock.read().await)
} else {
    // Exclusive lock: waits until no other tool call holds the lock.
    Either::Right(lock.write().await)
};
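
The same gating idea can be tried without an async runtime. The sketch below swaps in the standard library's blocking `std::sync::RwLock` for the async lock in the excerpt above. `run_gated` is a hypothetical helper, not a Codex function.

```rust
use std::sync::RwLock;

/// Run `work` under a shared gate. Parallel-safe tools take the read lock,
/// so several may run at once; all other tools take the write lock and run
/// alone. The guard is held until `work` returns.
fn run_gated<T>(gate: &RwLock<()>, supports_parallel: bool, work: impl FnOnce() -> T) -> T {
    if supports_parallel {
        let _guard = gate.read().expect("gate lock poisoned");
        work()
    } else {
        let _guard = gate.write().expect("gate lock poisoned");
        work()
    }
}
```

While a parallel-safe call holds the gate, `gate.try_read()` from elsewhere still succeeds but `gate.try_write()` does not. While an exclusive call holds it, both fail. That is the property that keeps a write-style tool from overlapping with any other tool call.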

4. MCP Integration

Claude Code integrates MCP via plugin configuration:

Servers defined in plugin .mcp.json files
Tools namespaced as mcp__plugin_<name>_<server>__<tool>
Authentication configured in plugin settings
Tools discoverable via /mcp command

Codex integrates MCP via native protocol support:

MCP tools converted to OpenAI function format
Schema sanitization handles non-standard JSON schemas
Tool invocations reference the MCP server by name directly
Resources accessible via read_mcp_resource tool
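
Namespaced tool names such as `mcp__plugin_asana_asana__asana_create_task` can be split back into a scope and a tool name. The helper below is a hypothetical sketch. It assumes the general shape `mcp__<scope>__<tool>` and that the scope itself never contains a double underscore.

```rust
/// Split a namespaced MCP tool name into `(scope, tool)`.
/// Assumes the shape `mcp__<scope>__<tool>`; returns `None` otherwise.
fn split_mcp_tool_name(full: &str) -> Option<(&str, &str)> {
    let rest = full.strip_prefix("mcp__")?;
    // Split at the first remaining double underscore.
    let (scope, tool) = rest.split_once("__")?;
    if scope.is_empty() || tool.is_empty() {
        return None;
    }
    Some((scope, tool))
}
```

For the example name, this yields the scope `plugin_asana_asana` and the tool `asana_create_task`. A built-in name such as `Read` yields `None`.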

Design Patterns for Building Similar Systems

Pattern 1: Tool Registry

Both systems use a registry pattern to map tool names to handlers:

┌─────────────────┐     ┌──────────────────┐
│  Tool Request   │────▶│   Tool Router    │
│  (name, args)   │     │  (name → handler)│
└─────────────────┘     └────────┬─────────┘
                                 │
                        ┌────────▼─────────┐
                        │  Tool Handler    │
                        │  (execute logic) │
                        └──────────────────┘
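
A minimal sketch of the registry pattern in the diagram above. The trait is a synchronous, simplified stand-in for the async `ToolHandler` shown earlier. `EchoHandler`, `ToolRegistry`, and the string-based argument and error types are hypothetical.

```rust
use std::collections::HashMap;

/// Simplified handler interface: string arguments in, string result out.
trait ToolHandler {
    fn handle(&self, args: &str) -> Result<String, String>;
}

/// Toy handler used only to exercise the registry.
struct EchoHandler;

impl ToolHandler for EchoHandler {
    fn handle(&self, args: &str) -> Result<String, String> {
        Ok(format!("echo: {}", args))
    }
}

/// Maps tool names to boxed handlers.
#[derive(Default)]
struct ToolRegistry {
    handlers: HashMap<String, Box<dyn ToolHandler>>,
}

impl ToolRegistry {
    /// Register a handler under `name`, replacing any previous one.
    fn register(&mut self, name: &str, handler: Box<dyn ToolHandler>) {
        self.handlers.insert(name.to_string(), handler);
    }

    /// Route a request to the handler registered for `name`.
    fn dispatch(&self, name: &str, args: &str) -> Result<String, String> {
        match self.handlers.get(name) {
            Some(handler) => handler.handle(args),
            None => Err(format!("unknown tool: {}", name)),
        }
    }
}
```

Returning an error for an unknown name, rather than panicking, lets the caller report the failure to the model as an ordinary tool result.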

Pattern 2: Orchestration Layer

Both systems separate execution from policy:

┌──────────────┐
│ Orchestrator │
├──────────────┤
│ • Approval   │◀── Policy decisions
│ • Sandbox    │◀── Security enforcement
│ • Retry      │◀── Error recovery
└──────┬───────┘
       │
┌──────▼───────┐
│   Runtime    │
├──────────────┤
│ • Execute    │◀── Actual tool logic
│ • Output     │◀── Result formatting
└──────────────┘
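
The split between policy and execution can be sketched with two closures: a runtime that only knows how to run under a given sandbox, and an approval callback that only knows how to ask. Everything here is hypothetical, including `Sandbox`, `RunError`, and the single-retry escalation rule. It illustrates the layering, not either system's real orchestrator.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Sandbox {
    Restricted,
    Unrestricted,
}

#[derive(Debug, PartialEq)]
enum RunError {
    /// The sandbox blocked the operation.
    SandboxDenied,
    /// The user declined to escalate.
    Rejected,
    /// The tool itself failed.
    Failed(String),
}

/// Policy layer: try sandboxed first. If the sandbox blocks the call,
/// ask for approval and retry once without the sandbox.
fn orchestrate<R, A>(mut runtime: R, mut approve: A) -> Result<String, RunError>
where
    R: FnMut(Sandbox) -> Result<String, RunError>,
    A: FnMut() -> bool,
{
    match runtime(Sandbox::Restricted) {
        Err(RunError::SandboxDenied) => {
            if approve() {
                runtime(Sandbox::Unrestricted)
            } else {
                Err(RunError::Rejected)
            }
        }
        // Success and ordinary tool failures pass through unchanged.
        other => other,
    }
}
```

Because the runtime never decides whether escalation is allowed, the approval rule can change without touching any tool's execution logic.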

Pattern 3: Event-Driven Hooks

Claude Code's hook system provides an extensible interception pattern:

┌────────────────┐     ┌────────────────┐
│  Tool Event    │────▶│  Hook Matcher  │
│  (PreToolUse)  │     │  (tool_name)   │
└────────────────┘     └───────┬────────┘
                               │
                      ┌────────▼────────┐
                      │  Hook Handlers  │
                      │  (parallel exec)│
                      └────────┬────────┘
                               │
                      ┌────────▼────────┐
                      │  Hook Output    │
                      │  (decision/msg) │
                      └─────────────────┘
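
A minimal sketch of the matcher-then-handlers flow in the diagram above. For simplicity it runs matching hooks one after another rather than in parallel. The `Hook` struct, the `"*"` wildcard, and `block_rm_rf` are hypothetical and do not reflect Claude Code's hook configuration format.

```rust
#[derive(Debug, PartialEq)]
enum HookDecision {
    Allow,
    Deny(String),
}

/// A hook paired with the tool name it applies to.
struct Hook {
    /// Tool name to match; "*" matches every tool (an assumption of this sketch).
    matcher: String,
    /// Inspects the tool input and returns a decision.
    check: fn(&str) -> HookDecision,
}

/// Run every hook whose matcher fits `tool`; the first denial wins.
fn run_pre_tool_hooks(hooks: &[Hook], tool: &str, input: &str) -> HookDecision {
    for hook in hooks
        .iter()
        .filter(|h| h.matcher == "*" || h.matcher == tool)
    {
        if let HookDecision::Deny(reason) = (hook.check)(input) {
            return HookDecision::Deny(reason);
        }
    }
    HookDecision::Allow
}

/// Example deterministic check: refuse recursive force-deletes.
fn block_rm_rf(input: &str) -> HookDecision {
    if input.contains("rm -rf") {
        HookDecision::Deny("destructive command".to_string())
    } else {
        HookDecision::Allow
    }
}
```

A hook registered with the matcher `Bash` and the check `block_rm_rf` denies `rm -rf /tmp/x` but allows `ls`. It never runs for a `Read` call, because the matcher filters it out first.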

Recommendations for New Implementations

Choose Claude Code's Approach When:

You need runtime extensibility without recompilation
Users/teams will create custom workflows
Plugin ecosystem is a priority
LLM-powered validation logic is desired
TypeScript/JavaScript is your stack

Choose Codex's Approach When:

Compile-time safety is paramount
Performance is critical (Rust's zero-cost abstractions)
OS-level sandboxing is required
Fine-grained parallel execution control is needed
You're building a standalone binary distribution

Summary

Both systems solve the same fundamental problem—enabling LLMs to safely execute tools in user environments—but make different tradeoffs:

Concern	Claude Code	Codex
Safety	Hooks + permissions	Sandbox + approvals
Extensibility	Runtime (Markdown/plugins)	Compile-time (Rust)
Parallelism	Implicit	Explicit locks
Tool Definition	Declarative (Markdown + YAML frontmatter)	Typed schemas
MCP	Plugin-scoped	Native conversion

Understanding these tradeoffs helps you choose the right architecture for your own LLM agent implementations.