Allowing LLMs to execute tools—especially shell commands—requires robust safety mechanisms. This guide covers how Claude Code and Codex approach safety, and how to build similar protections.
Both systems use multiple layers of protection:
┌─────────────────────────────────────────────────────────┐
│ USER LAYER │
│ • Approval prompts for dangerous operations │
│ • Permission modes (read-only, workspace, full) │
└─────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ • Hooks for tool validation (Claude Code) │
│ • Approval policies (Codex) │
│ • Command safety classification │
└─────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────┐
│ SYSTEM LAYER │
│ • OS-level sandboxing (Seatbelt, Landlock, seccomp) │
│ • Filesystem restrictions │
│ • Network isolation │
└─────────────────────────────────────────────────────────┘
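Conceptually, the layers compose as a short-circuiting pipeline: any layer can veto, and a command runs only if every layer allows it. A minimal TypeScript sketch (the three checks here are hypothetical stand-ins for the real layers):

```typescript
type LayerVerdict = { allow: boolean; reason?: string };
type LayerCheck = (cmd: string) => LayerVerdict;

// Each layer gets a veto; the first denial short-circuits the pipeline.
function runLayers(cmd: string, layers: LayerCheck[]): LayerVerdict {
  for (const layer of layers) {
    const v = layer(cmd);
    if (!v.allow) return v;
  }
  return { allow: true };
}

// Stand-ins for the user, application, and system layers.
const userLayer: LayerCheck = () => ({ allow: true }); // user approved
const appLayer: LayerCheck = (cmd) =>
  cmd.includes("rm -rf /")
    ? { allow: false, reason: "dangerous command" }
    : { allow: true };
const systemLayer: LayerCheck = () => ({ allow: true }); // sandbox permits

const verdict = runLayers("rm -rf /", [userLayer, appLayer, systemLayer]);
// verdict.allow === false
```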
Claude Code uses hooks at key lifecycle points:
| Hook Event | Purpose | Use Case |
|---|---|---|
| PreToolUse | Validate before execution | Block dangerous commands |
| PostToolUse | React to results | Log operations, alert on issues |
| Stop | Validate completion | Ensure tests pass before stopping |
| UserPromptSubmit | Validate user input | Add security reminders |
Block dangerous file operations:
{
"hooks": {
"PreToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/validate-write.sh",
"timeout": 10
}
]
}
]
}
}
#!/bin/bash
# validate-write.sh
set -euo pipefail
input=$(cat)
file_path=$(echo "$input" | jq -r '.tool_input.file_path // empty')
# Deny path traversal (JSON decision goes to stdout with exit 0)
if [[ "$file_path" == *".."* ]]; then
echo '{"hookSpecificOutput": {"permissionDecision": "deny"}, "systemMessage": "Path traversal not allowed"}'
exit 0
fi
# Deny sensitive files
SENSITIVE_PATTERNS=(".env" ".ssh" ".aws" "credentials" "secrets")
for pattern in "${SENSITIVE_PATTERNS[@]}"; do
if [[ "$file_path" == *"$pattern"* ]]; then
# Double quotes so $pattern is interpolated into the message
echo "{\"hookSpecificOutput\": {\"permissionDecision\": \"deny\"}, \"systemMessage\": \"Cannot modify sensitive file: $pattern\"}"
exit 0
fi
done
# Allow operation
echo '{}'
exit 0
Use LLM reasoning for complex validation:
{
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "prompt",
"prompt": "Evaluate if this bash command is safe to execute. Consider: system damage, data loss, network access, privilege escalation. Command: $TOOL_INPUT. Return 'approve' or 'deny' with reason.",
"timeout": 30
}
]
}
]
}
Claude Code's allowed-tools in command frontmatter:
---
# Restrictive: only specific tools
allowed-tools: [Read, Grep]
---
# Bash restricted to git subcommands
allowed-tools: [Bash(git:*)]

# All tools (dangerous; quote the glob so it parses as YAML)
allowed-tools: ["*"]
Codex defines when user approval is required:
pub enum AskForApproval {
Never, // Never ask, auto-approve all
OnFailure, // Ask only when sandbox denies
OnRequest, // Ask for operations outside sandbox
UnlessTrusted, // Always ask unless whitelisted
}
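The four policies differ only in *when* they prompt. A simplified TypeScript decision table (the context fields are illustrative, not Codex's actual API):

```typescript
// The four approval policies, as string tags for illustration.
type AskForApproval = "never" | "on-failure" | "on-request" | "unless-trusted";

interface PromptContext {
  trusted: boolean;        // command is on a user-trusted list
  escapesSandbox: boolean; // operation needs access outside the sandbox
  sandboxDenied: boolean;  // a sandboxed first attempt was already denied
}

// When does each policy prompt the user?
function needsPrompt(policy: AskForApproval, ctx: PromptContext): boolean {
  if (policy === "never") return false;                  // auto-approve everything
  if (policy === "on-failure") return ctx.sandboxDenied; // only after a denial
  if (policy === "on-request") return ctx.escapesSandbox; // only to leave the sandbox
  return !ctx.trusted; // "unless-trusted": always ask unless whitelisted
}
```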
OS-level isolation for command execution:
pub enum SandboxPolicy {
ReadOnly, // No writes, no network
WorkspaceWrite, // Write in workspace only
DangerFullAccess, // No restrictions
}
macOS (Seatbelt):
// Uses Apple's sandbox-exec with custom profiles
pub fn create_seatbelt_profile(policy: &SandboxPolicy, cwd: &Path) -> String {
match policy {
SandboxPolicy::ReadOnly => {
format!(r#"
(version 1)
(deny default)
(allow file-read*)
(deny network*)
"#)
}
SandboxPolicy::WorkspaceWrite => {
format!(r#"
(version 1)
(deny default)
(allow file-read*)
(allow file-write* (subpath "{}"))
(allow file-write* (subpath "/tmp"))
(deny network*)
"#, cwd.display())
}
SandboxPolicy::DangerFullAccess => {
"(version 1)\n(allow default)".to_string()
}
}
}
Linux (Landlock + seccomp):
pub fn apply_landlock_policy(policy: &SandboxPolicy, cwd: &Path) -> Result<(), Error> {
let mut ruleset = Ruleset::new()?;
match policy {
SandboxPolicy::ReadOnly => {
// Read-only access to entire filesystem
ruleset.add_rule(PathBeneathRules::new(
"/",
AccessFs::ReadFile | AccessFs::ReadDir,
))?;
}
SandboxPolicy::WorkspaceWrite => {
// Read everywhere
ruleset.add_rule(PathBeneathRules::new(
"/",
AccessFs::ReadFile | AccessFs::ReadDir,
))?;
// Write only in workspace
ruleset.add_rule(PathBeneathRules::new(
cwd,
AccessFs::all(),
))?;
}
_ => {}
}
ruleset.apply()?;
Ok(())
}
Codex's ToolOrchestrator manages the safety flow:
pub struct ToolOrchestrator {
sandbox: SandboxManager,
}
impl ToolOrchestrator {
pub async fn run<Rq, Out, T>(
&mut self,
tool: &mut T,
req: &Rq,
tool_ctx: &ToolCtx<'_>,
turn_ctx: &TurnContext,
approval_policy: AskForApproval,
) -> Result<Out, ToolError>
where
T: ToolRuntime<Rq, Out>,
{
// 1. CHECK APPROVAL REQUIREMENT
let requirement = tool.exec_approval_requirement(req)
.unwrap_or_else(|| default_exec_approval_requirement(approval_policy, &turn_ctx.sandbox_policy));
match requirement {
ExecApprovalRequirement::Skip { .. } => {
// Auto-approved, continue
}
ExecApprovalRequirement::Forbidden { reason } => {
return Err(ToolError::Rejected(reason));
}
ExecApprovalRequirement::NeedsApproval { reason, .. } => {
// Request user approval
let decision = tool.start_approval_async(req, tool_ctx).await;
match decision {
ReviewDecision::Denied | ReviewDecision::Abort => {
return Err(ToolError::Rejected("rejected by user".into()));
}
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {
// Continue
}
}
}
}
// 2. SELECT SANDBOX MODE
let initial_sandbox = self.sandbox.select_initial(&turn_ctx.sandbox_policy);
// 3. FIRST ATTEMPT (sandboxed)
let attempt = SandboxAttempt {
sandbox: initial_sandbox,
policy: &turn_ctx.sandbox_policy,
manager: &self.sandbox,
sandbox_cwd: &turn_ctx.cwd,
};
match tool.run(req, &attempt, tool_ctx).await {
Ok(out) => Ok(out),
// 4. HANDLE SANDBOX DENIAL
Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Denied { output }))) => {
// Ask for escalation approval
if !tool.wants_no_sandbox_approval(approval_policy) {
return Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Denied { output })));
}
// Request escalation
let decision = tool.start_approval_async(req, tool_ctx).await;
if !matches!(decision, ReviewDecision::Approved | ReviewDecision::ApprovedForSession) {
return Err(ToolError::Rejected("escalation rejected".into()));
}
// 5. RETRY WITHOUT SANDBOX
let escalated_attempt = SandboxAttempt {
sandbox: SandboxType::None,
..attempt
};
tool.run(req, &escalated_attempt, tool_ctx).await
}
other => other,
}
}
}
Codex classifies commands as safe/dangerous:
pub fn is_known_safe_command(command: &[String]) -> bool {
let safe_commands = [
// Read-only commands
"ls", "cat", "head", "tail", "less", "more",
"grep", "find", "which", "pwd", "env",
// Git read operations
"git status", "git log", "git diff", "git show",
// Build info
"cargo --version", "npm --version", "node --version",
];
let cmd_str = command.join(" ");
safe_commands.iter().any(|safe| cmd_str.starts_with(safe))
}
pub fn is_dangerous_command(command: &[String]) -> Option<&'static str> {
let dangerous_patterns = [
("rm -rf /", "Recursive delete of root"),
("chmod 777", "Overly permissive permissions"),
("curl | bash", "Remote code execution"),
("sudo", "Privilege escalation"),
("> /dev/", "Writing to device files"),
];
let cmd_str = command.join(" ");
for (pattern, reason) in dangerous_patterns {
if cmd_str.contains(pattern) {
return Some(reason);
}
}
None
}
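Note that prefix and substring matching alone are easy to bypass with shell operators: `git status && rm -rf /` starts with a safe prefix. A hedged sketch of a stricter check that splits on common operators and requires every sub-command to be safe (a production version should use a real shell parser that understands quoting and subshells, which this regex does not):

```typescript
const SAFE_PREFIXES = ["ls", "cat", "git status", "git log", "git diff"];

// Split a command line on common shell operators so a safe prefix can't
// smuggle in a second, dangerous command.
function splitOnOperators(cmd: string): string[] {
  return cmd
    .split(/&&|\|\||;|\|/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Safe only if every sub-command is exactly a safe prefix, or a safe
// prefix followed by arguments.
function isKnownSafe(cmd: string): boolean {
  const parts = splitOnOperators(cmd);
  return (
    parts.length > 0 &&
    parts.every((p) =>
      SAFE_PREFIXES.some((safe) => p === safe || p.startsWith(safe + " ")),
    )
  );
}
```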
pub enum ApprovalRequirement {
/// No approval needed
Skip,
/// User must approve
NeedsApproval { reason: String },
/// Operation forbidden entirely
Forbidden { reason: String },
}
pub fn determine_approval_requirement(
tool: &str,
args: &ToolArgs,
policy: &SecurityPolicy,
) -> ApprovalRequirement {
// Check if tool is in allowlist
if policy.allowed_tools.contains(tool) {
return ApprovalRequirement::Skip;
}
// Check for dangerous operations
if let Some(reason) = is_dangerous_operation(tool, args) {
if policy.strict_mode {
return ApprovalRequirement::Forbidden { reason };
}
return ApprovalRequirement::NeedsApproval { reason };
}
// Default based on policy
match policy.default_approval {
DefaultApproval::Allow => ApprovalRequirement::Skip,
DefaultApproval::Ask => ApprovalRequirement::NeedsApproval {
reason: "Operation requires approval".into(),
},
DefaultApproval::Deny => ApprovalRequirement::Forbidden {
reason: "Operation not permitted".into(),
},
}
}
Avoid asking for the same approval repeatedly:
pub struct ApprovalCache {
session_approvals: HashSet<ApprovalKey>,
}
impl ApprovalCache {
pub fn is_approved(&self, key: &ApprovalKey) -> bool {
self.session_approvals.contains(key)
}
pub fn approve_for_session(&mut self, key: ApprovalKey) {
self.session_approvals.insert(key);
}
}
// Usage
pub async fn get_approval<F, Fut>(
cache: &mut ApprovalCache,
key: ApprovalKey,
prompt_user: F,
) -> bool
where
F: FnOnce() -> Fut,
Fut: Future<Output = bool>,
{
if cache.is_approved(&key) {
return true;
}
let approved = prompt_user().await;
if approved {
cache.approve_for_session(key);
}
approved
}
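For the cache to be effective, keys should be normalized so trivially different spellings of the same command share one approval. A hypothetical TypeScript sketch (collapsing whitespace is illustrative; a real key might hash the argv vector plus the working directory):

```typescript
// Session-scoped approval cache keyed by a normalized command string.
class ApprovalCache {
  private approved = new Set<string>();

  // Normalize: trim and collapse runs of whitespace, scoped to a cwd.
  static key(cmd: string, cwd: string): string {
    return `${cwd}::${cmd.trim().replace(/\s+/g, " ")}`;
  }

  isApproved(key: string): boolean {
    return this.approved.has(key);
  }

  approveForSession(key: string): void {
    this.approved.add(key);
  }
}

const cache = new ApprovalCache();
cache.approveForSession(ApprovalCache.key("git  push   origin main", "/repo"));
// A later invocation with different spacing hits the same cache entry.
const hit = cache.isApproved(ApprovalCache.key("git push origin main", "/repo"));
```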
pub trait Sandbox {
fn prepare(&self, cwd: &Path) -> Result<SandboxEnv, SandboxError>;
fn execute(&self, cmd: &[String], env: SandboxEnv) -> Result<Output, SandboxError>;
fn is_supported() -> bool;
}
pub struct SandboxManager {
preferred: Box<dyn Sandbox>,
fallbacks: Vec<Box<dyn Sandbox>>,
}
impl SandboxManager {
pub fn new() -> Self {
#[cfg(target_os = "macos")]
let preferred: Box<dyn Sandbox> = Box::new(SeatbeltSandbox::new());
#[cfg(target_os = "linux")]
let preferred: Box<dyn Sandbox> = Box::new(LandlockSandbox::new());
#[cfg(target_os = "windows")]
let preferred: Box<dyn Sandbox> = Box::new(AppContainerSandbox::new());
Self {
preferred,
// NOTE: silently falling back to no sandbox weakens the policy; in
// practice this should fail closed or require explicit approval.
fallbacks: vec![Box::new(NoSandbox::new())],
}
}
pub fn execute(&self, cmd: &[String], cwd: &Path, policy: &SandboxPolicy) -> Result<Output, Error> {
if matches!(policy, SandboxPolicy::DangerFullAccess) {
return self.execute_unsandboxed(cmd, cwd);
}
// Try preferred sandbox
match self.preferred.prepare(cwd) {
Ok(env) => return self.preferred.execute(cmd, env),
Err(e) => tracing::warn!("Primary sandbox unavailable: {e}"),
}
// Try fallbacks
for fallback in &self.fallbacks {
if let Ok(env) = fallback.prepare(cwd) {
return fallback.execute(cmd, env);
}
}
Err(Error::NoSandboxAvailable)
}
}
interface Hook {
matcher: string | RegExp;
handler: (context: HookContext) => Promise<HookResult>;
}
interface HookResult {
decision: 'allow' | 'deny' | 'ask';
message?: string;
modifiedInput?: Record<string, unknown>;
}
class HookRunner {
private hooks: Map<HookEvent, Hook[]> = new Map();
register(event: HookEvent, hook: Hook): void {
const hooks = this.hooks.get(event) || [];
hooks.push(hook);
this.hooks.set(event, hooks);
}
async run(event: HookEvent, context: HookContext): Promise<HookResult[]> {
const hooks = this.hooks.get(event) || [];
const matchingHooks = hooks.filter(h => this.matches(h.matcher, context.toolName));
// Run all matching hooks in parallel
const results = await Promise.all(
matchingHooks.map(h => h.handler(context))
);
return results;
}
private matches(matcher: string | RegExp, toolName: string): boolean {
if (matcher === '*') return true;
if (typeof matcher === 'string') {
return matcher.split('|').includes(toolName);
}
return matcher.test(toolName);
}
}
// Integration with tool execution
async function executeToolWithHooks(
hookRunner: HookRunner,
tool: Tool,
args: Record<string, unknown>,
): Promise<ToolResult> {
// Run PreToolUse hooks
const preResults = await hookRunner.run('PreToolUse', {
toolName: tool.name,
toolInput: args,
});
// Check for denials
const denied = preResults.find(r => r.decision === 'deny');
if (denied) {
return { error: denied.message || 'Denied by hook' };
}
// Apply any input modifications
let modifiedArgs = args;
for (const result of preResults) {
if (result.modifiedInput) {
modifiedArgs = { ...modifiedArgs, ...result.modifiedInput };
}
}
// Execute tool
const result = await tool.execute(modifiedArgs);
// Run PostToolUse hooks
await hookRunner.run('PostToolUse', {
toolName: tool.name,
toolInput: modifiedArgs,
toolResult: result,
});
return result;
}
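A condensed, self-contained version of the same pipeline, with a mock tool and a single deny hook:

```typescript
type HookResult = { decision: "allow" | "deny"; message?: string };
type Hook = (input: Record<string, unknown>) => HookResult;

// Run pre-hooks in order; stop on the first denial, otherwise execute.
function execWithHooks(
  preHooks: Hook[],
  tool: (args: Record<string, unknown>) => string,
  args: Record<string, unknown>,
): string {
  for (const hook of preHooks) {
    const r = hook(args);
    if (r.decision === "deny") return `error: ${r.message ?? "denied"}`;
  }
  return tool(args);
}

// Hypothetical hook: refuse writes touching .env files.
const denySensitive: Hook = (args) =>
  String(args.file_path).includes(".env")
    ? { decision: "deny", message: "sensitive file" }
    : { decision: "allow" };

// Mock Write tool.
const writeTool = (args: Record<string, unknown>) => `wrote ${args.file_path}`;

const blocked = execWithHooks([denySensitive], writeTool, { file_path: ".env" });
const ok = execWithHooks([denySensitive], writeTool, { file_path: "src/app.ts" });
```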
Start restricted, escalate on failure:
┌─────────────────┐
│ Try Sandboxed │
└────────┬────────┘
│ Denied?
┌────────▼────────┐
│ Request Approval │
└────────┬────────┘
│ Approved?
┌────────▼────────┐
│ Retry Unsandboxed│
└─────────────────┘
Maintain lists of known-safe operations:
# safe-commands.yaml
read_operations:
- ls
- cat
- head
- tail
- grep
- find
git_operations:
- git status
- git log
- git diff
- git branch
build_operations:
- npm install
- npm test
- cargo build
- cargo test
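Such a file might be consumed like this (the allowlist is inlined here rather than parsed from YAML; matching accepts an exact command or a safe prefix followed by a space, so `git logger` does not match `git log`):

```typescript
// Mirror of safe-commands.yaml; a real implementation would load and
// parse the YAML file instead of inlining it.
const safeCommands: Record<string, string[]> = {
  read_operations: ["ls", "cat", "head", "tail", "grep", "find"],
  git_operations: ["git status", "git log", "git diff", "git branch"],
  build_operations: ["npm install", "npm test", "cargo build", "cargo test"],
};

function isAllowlisted(cmd: string): boolean {
  const trimmed = cmd.trim();
  return Object.values(safeCommands).some((group) =>
    group.some((safe) => trimmed === safe || trimmed.startsWith(safe + " ")),
  );
}
```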
Use the LLM itself for safety decisions:
Evaluate this command for safety risks:
Command: {{command}}
Working directory: {{cwd}}
Consider:
1. Data loss potential (file deletion, overwrite)
2. System impact (permissions, services)
3. Network access (external requests)
4. Credential exposure
5. Privilege escalation
Respond with:
- SAFE: No significant risks
- REVIEW: Needs human approval with explanation
- BLOCK: Should not be executed with reason
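The model's free-text reply then needs conservative parsing: anything that does not clearly match the contract should fail closed to REVIEW. A sketch:

```typescript
type SafetyVerdict = "SAFE" | "REVIEW" | "BLOCK";

// Parse a model reply against the SAFE/REVIEW/BLOCK contract above.
// Fail closed: any unrecognized reply is treated as needing review.
function parseVerdict(reply: string): SafetyVerdict {
  const head = reply.trim().toUpperCase();
  if (head.startsWith("BLOCK")) return "BLOCK";
  if (head.startsWith("SAFE")) return "SAFE";
  return "REVIEW";
}
```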
# Conservative defaults
approval_policy = "on-request"
sandbox_mode = "workspace-write"
# Network stays disabled inside the workspace-write sandbox
[sandbox_workspace_write]
network_access = false
# Profiles for different use cases
[profiles.read_only]
approval_policy = "never"
sandbox_mode = "read-only"
[profiles.full_auto]
approval_policy = "on-request"
sandbox_mode = "workspace-write"
[profiles.dangerous]
approval_policy = "never"
sandbox_mode = "danger-full-access"
{
"description": "Security validation hooks",
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/validate-bash.sh",
"timeout": 5
}
]
},
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "prompt",
"prompt": "Check if writing to $TOOL_INPUT.file_path is safe. Consider sensitive files, system paths, credentials. Return approve/deny.",
"timeout": 10
}
]
}
]
}
}
Safety in LLM tool execution requires:
| Layer | Mechanism | Example |
|---|---|---|
| User | Approval prompts | "Allow write to /etc/hosts?" |
| Application | Hooks/policies | Block rm -rf, validate paths |
| System | OS sandboxing | Seatbelt, Landlock, seccomp |
The next guide covers parallel tool execution strategies.