22/03/2026

Claude Code Hooks: The Deterministic Security Layer Your AI Agent Needs
> APPSEC_ENGINEERING // CLAUDE_CODE // FIELD_REPORT

Claude Code Hooks: The Deterministic Security Layer Your AI Agent Needs

CLAUDE.md rules are suggestions. Hooks are enforced gates. exit 2 = blocked. No negotiation. If you're letting an AI agent write code without guardrails, here's how you fix that.

// March 2026 • 12 min read • security-first perspective

Why This Matters (Or: How Your AI Agent Became an Insider Threat)

Since the corporate suits decided to go all in with AI (and fire half of the IT population), the market has changed dramatically, let's cut through the noise. The suits in the boardroom are excited about AI agents. "Autonomous productivity!" they say. "Digital workforce!" they cheer. Meanwhile, those of us who actually hack things for a living are watching these agents get deployed with shell access, API keys, and service-level credentials — and zero security controls beyond a politely worded system prompt.

The numbers are brutal. According to a 2026 survey of 1,253 security professionals, 91% of organizations only discover what an AI agent did after it already executed the action. Only 9% can intervene before an agent completes a harmful action. The other 91%? 35% find it in logs after the fact. 32% have no visibility at all. Let that sink in: for every ten organizations running agentic AI, fewer than one can stop an agent from deleting a repository, modifying a customer record, or escalating a privilege before it happens.

And this isn't theoretical. 37% of organizations experienced AI agent-caused operational issues in the past twelve months. 8% were significant enough to cause outages or data corruption. Agents are already autonomously moving data to untrusted locations, deleting configs, and making decisions that no human reviewed.

NVIDIA's AI red team put it bluntly: LLM-generated code must be treated as untrusted output. Sanitization alone is not enough — attackers can craft prompts that evade filters, manipulate trusted library functions, and exploit model behaviors in ways that bypass traditional controls. An agent that generates and runs code on the fly creates a pathway where a crafted prompt escalates into remote code execution. That's not a bug. That's the architecture working as designed.

Krebs on Security ran a piece this month on autonomous AI assistants that proactively take actions without being prompted. The comments section was full of hackers (the good kind) asking the same question: "Who's watching the watchers?" Because your SIEM and EDR tools were built to detect anomalies in human behavior. An agent that runs code perfectly 10,000 times in sequence looks normal to these systems. But that agent might be executing an attacker's will.

OWASP saw this coming. They released a dedicated Top 10 for Agentic AI Applications — the #1 risk is Agent Goal Hijacking, where an attacker manipulates an agent's objectives through poisoned inputs. The agent can't tell the difference between legitimate instructions and malicious data. A single poisoned email, document, or web page can redirect your agent to exfiltrate data using its own legitimate access.

So here's the thing. You can write all the CLAUDE.md rules you want. You can put "never delete production data" in your system prompt. But those are requests, not guarantees. The model might ignore them. Prompt injection can override them. They're advisory — and advisory doesn't cut it when the agent has kubectl access to your prod cluster.

Hooks are the answer. They're the deterministic layer that sits between intent and execution. They don't ask the model nicely. They enforce. exit 2 = blocked, period. The model cannot bypass a hook. It's not running in the model's context — it's a plain shell script triggered by the system, outside the LLM entirely.

If you're an AppSec hacker who's been watching this AI agent gold rush with growing anxiety — this post is your field manual. We're going to cover what hooks are, how to wire them up, and the 5 production hooks that should be non-negotiable on every Claude Code deployment. The suits can keep their "digital workforce." We're going to make sure it can't burn the house down.

TL;DR

Claude Code hooks are user-defined scripts that fire at specific lifecycle events — before a tool runs, after it completes, when a session starts, or when Claude stops responding. They run outside the LLM as plain scripts, not prompts. exit 0 = allow. exit 2 = block. As of March 2026: 21 lifecycle events, 4 handler types (command, HTTP, prompt, agent), async execution, and JSON structured output. This post covers what they are, how to configure them, and 5 production hooks you should deploy today.

What Are Claude Code Hooks?

Hooks are shell commands, HTTP endpoints, or LLM prompts that execute automatically at specific points in Claude Code's lifecycle. They run outside the LLM — plain scripts triggered by Claude's actions, not prompts interpreted by the model. Think of them as tripwires you set around your agent's execution path.

This distinction is what makes them powerful. Function calling extends what an AI can do. Hooks constrain what an AI does. The AI doesn't request a hook — the hook intercepts the AI. The model has zero say in whether the hook fires. It's not a polite suggestion in a system prompt that the model can "forget" when it's 50 messages deep. It's a shell script with exit 2. Deterministic. Unavoidable.

Claude Code execution
Event fires
Matcher evaluates
Hook executes

Your hook receives JSON context via stdin — session ID, working directory, tool name, tool input. It inspects, decides, and optionally returns a decision. exit 0 = allow. exit 2 = block. exit 1 = non-blocking warning (action still proceeds).

// HACKERS: READ THIS FIRST

Exit code 1 is NOT a security control. It only logs a warning — the action still goes through. Every security hook must use exit 2, or you've built a monitoring tool, not a gate. This is the rookie mistake I see everywhere. If your hook exits 1, the agent smiled at your warning and kept going.


The 21 Lifecycle Events

Here are the critical events. The ones you'll use 90% of the time are PreToolUse, PostToolUse, and Stop.

EventWhen It FiresBlocks?Use Case
SessionStartSession begins, resumes, clears, or compactsNOEnvironment setup, context injection
PreToolUseBefore any tool executionYES — deny/allow/escalateSecurity gates, input validation, command blocking
PostToolUseAfter tool completes successfullyYES — blockAuto-formatting, test runners, security scans
PostToolUseFailureAfter a tool failsYES — blockError handling, retry logic
PermissionRequestPermission dialog about to showYES — allow/denyAuto-approve safe ops, deny risky ones
UserPromptSubmitUser submits a promptYES — blockPrompt validation, injection detection
StopClaude finishes respondingYES — blockOutput validation, prevent premature stops
SubagentStopSubagent completesYES — blockSubagent task verification
SubagentStartSubagent startsNODB connection setup, agent-specific env
NotificationClaude sends a notificationNODesktop/Slack alerts, logging
PreCompactBefore compactionNOTranscript backup, context preservation
ConfigChangeConfig file changes during sessionYES — blockAudit logging, block unauthorized changes
SetupVia --init or --maintenanceNORepository setup and maintenance
// SUBAGENT RECURSION

Hooks fire for subagent actions too. If Claude spawns a subagent, your PreToolUse and PostToolUse hooks execute for every tool the subagent uses. Without recursive hook enforcement, a subagent could bypass your safety gates.


Configuration: Where Hooks Live

FileScopeCommit?
~/.claude/settings.jsonUser-wide (all projects)NO
.claude/settings.jsonProject-level (whole team)YES — COMMIT THIS
.claude/settings.local.jsonLocal overridesNO (gitignored)
// BEST PRACTICE

Put non-negotiable security gates in .claude/settings.json (project-level, committed to repo). Every team member gets the same guardrails automatically. Personal preferences go in .claude/settings.local.json.


The 4 Handler Types

1. Command Hooks — type: "command"

Shell scripts that receive JSON via stdin. The workhorse for most use cases.

{ "type": "command", "command": ".claude/hooks/block-rm.sh" }

2. HTTP Hooks — type: "http"

POST requests to an endpoint. Slack notifications, audit logging, webhook CI/CD triggers.

{ "type": "http", "url": "https://your-webhook.example.com/hook" }

3. Prompt Hooks — type: "prompt"

Send a prompt to a Claude model for single-turn semantic evaluation. Perfect for decisions regex can't handle — "does this edit touch authentication logic?"

{ "type": "prompt", "prompt": "Does this change modify auth logic? Input: $ARGUMENTS" }

4. Agent Hooks — type: "agent"

Spawn subagents with access to Read, Grep, Glob for deep codebase verification. The most powerful handler for complex multi-file security checks.


5 Production Hooks You Should Deploy Today

HOOK 01

Block Destructive Shell Commands

Event: PreToolUse | Matcher: Bash

Prevent rm -rf, DROP TABLE, chmod 777, and other commands that would make any hacker wince. Your AI agent doesn't need to nuke filesystems or wipe databases. If it tries, something has gone very wrong and you want that action dead before it executes.

// .claude/hooks/block-dangerous.sh

#!/bin/bash
# Read JSON from stdin
INPUT=$(cat)
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty')

# Define dangerous patterns
DANGEROUS_PATTERNS=(
  "rm -rf"
  "rm -fr"
  "chmod 777"
  "DROP TABLE"
  "DROP DATABASE"
  "mkfs"
  "> /dev/sda"
  ":(){ :|:& };:"
)

for pattern in "${DANGEROUS_PATTERNS[@]}"; do
  if echo "$COMMAND" | grep -qi "$pattern"; then
    echo "BLOCKED: Destructive command: $pattern" >&2
    jq -n '{
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "deny",
        permissionDecisionReason: "Blocked by security hook"
      }
    }'
    exit 2
  fi
done

exit 0

// settings.json config

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/block-dangerous.sh"
          }
        ]
      }
    ]
  }
}
HOOK 02

Auto-Format on Every File Write

Event: PostToolUse | Matcher: Write|Edit|MultiEdit

Every time Claude writes or edits a file, Prettier runs automatically. No prompt needed. No permission dialog. No exceptions.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "npx prettier --write \"$CLAUDE_TOOL_INPUT_FILE_PATH\""
          }
        ]
      }
    ]
  }
}
HOOK 03

Block Access to Sensitive Files

Event: PreToolUse | Matcher: Read|Edit|Write|MultiEdit|Bash

Prevent Claude from reading or modifying .env, private keys, credentials, kubeconfig, and other sensitive files. This is Least Privilege 101 — the same principle every pentester exploits when they find an overprivileged service account. Don't let your AI agent become the next one.

// .claude/hooks/block-sensitive.sh

#!/bin/bash
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // .tool_input.path // empty')

# Sensitive file patterns
SENSITIVE_PATTERNS=(
  "\.env$"      "\.env\."
  "secrets\."   "credentials"
  "\.pem$"      "\.key$"
  "id_rsa"      "id_ed25519"
  "\.pfx$"      "kubeconfig"
  "\.aws/credentials"
  "\.ssh/"      "vault\.json"
  "token\.json"
)

for pattern in "${SENSITIVE_PATTERNS[@]}"; do
  if echo "$FILE_PATH" | grep -qiE "$pattern"; then
    echo "BLOCKED: Sensitive file: $FILE_PATH" >&2
    jq -n '{
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "deny",
        permissionDecisionReason: "Sensitive file access blocked"
      }
    }'
    exit 2
  fi
done

exit 0
HOOK 04

Run Tests After Code Changes

Event: PostToolUse | Matcher: Write|Edit|MultiEdit

Automatically run your test suite on modified files. Catch regressions immediately instead of waiting for CI.

// .claude/hooks/run-tests.sh

#!/bin/bash
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')

# Only run tests for source files
if echo "$FILE_PATH" | grep -qE '\.(js|ts|py|jsx|tsx)$'; then
  # Skip test files to avoid loops
  if echo "$FILE_PATH" | grep -qE '(test|spec|__test__)'; then
    exit 0
  fi

  # Detect framework and run
  if [ -f "package.json" ]; then
    npm test --silent 2>&1 | tail -5
  elif [ -f "pytest.ini" ] || [ -f "pyproject.toml" ]; then
    python -m pytest --tb=short -q 2>&1 | tail -10
  fi
fi

exit 0
HOOK 05

Slack / Desktop Notification on Completion

Event: Stop | Matcher: (any)

When Claude finishes a long-running task, get notified immediately. Never forget about a background session again.

// .claude/hooks/notify-complete.sh

#!/bin/bash
INPUT=$(cat)
STOP_REASON=$(echo "$INPUT" | jq -r '.stop_reason // "completed"')

# macOS notification
osascript -e "display notification \"Claude: $STOP_REASON\" with title \"Claude Code\""

# Optional: Slack webhook
SLACK_WEBHOOK="${SLACK_WEBHOOK_URL}"
if [ -n "$SLACK_WEBHOOK" ]; then
  curl -s -X POST "$SLACK_WEBHOOK" \
    -H 'Content-Type: application/json' \
    -d "{\"text\": \"Claude Code finished: $STOP_REASON\"}" \
    > /dev/null 2>&1
fi

exit 0

Advanced: PreToolUse Input Modification

Starting in v2.0.10, PreToolUse hooks can modify tool inputs before execution — without blocking the action. You intercept, modify, and let execution proceed with corrected parameters. The modification is invisible to Claude.

Use cases: automatic dry-run flags on destructive commands, secret redaction, path correction to safe directories, commit message formatting enforcement.

// Example — Force dry-run on kubectl delete:

#!/bin/bash
INPUT=$(cat)
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty')

if echo "$COMMAND" | grep -q "kubectl delete" && \
   ! echo "$COMMAND" | grep -q "--dry-run"; then
  MODIFIED=$(echo "$COMMAND" | sed 's/kubectl delete/kubectl delete --dry-run=client/')
  jq -n --arg cmd "$MODIFIED" '{
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: "allow",
      updatedInput: { command: $cmd }
    }
  }'
  exit 0
fi

exit 0

Advanced: Prompt Hooks for Semantic Security

Shell scripts handle pattern matching. But what about context-dependent decisions like "does this edit touch authentication logic?" or "does this query access PII columns?"

Prompt hooks delegate the decision to a lightweight Claude model:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [
          {
            "type": "prompt",
            "prompt": "You are a security reviewer. Does this change modify auth, authz, or session management? If yes: {\"hookSpecificOutput\": {\"hookEventName\": \"PreToolUse\", \"permissionDecision\": \"escalate\", \"permissionDecisionReason\": \"Auth logic — human review required\"}}. If no: {}. Change: $ARGUMENTS"
          }
        ]
      }
    ]
  }
}

The escalate decision surfaces the action to the user for manual approval — perfect for high-risk changes that need a human in the loop.


Security Considerations

// 01: HOOKS RUN WITH YOUR USER PERMISSIONS

There is no sandbox. Your hooks execute with the same privileges as your shell. A malicious hook has full access to your filesystem, network, and credentials. Treat hook scripts like production code. Review them. Version control them. Don't curl | bash random hook repos from some stranger's GitHub. You wouldn't run an unvetted binary — don't run unvetted hooks either.

// 02: EXIT 2 VS EXIT 1 — THIS MATTERS

exit 2 = action is BLOCKED. Claude sees the rejection and suggests alternatives.
exit 1 = non-blocking warning. Action still proceeds.
Every security hook must use exit 2. Exit 1 = you're logging, not enforcing.

// 03: SUBAGENT RECURSION LOOPS

A UserPromptSubmit hook that spawns subagents can create infinite loops if those subagents trigger the same hook. Check for a subagent indicator in hook input before spawning. Scope hooks to top-level agent sessions only.

// 04: PERFORMANCE IS THE REAL CONSTRAINT

Each hook runs synchronously, adding execution time to every matched tool call. Threshold: if a PostToolUse hook adds >500ms to every file edit, the session becomes sluggish. Profile with time. Keep each under 200ms.

// 05: CLAUDE.MD = ADVISORY. HOOKS = ENFORCED.

"Never modify .env files" in CLAUDE.md = a polite request. The model might ignore it. A prompt injection will definitely override it.
A PreToolUse hook blocking .env access with exit 2 = a locked door. The model doesn't have the key.
Stop writing rules. Start writing hooks.


Getting Started Checklist

  • Start with two hooks: Destructive command blocker (Hook 01) and sensitive file gate (Hook 03). These prevent the most common AI agent mistakes with zero maintenance.
  • Commit to .claude/settings.json in your repo so the whole team shares the same guardrails automatically.
  • Use claude --debug when hooks don't fire as expected — shows exactly what's matching and executing.
  • Keep hooks fast — under 200ms each. Profile with time. Ten fast hooks outperform two slow ones.
  • Use $CLAUDE_PROJECT_DIR prefix for hook paths in settings.json for reliable path resolution.
  • Toggle verbose mode with Ctrl+O to see stdout/stderr from hooks in real-time during a session.

// References

  • Anthropic Official Docs — docs.anthropic.com/en/docs/claude-code/hooks
  • Claude Code Hooks Reference — code.claude.com/docs/en/hooks
  • GitHub: claude-code-hooks-mastery — github.com/disler/claude-code-hooks-mastery
  • 5 Production Hooks Tutorial — blakecrosley.com/blog/claude-code-hooks-tutorial
  • SmartScope Complete Guide — smartscope.blog/en/generative-ai/claude/claude-code-hooks-guide
  • PromptLayer Docs — blog.promptlayer.com/understanding-claude-code-hooks-documentation

15/03/2026

🔗 Connecting Claude AI with Kali Linux & Burp Suite via MCP

The Practical Guide to AI-Augmented Penetration Testing in 2026
📅 March 2026 ✍️ altcoinwonderland ⏱️ 15 min read 🏷️ AppSec | Offensive Security | AI

⚡ TL;DR

  • MCP (Model Context Protocol) bridges Claude AI with Kali Linux and Burp Suite, enabling natural-language-driven pentesting
  • PortSwigger's official MCP extension and six2dez's Burp AI Agent are the two primary integration paths for Burp Suite
  • Kali's mcp-kali-server package (officially documented Feb 2026) exposes Nmap, Metasploit, SQLMap, and 10+ tools to Claude
  • The architecture is: Claude Desktop/Code → MCP → Kali/Burp → structured output → Claude analysis
  • Critical OPSEC warnings: prompt injection, tool poisoning, and cloud data leakage are real risks — treat MCP servers as untrusted code

Introduction: Why This Matters Now

In February 2026, Kali Linux officially documented a native AI-assisted penetration testing workflow using Anthropic's Claude via the Model Context Protocol (MCP). Weeks earlier, PortSwigger shipped their official MCP Server extension for Burp Suite. These aren't experimental toys — they represent a fundamental shift in how offensive security practitioners interact with their tooling.

Instead of memorising Nmap flags, crafting SQLMap syntax, or manually triaging hundreds of Burp proxy entries, you describe what you want in plain English. Claude interprets, plans, executes, and analyses — then iterates if needed. The entire recon-to-report loop becomes conversational.

This article walks you through the complete setup, the two Burp Suite integration paths, the Kali MCP architecture, practical prompt workflows, and — critically — the security risks you must understand before deploying this anywhere near a real engagement.


1. Understanding the Architecture

All three integration paths (Burp MCP, Burp AI Agent, Kali MCP) share the same core pattern: Claude communicates with your tools through MCP, a standardised protocol that Anthropic open-sourced in late 2024. Think of MCP as a universal API bridge that lets LLMs call external tools while maintaining session context.

You (Claude Desktop / Claude Code) Claude Sonnet (Cloud LLM) MCP Protocol Layer Kali / Burp Suite (Execution)

Structured Output Claude Analysis Tool Results

The three components in every setup are:

UI Layer Claude Desktop (macOS/Windows) or Claude Code (CLI). This is where you type prompts and receive results.
Intelligence Layer Claude Sonnet model (cloud-hosted). Interprets intent, selects tools, structures execution, analyses output.
Execution Layer Kali Linux (mcp-kali-server on port 5000) or Burp Suite (MCP extension on port 9876). Runs the actual commands.
Protocol Bridge MCP handles structured request/response between Claude and your tools over SSH (Kali) or localhost (Burp).

2. Path A: Burp Suite + Claude via PortSwigger's Official MCP Extension

PortSwigger maintains the official MCP Server extension in the BApp Store. It works with both Burp Pro and Community Edition.

Setup Steps

1Install the MCP Extension — Open Burp Suite → Extensions → BApp Store → search "MCP Server" → Install.

2Configure the MCP Server — The MCP tab appears in Burp. Default endpoint: http://127.0.0.1:9876. Enable/disable specific tools (send requests, create Repeater tabs, read proxy history, edit config).

3Install to Claude Desktop — Click "Install to Claude Desktop" button in the MCP tab. This auto-generates the JSON config. Alternatively, manually edit:

// macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
// Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "burp": {
      "command": "<path-to-java>",
      "args": [
        "-jar",
        "/path/to/mcp-proxy-all.jar",
        "--sse-url",
        "http://127.0.0.1:9876/sse"
      ]
    }
  }
}

4Restart Claude Desktop — Fully quit (check system tray), then relaunch. Verify under Settings → Developer → Burp integration active.

5Start Prompting — Claude now has access to your Burp proxy history, Repeater, and can send HTTP requests directly.


3. Path B: Burp AI Agent (six2dez) — The Power Option

The Burp AI Agent by six2dez is a more feature-rich alternative. It goes significantly beyond the official extension.

7 AI Backends Ollama, LM Studio, Generic OpenAI-compatible, Gemini CLI, Claude CLI, Codex CLI, OpenCode CLI
53+ MCP Tools Full autonomous Burp control — proxy, Repeater, Intruder, scanner integration
62 Vulnerability Classes Passive and Active AI scanners across injection, auth, crypto, and more
3 Privacy Modes STRICT / BALANCED / OFF — redact sensitive data before it leaves Burp

Setup

# Build from source (requires Java 21)
git clone https://github.com/six2dez/burp-ai-agent.git
cd burp-ai-agent
JAVA_HOME=/path/to/jdk-21 ./gradlew clean shadowJar

# Or download the JAR from Releases
# Load in Burp: Extensions → Add → Select JAR

Claude Desktop config for Burp AI Agent:

{
  "mcpServers": {
    "burp-ai-agent": {
      "command": "npx",
      "args": [
        "-y",
        "supergateway",
        "--sse",
        "http://127.0.0.1:9876/sse"
      ]
    }
  }
}
💡 Key advantage of Burp AI Agent: Right-click any request in Proxy → HTTP History → Extensions → Burp AI Agent → "Analyse this request" — opens a chat session with the AI analysis. The 3 privacy modes (STRICT/BALANCED/OFF) and JSONL audit logging with SHA-256 integrity hashing make it more suitable for professional engagements.

4. Kali Linux + Claude via mcp-kali-server

Officially documented by the Kali team in February 2026, mcp-kali-server is available via apt and exposes penetration testing tools through a Flask-based API on localhost:5000.

Supported Tools

ReconNmap, Gobuster, Dirb, enum4linux-ng
Web ScanningNikto, WPScan, SQLMap
ExploitationMetasploit Framework
Credential TestingHydra, John the Ripper

Setup

# On Kali Linux
sudo apt update
sudo apt install mcp-kali-server kali-server-mcp

# Start the MCP server
mcp-kali-server
# Runs Flask API on localhost:5000

Claude Desktop connects over SSH using stdio transport. Add to your config:

{
  "mcpServers": {
    "kali": {
      "command": "ssh",
      "args": [
        "kali@<KALI_IP>",
        "mcp-server"
      ]
    }
  }
}
💡 Linux Users: Claude Desktop has no official Linux build as of March 2026. Workarounds include WINE, unofficial Linux packages, or alternative MCP clients such as 5ire, AnythingLLM, Goose Desktop, and Witsy. Claude Code (CLI) works natively on Linux and is arguably the better option for Kali integration.

5. Practical Prompt Workflows — Optimising Your Skills

The integration is only as good as how you prompt it. Here are real-world workflow patterns that maximise Claude's value.

5.1 Recon Triage (Kali MCP)

"Run an Nmap service scan on 10.10.10.100 with version detection. If you find HTTP on any port, follow up with Gobuster using the common.txt wordlist. Summarise all findings with risk ratings."

Claude will chain: verify tool availability → execute nmap -sV → parse open ports → conditionally run gobuster → produce a structured summary with prioritised findings. One prompt replaces 3-4 manual steps.

5.2 Proxy History Analysis (Burp MCP)

"From the HTTP history in Burp, find all POST requests to API endpoints that accept JSON. Identify any that pass user IDs in the request body — I'm hunting for IDOR and BOLA vulnerabilities."

Claude reads your proxy history, filters by content type and method, identifies parameter patterns, and flags candidates for manual testing. This alone saves hours on large applications.

5.3 Automated Test Plan Generation (Burp MCP)

"Analyse the JavaScript files in Burp history. Extract API endpoints, identify authentication mechanisms, and generate a test plan covering OWASP API Security Top 10."

5.4 Collaborator-Assisted SSRF Testing (Burp MCP + Claude Code)

"Take the request in Repeater tab 1. Identify any parameters that accept URLs or hostnames. Create variations pointing to my Collaborator URL and send each one. Report back which triggered a DNS lookup."

5.5 Full Report Generation (Post-Engagement)

"Compile all findings from this session into a structured pentest report. Include: vulnerability title, severity (CVSS where possible), affected endpoint, proof of concept, and remediation steps."
💡 Skill Optimisation Tips:
Be specific with scope — "scan ports 1-1000" not just "scan the target"
Chain conditional logic — "if you find X, then do Y" leverages Claude's reasoning
Request structured output — "format as a markdown table" or "create Repeater tabs for each finding"
Use Claude Code over Desktop for Kali — CLI-native, works on Linux, better for multi-step chains
Iterate — Claude maintains session context, so you can refine: "now test that endpoint for SQLi"

6. Security Risks — Read This Before Deploying

This is where most guides stop. Don't be that person. MCP-enabled AI workflows introduce real, documented attack surfaces.

⚠️ CRITICAL: Known CVEs in MCP Ecosystem (January 2026)

Three vulnerabilities were disclosed in Anthropic's official Git MCP server, directly demonstrating that MCP servers are exploitable via prompt injection:

CVE-2025-68143 Path traversal via arbitrary path acceptance in git_init
CVE-2025-68144 Argument injection via unsanitised git CLI args in git_diff / git_checkout
CVE-2025-68145 Path validation weakness around repository scoping

Researchers demonstrated chaining these with a Filesystem MCP server to achieve code execution. This is not theoretical.

Threat Model for MCP-Assisted Pentesting

Prompt Injection: Malicious content in target responses (HTML, headers, error messages) can feed instructions back into Claude's reasoning loop. A target application could craft responses that manipulate Claude's next actions — classic "data becomes instructions" routed through a new control plane.

Tool Poisoning: CyberArk and Invariant Labs have documented scenarios where malicious instructions embedded in tool descriptions or command output can manipulate the LLM into unintended actions, including data exfiltration.

Cloud Data Leakage: Every prompt and tool output transits through Anthropic's cloud infrastructure. For client engagements with confidentiality requirements, this likely violates your engagement letter. Sending target data to a third-party API is a non-starter for most professional pentests.

Over-Permissioned Execution: The mcp-kali-server can execute terminal commands. A poorly scoped setup with root access is a catastrophic vulnerability if the LLM is manipulated.

Hardening Checklist

# OPSEC checklist for MCP-assisted pentesting

[ ] Run Kali in an isolated VM or container — disposable, no shared credentials
[ ] No SSH agent forwarding to the Kali execution host
[ ] Minimal outbound network — open only what you need
[ ] Use Burp AI Agent's STRICT privacy mode for client work
[ ] Enable JSONL audit logging with integrity hashing
[ ] Human-in-the-loop approval for destructive or high-risk commands
[ ] Never use on real client targets without explicit written authorisation for AI-assisted testing
[ ] Review all Claude-generated commands before execution on production targets
[ ] Treat MCP servers as untrusted third-party code — test for command injection, path traversal, SSRF
[ ] For air-gapped requirements: use Ollama + local models via Burp AI Agent instead of cloud Claude

7. Which Path Should You Choose?

PortSwigger MCP Extension ✅ Official, simple setup
✅ BApp Store install
❌ Fewer features
❌ No privacy modes
🎯 Best for: lab work, CTFs, learning
Burp AI Agent (six2dez) ✅ 53+ tools, 62 vuln classes
✅ 3 privacy modes + audit logging
✅ 7 AI backends (inc. local)
❌ Requires Java 21 build
🎯 Best for: professional engagements
Kali mcp-kali-server ✅ Full Kali toolset access
✅ Official Kali package
❌ Cloud dependency
❌ No Linux Claude Desktop
🎯 Best for: recon, enumeration, CTFs
Combined Stack ✅ Maximum coverage
✅ Burp for web + Kali for infra
❌ Complex setup
❌ Largest attack surface
🎯 Best for: comprehensive assessments

8. Conclusion: AI Won't Replace You — But It Will Change How You Work

Let's be clear about what this is and what it isn't. Claude + MCP is not autonomous pentesting. It doesn't exercise judgement, assess business impact, or make ethical decisions. What it does is eliminate the repetitive friction of context switching, command crafting, output parsing, and report formatting — the tasks that consume 60-70% of a typical engagement.

The practitioners who will thrive are those who use AI as an intelligent assistant while maintaining the critical thinking, methodology discipline, and OPSEC awareness that no LLM can replicate. Start with lab environments and CTFs. Build confidence with the tooling. Understand the security risks deeply. Then — and only then — consider how it fits into your professional workflow.

The command line remains powerful. Now it has a conversational layer. Use it wisely.


Sources & Further Reading

PortSwigger MCP Server ExtensionBurp AI Agent (six2dez)Kali Official Blog — LLM + Claude Desktopmcp-kali-server PackageSecEngAI — AI-Assisted Web PentestingPortSwigger MCP Server (GitHub)CybersecurityNews — Kali Integrates Claude AIModel Context Protocol (Official)Penligent — Critical Analysis of Kali + Claude MCP

#Claude #KaliLinux #BurpSuite #MCP #PenetrationTesting #AppSec #OffensiveSecurity #AIinCybersecurity #OSCP #BugBounty #ModelContextProtocol #altcoinwonderland