Red Team Assessment

CyberDoc's Red Team feature provides AI-powered autonomous penetration testing. A dual-model architecture pairs a PentestAgent (Claude Sonnet) for tool execution with an xAI Grok orchestrator for strategic reasoning. After the engagement completes, a multi-agent analysis team produces structured findings with CVSS-aligned severity, CVE research, attack chains, and threat intelligence.

Access: Red team assessments are available to Business ($99/mo — 3 scans/month) and Enterprise ($499/mo — 15 scans/month) subscribers. Enterprise includes Advanced, Expert, and Crew modes. Administrators have full access regardless of plan.
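
The access rules above can be sketched as a small quota check. This is a minimal illustration, not the product's actual enforcement logic: the function name, the lowercase plan identifiers, and the assumption that administrators are also exempt from the monthly quota are all hypothetical.

```python
# Monthly scan quotas taken from the plan description above.
# Plan identifiers ("business", "enterprise") are hypothetical names.
MONTHLY_SCAN_QUOTA = {"business": 3, "enterprise": 15}


def can_launch(plan: str, scans_used_this_month: int, is_admin: bool = False) -> bool:
    """Return True if another engagement may be launched this month."""
    if is_admin:
        # Assumption: "full access regardless of plan" also bypasses the quota.
        return True
    # Plans that are not listed get a quota of zero, i.e. no access.
    quota = MONTHLY_SCAN_QUOTA.get(plan.lower(), 0)
    return scans_used_this_month < quota
```

For example, a Business workspace that has already used three scans this month would be refused, while an Enterprise workspace with fourteen scans used would still be allowed one more.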

How It Works

Domain Verification — Users must verify ownership of the target domain via DNS TXT record or file upload before launching an engagement. Admins bypass this requirement.
Launch Engagement — Select a target, playbook (Recon, Web, Network, or Full), and infrastructure mode. The request is proxied to the PentestAgent backend.
AI Agent Loop — PentestAgent autonomously executes security tools, analyses results, and decides next steps via the MCP (Model Context Protocol) server. The agent runs until the task is complete or max iterations are reached.
Multi-Agent Analysis — Raw pentest output is sent to an xAI Grok multi-agent team (4 or 16 agents) that produces structured findings with CVSS severity, recon data, attack chains, CVE research, and threat intelligence gathered from web and X/Twitter searches.
Attack Chain Verification — Proposed attack chains can be executed directly against the target using automated Kali tool command mapping, with success validated via regex pattern matching.
Artifacts & Report — All output files (exploit proofs, screenshots, Metasploit output) are collected as artifacts. A comprehensive HTML report is generated with executive summary, findings by severity, attack chains, and remediation recommendations.

Agent Architecture

The Red Team system uses a dual-model architecture with specialised roles:

Role	Model	Purpose
PentestAgent	Claude Sonnet (claude-sonnet-4-20250514)	Tool execution, shell commands, browser interaction, structured note-taking
Orchestrator	xAI Grok (grok-4-1-fast-reasoning)	Strategic reasoning, task planning, adaptive decision-making
Analysis Team	xAI Grok Multi-Agent (grok-4.20-multi-agent)	Post-engagement structured analysis with 4 or 16 parallel agents

Infrastructure Modes

Mode	Backend	Tools	Max Iterations	Access
Standard	Docker container (App Runner)	PentestAgent + ProjectDiscovery tools (nuclei, subfinder, httpx) plus nmap, ffuf, and nikto	60	Business, Enterprise
Advanced	Docker container (App Runner)	Standard + Kali tools (metasploit, hydra, john, sqlmap, wpscan)	80	Enterprise only
Expert	Dedicated EC2 instance (Kali Linux)	Full Kali arsenal + SecLists wordlists, privileged network access, Cloudflare Tunnel	100	Enterprise only
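
The mode table above maps naturally onto a lookup structure. The sketch below encodes the backend, iteration cap, and plan access from the table; the class name, field names, and lowercase identifiers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfraMode:
    backend: str
    max_iterations: int
    plans: frozenset  # plans allowed to use this mode


# Values come from the table above; the keys are hypothetical identifiers.
MODES = {
    "standard": InfraMode("Docker container (App Runner)", 60,
                          frozenset({"business", "enterprise"})),
    "advanced": InfraMode("Docker container (App Runner)", 80,
                          frozenset({"enterprise"})),
    "expert": InfraMode("Dedicated EC2 instance (Kali Linux)", 100,
                        frozenset({"enterprise"})),
}


def can_use_mode(mode: str, plan: str, is_admin: bool = False) -> bool:
    """Check whether a plan (or an administrator) may select a mode."""
    if mode not in MODES:
        raise ValueError(f"unknown infrastructure mode: {mode!r}")
    return is_admin or plan in MODES[mode].plans
```

Keeping the iteration cap next to the access rule means a launch handler can validate the request and configure the agent loop from a single lookup.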

Expert Mode Lifecycle: Expert instances run on dedicated EC2 machines and can be started/stopped on demand via the admin dashboard. A Lambda proxy manages the EC2 lifecycle (start, stop, status). Health checks verify readiness before launching engagements.

Playbooks

Playbook	Focus	Typical Duration
Recon	Subdomain enumeration, port scanning, service fingerprinting, DNS configuration, exposed endpoints	10–30 minutes
Web	OWASP Top 10, security headers, TLS config, cookie security, directory discovery, injection testing	30–60 minutes
Network	Port/service enumeration, known CVEs, default credentials, network segmentation	30–60 minutes
Full Red Team	All phases: Recon, Web, Network, then Report generation with remediation steps	1–3 hours

Agent Tools

The PentestAgent communicates via MCP (Model Context Protocol) and has access to these tool categories:

terminal — Execute shell commands (nmap, nuclei, curl, sqlmap, metasploit, hydra, wpscan, ffuf, gobuster, etc.). Output truncated at 50K chars.
browser — Playwright headless browser for web interaction (navigate, click, type, screenshot, extract links/forms).
notes — Structured finding storage with category validation (credential, vulnerability, finding, artifact, recon, infrastructure, report). Persists across the engagement.
web_search — Web search integration for OSINT and CVE lookup.
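
Two of the behaviours listed above, output truncation in the terminal tool and category validation in the notes tool, can be sketched as follows. This is an illustration only: the function names, the truncation marker, and the reading of "50K" as 50,000 characters are assumptions.

```python
# Assumption: "50K chars" means 50,000 characters.
MAX_OUTPUT_CHARS = 50_000

# Categories taken from the notes tool description above.
NOTE_CATEGORIES = {
    "credential", "vulnerability", "finding", "artifact",
    "recon", "infrastructure", "report",
}


def truncate_output(output: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Cut tool output at the limit and append a marker (marker is hypothetical)."""
    if len(output) <= limit:
        return output
    return output[:limit] + "\n[output truncated]"


def validate_note(category: str, content: str) -> dict:
    """Reject notes whose category is not in the allowed set."""
    if category not in NOTE_CATEGORIES:
        raise ValueError(f"invalid note category: {category!r}")
    return {"category": category, "content": content}
```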

Multi-Agent Analysis

After the PentestAgent run finishes, its raw output is analysed by an xAI Grok multi-agent team. The analysis produces a structured result with the following components:

Component	Description
Findings	Vulnerabilities with CVSS-aligned severity, CWE classification, evidence, impact, and specific remediation steps
Recon Data	IP addresses, subdomains, open ports, and detected technologies
Attack Chains	Multi-step exploit paths with risk level, step-by-step actions, and overall impact assessment
CVE Research	Relevant CVEs with exploit-in-the-wild status and patch availability (via live web search)
Threat Intel	Recent threat discussions from web and X/Twitter searches about the target's technology stack
Positive Controls	Security measures the target has correctly implemented
Risk Rating	Overall risk level with justification

Analysis can use either a 4-agent team (standard) or a 16-agent team (deep analysis), and can be re-run on demand via the reanalyze endpoint (POST /api/redteam/reanalyze/:id).
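
The components in the table above suggest a result shape along the following lines. The actual response schema is not documented here, so every class and field name in this sketch is an assumption chosen to mirror the table.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    title: str
    severity: str          # Critical, High, Medium, Low, or Info
    cwe: str               # CWE classification, e.g. "CWE-89"
    evidence: str
    impact: str
    remediation: List[str]


@dataclass
class AttackChain:
    name: str
    risk: str
    steps: List[str]       # step-by-step actions
    impact: str


@dataclass
class AnalysisResult:
    findings: List[Finding] = field(default_factory=list)
    recon: dict = field(default_factory=dict)          # IPs, subdomains, ports, technologies
    attack_chains: List[AttackChain] = field(default_factory=list)
    cve_research: List[dict] = field(default_factory=list)
    threat_intel: List[str] = field(default_factory=list)
    positive_controls: List[str] = field(default_factory=list)
    risk_rating: str = "unknown"
    agent_count: int = 4   # 4 (standard) or 16 (deep analysis)

    def __post_init__(self) -> None:
        if self.agent_count not in (4, 16):
            raise ValueError("agent_count must be 4 or 16")
```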

Finding Severity

Findings use CVSS-aligned severity ratings assigned by the multi-agent analysis team:

Severity	CVSS Range	Examples
Critical	9.0–10.0	RCE, auth bypass, credential exposure, actively exploited CVEs
High	7.0–8.9	SQLi, stored XSS, exposed SSH, SSRF, file upload vulnerabilities
Medium	4.0–6.9	Missing security headers, user enumeration, outdated software, info disclosure CVEs
Low	0.1–3.9	Verbose error messages, directory listings, minor misconfigurations
Info	—	Informational only, not a vulnerability
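
The score bands in the table above translate directly into a lookup function. This sketch assumes that a missing score or a score of 0.0 maps to Info, which the table does not state explicitly; the function name is hypothetical.

```python
from typing import Optional


def severity_for(score: Optional[float]) -> str:
    """Map a CVSS base score to the severity labels used in the table above."""
    if score is None or score == 0:
        # Assumption: no score (or 0.0) means informational only.
        return "Info"
    if not 0 < score <= 10:
        raise ValueError(f"CVSS score out of range: {score}")
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"
```

Checking the bands from the top down keeps each boundary (9.0, 7.0, 4.0) in exactly one comparison, so a score such as 6.9 falls into Medium and 7.0 into High.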

Attack Chain Verification

Attack chains identified by the multi-agent analysis can be verified by executing them directly against the target. The chain command mapper translates high-level chain steps into concrete Kali tool commands.

Supported Attack Patterns

WordPress — User enumeration (wpscan), plugin scanning, XML-RPC brute force, credential testing, shell upload
SQL Injection — Automated sqlmap execution with database enumeration
Directory Brute Forcing — ffuf, dirb, gobuster with custom wordlists
SSH Brute Force — Hydra with configurable wordlists (smart, custom, top1000, top10000 modes)
Exploitation — Metasploit framework integration
Lateral Movement — Post-exploitation and privilege escalation

Chain Execution Modes

Exploit Chain — Automated execution of analysis-identified chains with templated commands, timeout management, and regex-based success validation
Custom Chain — User-defined chains with custom parameters (usernames, passwords, target overrides, brute force mode selection)
Adaptive Chain — Dynamic chain execution that adapts based on results from previous steps
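
The combination of templated commands, timeouts, and regex-based success validation described above might look like the sketch below. It only renders a command string and checks captured output; it does not execute anything. The class, its fields, the example template, and the success pattern are all hypothetical and are not taken from the real chain command mapper.

```python
import re
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainStep:
    name: str
    command_template: str   # placeholders are filled by render()
    success_pattern: str    # regex that must match the tool output
    timeout_seconds: int = 300

    def render(self, **params: str) -> str:
        """Fill the template, shell-quoting every parameter."""
        quoted = {key: shlex.quote(value) for key, value in params.items()}
        return self.command_template.format(**quoted)

    def succeeded(self, output: str) -> bool:
        """A step counts as successful when the pattern is found in the output."""
        return re.search(self.success_pattern, output, re.MULTILINE) is not None


# Illustrative step for directory discovery; the pattern is an assumption
# about what a successful hit looks like in the tool's output.
dir_scan = ChainStep(
    name="directory-discovery",
    command_template="ffuf -w {wordlist} -u {url}/FUZZ",
    success_pattern=r"\[Status: 2\d\d",
)
```

Quoting each parameter with shlex.quote before substitution keeps a user-supplied target or wordlist path from injecting extra shell syntax into the rendered command.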

Artifacts

Engagement artifacts (exploit proofs, tool output, screenshots, loot) are automatically collected from the PentestAgent backend and stored for review:

Fetched from the backend /rt/artifacts endpoint (bulk or individual)
Stored in KV with 90-day retention
Tracked in the engagement_artifacts database table with file metadata
Downloadable via the artifacts API endpoints
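
The 90-day retention window can be expressed as a simple expiry calculation. This is a sketch under assumptions: the helper names are hypothetical, and how the KV store actually applies the retention (for example, as a time-to-live in seconds) is not specified above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=90)
# The same window in seconds, for stores that take a TTL value.
RETENTION_TTL_SECONDS = int(RETENTION.total_seconds())


def expires_at(stored_at: datetime) -> datetime:
    """Return the moment an artifact stored at `stored_at` leaves retention."""
    return stored_at + RETENTION


def is_expired(stored_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the retention window has fully elapsed."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at(stored_at)
```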

Red Team Operator

The Red Team Operator is a voice and text AI agent interface for administrators to interactively manage engagements:

Voice conversations via xAI voice API with real-time transcript
Text-based chat with persistent conversation history
Can programmatically create engagements, launch attack chains, and fetch findings
Conversations are linked to engagements and stored in the redteam_conversations table
Admin-only access, with pentest safeguards lifted

Domain Verification

Before launching a red team engagement, you must verify ownership of the target domain. Two methods are supported:

DNS TXT Record — Add a TXT record to the domain with a generated verification token. Checked via Google DNS.
File Upload — Place a file at /.well-known/cyberdoc-verify.txt containing the token. Checked via HTTP fetch.

Once verified, the domain remains verified for future engagements. Admins bypass this requirement.
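
A DNS TXT check of this kind could be sketched against Google's public DNS-over-HTTPS JSON endpoint. Whether CyberDoc uses that particular interface is an assumption, as are the function names and the token format used in the usage note; only the parsing helpers are exercised without network access.

```python
import json
import urllib.parse
import urllib.request

GOOGLE_DOH = "https://dns.google/resolve"
TXT_RECORD_TYPE = 16  # numeric DNS record type for TXT


def txt_records(doh_response: dict) -> list:
    """Extract TXT values from a DNS-over-HTTPS JSON response."""
    if doh_response.get("Status") != 0:  # 0 means NOERROR
        return []
    return [
        answer["data"].strip('"')  # TXT data may arrive wrapped in quotes
        for answer in doh_response.get("Answer", [])
        if answer.get("type") == TXT_RECORD_TYPE
    ]


def is_verified(doh_response: dict, token: str) -> bool:
    """True when one TXT record equals the verification token exactly."""
    return token in txt_records(doh_response)


def fetch_txt(domain: str, timeout: float = 10.0) -> dict:
    """Query the TXT records for a domain (performs a network request)."""
    query = urllib.parse.urlencode({"name": domain, "type": "TXT"})
    with urllib.request.urlopen(f"{GOOGLE_DOH}?{query}", timeout=timeout) as resp:
        return json.load(resp)
```

A caller would combine the two steps as `is_verified(fetch_txt("example.com"), token)`, where `token` is the value issued by the verify-domain endpoint. Requiring an exact match, rather than a substring match, avoids accepting a record that merely contains the token.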

Engagement Lifecycle

Launch — Create engagement with target, playbook, scope, and infrastructure mode
Poll Status — Monitor progress (queued, running, complete, failed, cancelled)
View Results — Retrieve structured findings, recon data, attack chains, and analysis
Reanalyze — Re-run multi-agent analysis on existing results with updated prompts
Chain Verification — Execute identified attack chains for proof-of-exploit
Report — Generate branded HTML report for download or print
Archive/Delete — Archive old engagements or permanently delete them
Cancel — Abort a running engagement
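
The Poll Status step above can be sketched as a loop that stops on a final state. Treating complete, failed, and cancelled as final and queued and running as in-progress is an inference from the status names; the function name, the default interval, and the poll limit are hypothetical.

```python
import time
from typing import Callable

# Assumption: these three statuses are final, the other two are in progress.
TERMINAL_STATUSES = {"complete", "failed", "cancelled"}
ACTIVE_STATUSES = {"queued", "running"}


def wait_for_engagement(
    get_status: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 720,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the engagement reaches a final status, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if status not in ACTIVE_STATUSES:
            raise ValueError(f"unexpected engagement status: {status!r}")
        sleep(interval)
    raise TimeoutError("engagement did not finish within the polling limit")
```

Here `get_status` would wrap a call to the status endpoint. Passing it in as a callable, together with an injectable `sleep`, keeps the loop testable without network access or real delays.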

Security Guardrails

Tool output from untrusted sources is filtered through prompt injection guardrails adapted from the CAI framework:

40+ regex patterns detecting instruction overrides, hidden commands, and encoding tricks
Unicode homograph normalization (Cyrillic/Greek to Latin)
Content sanitization with security delimiters
Prevents target servers from hijacking the agent via crafted responses
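
The three techniques listed above (pattern detection, homograph normalization, and delimiter wrapping) can be combined as in this minimal sketch. The patterns, the character map, and the delimiter strings are illustrative stand-ins, not the actual rules adapted from the CAI framework.

```python
import re

# A few illustrative patterns; the real set is described as 40+.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"\byou are now\b", re.IGNORECASE),
    re.compile(r"\bsystem prompt\b", re.IGNORECASE),
]

# Small sample of Cyrillic/Greek look-alikes mapped to Latin letters.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic e
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u03bf": "o",  # Greek omicron
    "\u03b1": "a",  # Greek alpha
})

START = "<<<UNTRUSTED_TOOL_OUTPUT>>>"     # hypothetical delimiter
END = "<<<END_UNTRUSTED_TOOL_OUTPUT>>>"   # hypothetical delimiter


def sanitize(tool_output: str) -> tuple:
    """Return (wrapped_text, flagged) for a piece of untrusted tool output."""
    normalized = tool_output.translate(HOMOGLYPHS)
    flagged = any(p.search(normalized) for p in INJECTION_PATTERNS)
    wrapped = f"{START}\n{normalized}\n{END}"
    return wrapped, flagged
```

Normalizing before matching matters: a payload that swaps a Latin "o" for a Cyrillic one would otherwise slip past a plain-ASCII pattern.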

API Endpoints

All red team endpoints require authentication and are prefixed with /api/redteam. A Business or Enterprise plan is required (admins are exempt).

Engagements

Method	Endpoint	Description
POST	`/api/redteam/launch`	Start engagement (target, playbook, scope, mode)
GET	`/api/redteam/status?id=`	Poll engagement status and progress
GET	`/api/redteam/result?id=`	Get full results with structured findings and analysis
GET	`/api/redteam/notes?id=`	Get raw PentestAgent notes for an engagement
POST	`/api/redteam/cancel`	Cancel a running engagement
GET	`/api/redteam/engagements`	List workspace engagements
GET	`/api/redteam/engagements/:id`	Get single engagement details
POST	`/api/redteam/engagement/:id/archive`	Archive an engagement
POST	`/api/redteam/engagement/:id/unarchive`	Restore an archived engagement
DELETE	`/api/redteam/engagement/:id`	Permanently delete an engagement (admin only)
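
A launch call might be assembled as shown below. The endpoint path and the four parameters (target, playbook, scope, mode) come from the table above; the JSON field names, the lowercase playbook and mode identifiers, the bearer-token header, and the base URL in the usage note are assumptions, since the request schema and authentication scheme are not documented here.

```python
import json
import urllib.request

# Hypothetical identifiers for the documented playbooks and modes.
PLAYBOOKS = {"recon", "web", "network", "full"}
INFRA_MODES = {"standard", "advanced", "expert"}


def build_launch_request(base_url: str, api_token: str, target: str,
                         playbook: str, scope: list,
                         mode: str = "standard") -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/redteam/launch."""
    if playbook not in PLAYBOOKS:
        raise ValueError(f"unknown playbook: {playbook!r}")
    if mode not in INFRA_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    body = json.dumps({
        "target": target,
        "playbook": playbook,
        "scope": scope,
        "mode": mode,
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/redteam/launch",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_token}",
        },
    )
```

Sending the request is then a matter of passing the result to `urllib.request.urlopen`, for instance with a request built as `build_launch_request("https://cyberdoc.example", token, "example.com", "recon", ["example.com"])`.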

Analysis & Chain Verification

Method	Endpoint	Description
POST	`/api/redteam/reanalyze/:id`	Re-run multi-agent analysis on existing results
POST	`/api/redteam/exploit-chain/:id`	Execute an attack chain with automated Kali commands
POST	`/api/redteam/custom-chain/:id`	Execute a custom chain with user-defined parameters

Artifacts & Reports

Method	Endpoint	Description
GET	`/api/redteam/artifacts/:id`	List artifacts for an engagement
GET	`/api/redteam/artifact/:id/:filename`	Download a specific artifact file
GET	`/api/redteam/report/:id`	Generate branded HTML report

Domain Verification

Method	Endpoint	Description
POST	`/api/redteam/verify-domain`	Request a domain verification token
POST	`/api/redteam/check-verification`	Check domain verification status
GET	`/api/redteam/domains`	List verified domains for the workspace

Operator & Conversations

Method	Endpoint	Description
POST	`/api/redteam/voice`	Get ephemeral voice token for Red Team operator
POST	`/api/redteam/conversation`	Create or send message in an operator conversation
GET	`/api/redteam/conversations`	List operator conversations
DELETE	`/api/redteam/conversation/:id`	Delete a conversation

Expert Instance Management

Method	Endpoint	Description
POST	`/api/redteam/expert/:action`	Start, stop, or check status of Expert EC2 instance
POST	`/api/redteam/expert-health`	Check Expert instance readiness before launching

Metrics (Admin)

Method	Endpoint	Description
GET	`/api/redteam/metrics`	Engagement statistics and usage metrics (admin only)