CYBERDOC DOCS

Red Team Assessment

CyberDoc's Red Team feature provides AI-powered autonomous penetration testing. A dual-model architecture pairs a PentestAgent (Claude Sonnet) for tool execution with an xAI Grok orchestrator for strategic reasoning. After the engagement completes, a multi-agent analysis team produces structured findings with CVSS-aligned severity, CVE research, attack chains, and threat intelligence.

Access: Red team assessments are available to Business ($99/mo — 3 scans/month) and Enterprise ($499/mo — 15 scans/month) subscribers. Enterprise includes Advanced, Expert, and Crew modes. Administrators have full access regardless of plan.

How It Works

  1. Domain Verification — Users must verify ownership of the target domain via DNS TXT record or file upload before launching an engagement. Admins bypass this requirement.
  2. Launch Engagement — Select a target, playbook (Recon, Web, Network, or Full), and infrastructure mode. The request is proxied to the PentestAgent backend.
  3. AI Agent Loop — PentestAgent autonomously executes security tools, analyses results, and decides next steps via the MCP (Model Context Protocol) server. The agent runs until the task is complete or max iterations are reached.
  4. Multi-Agent Analysis — Raw pentest output is sent to an xAI Grok multi-agent team (4 or 16 agents) that produces structured findings with CVSS severity, recon data, attack chains, CVE research, and threat intelligence gathered from web and X/Twitter searches.
  5. Attack Chain Verification — Proposed attack chains can be executed directly against the target using automated Kali tool command mapping, with success validated via regex pattern matching.
  6. Artifacts & Report — All output files (exploit proofs, screenshots, Metasploit output) are collected as artifacts. A comprehensive HTML report is generated with executive summary, findings by severity, attack chains, and remediation recommendations.

Agent Architecture

The Red Team system uses a dual-model architecture with specialised roles:

Role | Model | Purpose
PentestAgent | Claude Sonnet (claude-sonnet-4-20250514) | Tool execution, shell commands, browser interaction, structured note-taking
Orchestrator | xAI Grok (grok-4-1-fast-reasoning) | Strategic reasoning, task planning, adaptive decision-making
Analysis Team | xAI Grok Multi-Agent (grok-4.20-multi-agent) | Post-engagement structured analysis with 4 or 16 parallel agents

Infrastructure Modes

Mode | Backend | Tools | Max Iterations | Access
Standard | Docker container (App Runner) | PentestAgent + ProjectDiscovery tools (nmap, nuclei, subfinder, httpx, ffuf, nikto) | 60 | Business, Enterprise
Advanced | Docker container (App Runner) | Standard + Kali tools (metasploit, hydra, john, sqlmap, wpscan) | 80 | Enterprise only
Expert | Dedicated EC2 instance (Kali Linux) | Full Kali arsenal + SecLists wordlists, privileged network access, Cloudflare Tunnel | 100 | Enterprise only

Expert Mode Lifecycle: Expert instances run on dedicated EC2 machines and can be started/stopped on demand via the admin dashboard. A Lambda proxy manages the EC2 lifecycle (start, stop, status). Health checks verify readiness before launching engagements.
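
As a rough illustration, the mode table can be expressed as a lookup structure with an access check. This is a sketch, not CyberDoc's implementation: ModeSpec, MODES, and can_use_mode are hypothetical names, and the lowercase plan and mode identifiers are assumptions. Only the iteration caps and plan access come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeSpec:
    backend: str
    max_iterations: int
    plans: frozenset  # plans allowed to select this mode


# Values taken from the Infrastructure Modes table; keys are assumed identifiers.
MODES = {
    "standard": ModeSpec("Docker container (App Runner)", 60,
                         frozenset({"business", "enterprise"})),
    "advanced": ModeSpec("Docker container (App Runner)", 80,
                         frozenset({"enterprise"})),
    "expert": ModeSpec("Dedicated EC2 instance (Kali Linux)", 100,
                       frozenset({"enterprise"})),
}


def can_use_mode(plan: str, mode: str, is_admin: bool = False) -> bool:
    """Return True if the plan (or an administrator) may select the mode."""
    if mode not in MODES:
        return False
    if is_admin:
        # Administrators have full access regardless of plan.
        return True
    return plan.lower() in MODES[mode].plans
```

For example, can_use_mode("business", "expert") is False, while the same call with is_admin=True is True.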

Playbooks

Playbook | Focus | Typical Duration
Recon | Subdomain enumeration, port scanning, service fingerprinting, DNS configuration, exposed endpoints | 10–30 minutes
Web | OWASP Top 10, security headers, TLS config, cookie security, directory discovery, injection testing | 30–60 minutes
Network | Port/service enumeration, known CVEs, default credentials, network segmentation | 30–60 minutes
Full Red Team | All phases: Recon, Web, Network, then Report generation with remediation steps | 1–3 hours

Agent Tools

The PentestAgent communicates via MCP (Model Context Protocol) and has access to these tool categories:

  • terminal — Execute shell commands (nmap, nuclei, curl, sqlmap, metasploit, hydra, wpscan, ffuf, gobuster, etc.). Output truncated at 50K chars.
  • browser — Playwright headless browser for web interaction (navigate, click, type, screenshot, extract links/forms).
  • notes — Structured finding storage with category validation (credential, vulnerability, finding, artifact, recon, infrastructure, report). Persists across the engagement.
  • web_search — Web search integration for OSINT and CVE lookup.
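
Two of the behaviours listed above, output truncation and note category validation, can be sketched as follows. The function names and the truncation marker are hypothetical, and "50K chars" is read here as 50,000 characters, which is an assumption. The category list is taken from the notes tool description.

```python
# Categories as listed for the notes tool.
NOTE_CATEGORIES = {
    "credential", "vulnerability", "finding", "artifact",
    "recon", "infrastructure", "report",
}

# Assumption: "50K chars" means 50,000 characters.
MAX_OUTPUT_CHARS = 50_000
TRUNCATION_MARKER = "\n[output truncated]"  # hypothetical marker text


def truncate_output(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Cut terminal output at the limit and append a marker if it was cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + TRUNCATION_MARKER


def make_note(category: str, content: str) -> dict:
    """Build a note record, rejecting categories outside the allowed set."""
    if category not in NOTE_CATEGORIES:
        raise ValueError(f"unknown note category: {category!r}")
    return {"category": category, "content": content}
```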

Multi-Agent Analysis

After the PentestAgent completes, raw output is analysed by an xAI Grok multi-agent team. The analysis produces a structured result with the following components:

Component | Description
Findings | Vulnerabilities with CVSS-aligned severity, CWE classification, evidence, impact, and specific remediation steps
Recon Data | IP addresses, subdomains, open ports, and detected technologies
Attack Chains | Multi-step exploit paths with risk level, step-by-step actions, and overall impact assessment
CVE Research | Relevant CVEs with exploit-in-the-wild status and patch availability (via live web search)
Threat Intel | Recent threat discussions from web and X/Twitter searches about the target's technology stack
Positive Controls | Security measures the target has correctly implemented
Risk Rating | Overall risk level with justification

Analysis can use either a 4-agent team (standard) or 16-agent team (deep analysis) and can be re-run on demand via the reanalyze endpoint.
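
To make the shape of the structured result more concrete, here is a minimal sketch using dataclasses. The actual response schema is not documented here, so every class and field name below is a hypothetical stand-in for the components listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    severity: str      # Critical, High, Medium, Low, or Info
    cwe: str
    evidence: str
    impact: str
    remediation: str


@dataclass
class AnalysisResult:
    # One attribute per component listed above; names are assumptions.
    findings: list = field(default_factory=list)
    recon: dict = field(default_factory=dict)
    attack_chains: list = field(default_factory=list)
    cve_research: list = field(default_factory=list)
    threat_intel: list = field(default_factory=list)
    positive_controls: list = field(default_factory=list)
    risk_rating: str = ""
    risk_justification: str = ""


SEVERITY_ORDER = ["Critical", "High", "Medium", "Low", "Info"]


def findings_by_severity(result: AnalysisResult) -> dict:
    """Group findings by severity, most severe first."""
    grouped = {level: [] for level in SEVERITY_ORDER}
    for finding in result.findings:
        grouped.setdefault(finding.severity, []).append(finding)
    return grouped
```

Grouping by severity mirrors how the HTML report is described as presenting findings.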

Finding Severity

Findings use CVSS-aligned severity ratings assigned by the multi-agent analysis team:

Severity | CVSS Range | Examples
Critical | 9.0+ | RCE, auth bypass, credential exposure, actively exploited CVEs
High | 7.0–8.9 | SQLi, stored XSS, exposed SSH, SSRF, file upload vulnerabilities
Medium | 4.0–6.9 | Missing security headers, user enumeration, outdated software, info disclosure CVEs
Low | 0.1–3.9 | Verbose error messages, directory listings, minor misconfigurations
Info | N/A | Informational only, not a vulnerability
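
The severity bands translate directly into a threshold check. The sketch below is illustrative: severity_for_score is a hypothetical helper, and mapping a score of 0.0 to Info is an assumption, since the table gives no CVSS range for that level.

```python
def severity_for_score(score: float) -> str:
    """Map a CVSS base score (0.0 to 10.0) to the severity labels above."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    if score >= 0.1:
        return "Low"
    # Assumption: a score of 0.0 is treated as informational.
    return "Info"
```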

Attack Chain Verification

Attack chains identified by the multi-agent analysis can be verified by executing them directly against the target. The chain command mapper translates high-level chain steps into concrete Kali tool commands.

Supported Attack Patterns

  • WordPress — User enumeration (wpscan), plugin scanning, XML-RPC brute force, credential testing, shell upload
  • SQL Injection — Automated sqlmap execution with database enumeration
  • Directory Brute Forcing — ffuf, dirb, gobuster with custom wordlists
  • SSH Brute Force — Hydra with configurable wordlists (smart, custom, top1000, top10000 modes)
  • Exploitation — Metasploit framework integration
  • Lateral Movement — Post-exploitation and privilege escalation

Chain Execution Modes

  • Exploit Chain — Automated execution of analysis-identified chains with templated commands, timeout management, and regex-based success validation
  • Custom Chain — User-defined chains with custom parameters (usernames, passwords, target overrides, brute force mode selection)
  • Adaptive Chain — Dynamic chain execution that adapts based on results from previous steps
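
The combination of templated commands and regex-based success validation can be sketched as below. This is not the chain command mapper itself: ChainStep, render_command, step_succeeded, and the example step are hypothetical, and the sketch only renders and validates, leaving actual execution and timeout handling to the caller.

```python
import re
import shlex
from dataclasses import dataclass
from string import Template


@dataclass
class ChainStep:
    name: str
    command_template: str   # uses $placeholders filled from parameters
    success_pattern: str    # regex that must match the tool output
    timeout_seconds: int = 300


def render_command(step: ChainStep, params: dict) -> str:
    """Fill the template, shell-quoting every parameter value."""
    quoted = {key: shlex.quote(str(value)) for key, value in params.items()}
    return Template(step.command_template).substitute(quoted)


def step_succeeded(step: ChainStep, output: str) -> bool:
    """A step counts as successful when its pattern matches the output."""
    return re.search(step.success_pattern, output, re.IGNORECASE) is not None


# Hypothetical step: check whether SSH is reachable before a Hydra run.
SSH_PORT_CHECK = ChainStep(
    name="ssh-port-check",
    command_template="nmap -p 22 $target",
    success_pattern=r"22/tcp\s+open",
)
```

Quoting parameters before substitution matters here because targets and usernames in custom chains are user-supplied.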

Artifacts

Engagement artifacts (exploit proofs, tool output, screenshots, loot) are automatically collected from the PentestAgent backend and stored for review:

  • Fetched from the backend /rt/artifacts endpoint (bulk or individual)
  • Stored in KV with 90-day retention
  • Tracked in the engagement_artifacts database table with file metadata
  • Downloadable via the artifacts API endpoints

Red Team Operator

The Red Team Operator is a voice and text AI agent interface for administrators to interactively manage engagements:

  • Voice conversations via xAI voice API with real-time transcript
  • Text-based chat with persistent conversation history
  • Can programmatically create engagements, launch attack chains, and fetch findings
  • Conversations are linked to engagements and stored in the redteam_conversations table
  • Admin-only access; the pentest safeguards applied to other users are not enforced

Domain Verification

Before launching a red team engagement, you must verify ownership of the target domain. Two methods are supported:

  • DNS TXT Record — Add a TXT record to the domain with a generated verification token. Checked via Google DNS.
  • File Upload — Place a file at /.well-known/cyberdoc-verify.txt containing the token. Checked via HTTP fetch.

Once verified, the domain remains verified for future engagements. Admins bypass this requirement.
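
Both checks reduce to comparing a fetched value against the issued token. The sketch below assumes the DNS check uses the Google Public DNS JSON API (https://dns.google/resolve?name=<domain>&type=TXT), which returns TXT records in an "Answer" list. The text above only says "Google DNS", so that choice, the function names, and the token format in the usage note are all assumptions. The functions take already-fetched data, so no network access is needed to try them.

```python
DNS_TYPE_TXT = 16  # numeric DNS record type for TXT


def txt_record_has_token(doh_response: dict, token: str) -> bool:
    """Check a parsed dns.google/resolve JSON response for the token."""
    for answer in doh_response.get("Answer", []):
        if answer.get("type") != DNS_TYPE_TXT:
            continue
        # TXT data may arrive wrapped in double quotes; strip them.
        value = answer.get("data", "").strip().strip('"')
        if value == token:
            return True
    return False


def file_has_token(body: str, token: str) -> bool:
    """Check the body fetched from /.well-known/cyberdoc-verify.txt."""
    return body.strip() == token
```

With a hypothetical token such as "cyberdoc-verify=abc123", a response whose Answer list contains that value as TXT data passes the first check, and a file containing the token followed by a newline passes the second.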

Engagement Lifecycle

  • Launch — Create engagement with target, playbook, scope, and infrastructure mode
  • Poll Status — Monitor progress (queued, running, complete, failed, cancelled)
  • View Results — Retrieve structured findings, recon data, attack chains, and analysis
  • Reanalyze — Re-run multi-agent analysis on existing results with updated prompts
  • Chain Verification — Execute identified attack chains for proof-of-exploit
  • Report — Generate branded HTML report for download or print
  • Archive/Delete — Archive old engagements or permanently delete them
  • Cancel — Abort a running engagement
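
Polling can be sketched as a loop that stops on a terminal state. The status values come from the list above; wait_for_engagement, the polling interval, and the assumption that the status call returns a bare status string are hypothetical. The fetch function is injected so the loop can be exercised without a live backend.

```python
import time

# Of the five listed states, these three end the engagement.
TERMINAL_STATES = {"complete", "failed", "cancelled"}


def wait_for_engagement(fetch_status, engagement_id, interval=10.0,
                        max_polls=360, sleep=time.sleep):
    """Poll until the engagement reaches a terminal state.

    fetch_status is any callable that takes an engagement ID and returns
    one of: queued, running, complete, failed, cancelled.
    """
    for _ in range(max_polls):
        status = fetch_status(engagement_id)
        if status in TERMINAL_STATES:
            return status
        sleep(interval)
    raise TimeoutError(f"engagement {engagement_id} did not finish in time")
```

In practice fetch_status would wrap a GET to /api/redteam/status and extract the status field from the response, whose exact shape is not specified here.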

Security Guardrails

Tool output from untrusted sources is filtered through prompt injection guardrails adapted from the CAI framework:

  • 40+ regex patterns detecting instruction overrides, hidden commands, encoding tricks
  • Unicode homograph normalization (Cyrillic/Greek to Latin)
  • Content sanitization with security delimiters
  • Prevents target servers from hijacking the agent via crafted responses
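
The three layers above (homograph normalization, pattern detection, delimiters) can be sketched together. This is a simplified illustration and not the CAI-derived implementation: the three patterns, the small homoglyph map, the delimiter strings, and the choice to withhold flagged content are all assumptions.

```python
import re
import unicodedata

# A few Cyrillic/Greek look-alikes mapped to Latin; a real map is much larger.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
    "\u03b1": "a",  # Greek alpha
    "\u03bf": "o",  # Greek omicron
})

# Illustrative patterns only; the source describes 40+.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+(?:a|an|the)\b", re.IGNORECASE),
    re.compile(r"<\s*/?\s*system\s*>", re.IGNORECASE),
]

OPEN_DELIM = "<<<UNTRUSTED_TOOL_OUTPUT>>>"
CLOSE_DELIM = "<<<END_UNTRUSTED_TOOL_OUTPUT>>>"


def sanitize_tool_output(text: str):
    """Return (wrapped_text, flagged) for a piece of untrusted tool output."""
    # Normalize only for detection, so legitimate non-Latin text is preserved.
    probe = unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)
    flagged = any(p.search(probe) for p in INJECTION_PATTERNS)
    body = "[content withheld: possible prompt injection]" if flagged else text
    return f"{OPEN_DELIM}\n{body}\n{CLOSE_DELIM}", flagged
```

Running detection on a normalized copy, rather than rewriting the output itself, keeps the evidence intact for benign responses.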

API Endpoints

All red team endpoints require authentication and are prefixed with /api/redteam. A Business or Enterprise plan is required; administrators are exempt.

Engagements

Method | Endpoint | Description
POST | /api/redteam/launch | Start engagement (target, playbook, scope, mode)
GET | /api/redteam/status?id= | Poll engagement status and progress
GET | /api/redteam/result?id= | Get full results with structured findings and analysis
GET | /api/redteam/notes?id= | Get raw PentestAgent notes for an engagement
POST | /api/redteam/cancel | Cancel a running engagement
GET | /api/redteam/engagements | List workspace engagements
GET | /api/redteam/engagements/:id | Get single engagement details
POST | /api/redteam/engagement/:id/archive | Archive an engagement
POST | /api/redteam/engagement/:id/unarchive | Restore an archived engagement
DELETE | /api/redteam/engagement/:id | Permanently delete an engagement (admin only)
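
A launch request might be assembled as in the sketch below. The endpoint path and the four body fields (target, playbook, scope, mode) come from the table. The base URL, the bearer-token header, the lowercase field values, and the list type for scope are assumptions, since the authentication scheme and request schema are not specified here. The function only builds the request and does not send it.

```python
import json
from urllib.request import Request


def build_launch_request(base_url, token, target, playbook, mode, scope=None):
    """Build (but do not send) a POST request for /api/redteam/launch."""
    payload = {
        "target": target,
        "playbook": playbook,                 # e.g. recon, web, network, full
        "mode": mode,                         # e.g. standard, advanced, expert
        "scope": scope if scope is not None else [target],  # assumed shape
    }
    return Request(
        url=f"{base_url.rstrip('/')}/api/redteam/launch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the response format is not documented in this section.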

Analysis & Chain Verification

Method | Endpoint | Description
POST | /api/redteam/reanalyze/:id | Re-run multi-agent analysis on existing results
POST | /api/redteam/exploit-chain/:id | Execute an attack chain with automated Kali commands
POST | /api/redteam/custom-chain/:id | Execute a custom chain with user-defined parameters

Artifacts & Reports

Method | Endpoint | Description
GET | /api/redteam/artifacts/:id | List artifacts for an engagement
GET | /api/redteam/artifact/:id/:filename | Download a specific artifact file
GET | /api/redteam/report/:id | Generate branded HTML report

Domain Verification

Method | Endpoint | Description
POST | /api/redteam/verify-domain | Request a domain verification token
POST | /api/redteam/check-verification | Check domain verification status
GET | /api/redteam/domains | List verified domains for the workspace

Operator & Conversations

Method | Endpoint | Description
POST | /api/redteam/voice | Get ephemeral voice token for Red Team operator
POST | /api/redteam/conversation | Create or send message in an operator conversation
GET | /api/redteam/conversations | List operator conversations
DELETE | /api/redteam/conversation/:id | Delete a conversation

Expert Instance Management

Method | Endpoint | Description
POST | /api/redteam/expert/:action | Start, stop, or check status of Expert EC2 instance
POST | /api/redteam/expert-health | Check Expert instance readiness before launching

Metrics (Admin)

Method | Endpoint | Description
GET | /api/redteam/metrics | Engagement statistics and usage metrics (admin only)