Most engagements force a choice: traditional pentest firm or AI safety lab. We run both layers in parallel and produce one cross-referenced report — so cross-layer attack chains (a leaked IAM credential enabling model exfiltration, a prompt injection escalating to infrastructure access) actually get found.
As AI systems expand the attack surface — agentic pipelines, model APIs, training supply chains, multi-agent orchestration — the boundary between "security" and "AI safety" has collapsed. Our methodology covers the whole surface.
Regulatory trigger — Australia and EU, 2026
The Australian Signals Directorate confirmed in May 2026 that Claude Mythos Preview is the first frontier model observed autonomously chaining individual cyber tasks into a complete end-to-end intrusion — executing a 32-step simulated corporate network attack without human guidance (UK AISI evaluation). Separately, Mozilla reported Mythos identified 271 vulnerabilities fixed in a single Firefox release, an order-of-magnitude increase over prior AI-assisted efforts.
ASD's key finding: open-weight models can already reproduce many Mythos techniques, and the assumption that adversaries lag frontier capabilities by months is no longer safe. Patch cycles and attack economics have both collapsed.
EU AI Act GPAI obligations begin 2 August 2026; high-risk system obligations 2 August 2027. Independent, reproducible pentest evidence — covering both the AI model layer and supporting infrastructure — is what regulators, auditors, and insurers now ask to see.
What We Test
A Failure-First engagement covers the full attack surface of an AI deployment: the infrastructure that runs it and the model layer itself. Findings from both layers appear in a single unified report, with cross-layer attack chains explicitly identified.
Traditional Layer
- Web application security
- OWASP Top 10, REST/GraphQL API endpoints, authentication flows
- Cloud infrastructure & IAM
- AWS/GCP/Azure misconfiguration, privilege escalation, exposed services
- Supply chain & dependencies
- Known CVEs in packages, container images, model artefacts
- Secrets & credential exposure
- Hardcoded keys, leaked tokens, environment variable leakage
- Static application security
- Code-level vulnerability patterns, unsafe deserialisation, injection sinks
AI Layer
- LLM adversarial testing
- Jailbreak taxonomy (81 techniques, 6 eras), prompt injection, refusal suppression
- Agentic system testing
- Tool misuse, chain exploitation, cross-agent injection, orchestration abuse
- Alignment auditing
- Deception, sycophancy, self-preservation, cooperation with harmful instructions
- Multi-agent pipeline attacks
- Inter-agent trust exploitation, context poisoning, state manipulation
- AI supply chain
- Model provenance, training data poisoning indicators, weight file integrity
Why Failure-First
AI layer findings are grounded in the largest open adversarial dataset for embodied and agentic AI — not hypothetical scenarios.
- Attack taxonomy validated across 100+ models spanning 6 research eras (2022–2025)
- 26 published research reports — methodology is public and peer-reviewable
- FLIP (Failure-Level Impact Protocol) grading with documented inter-rater reliability
- F1 Pipeline: proprietary corpus of 81 attack techniques run against your specific model endpoint
How an Engagement Works
Standard and Ongoing tiers follow the four-phase process below. Quick Scan compresses Phases 1–3 into 5–7 business days with a reduced scenario count. All tiers require a signed Authorisation to Test (ATT) before any active scanning begins.
Scoping & Threat Modelling
- System architecture and deployment context review
- Attack surface mapping — infrastructure and AI layer
- Regulatory framework identification (VAISS, EU AI Act, NIST AI RMF, ISO 42001)
- Signed Authorisation to Test (ATT) and rules of engagement
- Selection of attack scenarios from validated taxonomy
Parallel Testing Execution
- Traditional layer: web application, cloud/IAM, supply chain, secrets, SAST
- AI layer: LLM adversarial, agentic pipeline, alignment auditing
- Evidence capture and per-finding documentation throughout
- Critical/High findings flagged to client same-day
FLIP Grading, Analysis & Report
- All findings graded with FLIP (Failure-Level Impact Protocol)
- Cross-layer attack chain analysis — where infra findings compound AI findings
- Compliance mapping to applicable regulatory frameworks
- Core Technical Report, Compliance Mapping Report, Evidence Archive delivered
- Debrief call — findings walkthrough and remediation Q&A
Tools & Infrastructure
All third-party tools are open-source. Our own pipeline is documented in published research and reproducible from the public repository. Every finding references the tool, version, and configuration used — there are no black-box scanners in the stack.
Traditional Layer
| OWASP ZAP | Web application scanning — OWASP Top 10, active/passive scan |
| Nuclei | CVE and misconfiguration scanning — 10,000+ community templates |
| Gitleaks | Secrets and credential exposure across git history and working tree |
| Prowler | Cloud security posture — AWS, GCP, Azure IAM and configuration |
| Semgrep | Static application security testing — injection sinks, unsafe patterns |
| OSV-Scanner + Grype | Supply chain: known CVEs in packages, containers, and model artefacts |
AI Layer
| F1 Pipeline | Failure-First corpus — 81 attack techniques, 6 eras, graded against 100+ models; run against your specific endpoint |
| garak | LLM adversarial testing — jailbreak, prompt injection, data extraction |
| AgentDojo | Agentic system testing — tool misuse, injection, orchestration exploits |
| Petri (inspect-petri) | Alignment auditing — 38 judge dimensions across 173+ seed instructions |
| promptfoo | LLM red-teaming framework — configurable attack strategies, provider-agnostic |
| HarmBench | Standardised harmful content evaluation against academic benchmark behaviours |
| StrongREJECT | Refusal quality grading — distinguishes genuine refusals from over-refusal |
| agentic-radar | Agentic surface analysis — framework detection, trust boundary mapping |
Additional adapters available on request: PyRIT (Microsoft), DeepTeam.
What You Receive
All engagements begin with a signed Authorisation to Test (ATT) and mutually agreed rules of engagement. Engagements are covered by professional indemnity insurance. A mutual NDA is standard.
- Core Technical Report — all findings by severity (Critical / High / Medium / Low / Informational), tool attribution, reproduction steps, and evidence paths. FLIP-graded throughout.
- Compliance Mapping Report (Standard and Ongoing tiers) — findings mapped to EU AI Act Articles 9 and 15, Australia's Voluntary AI Safety Standard Guardrail 4, NIST AI RMF MEASURE 2.6/2.7, ISO/IEC 42001 A.6.2.4. Each finding carries a compliance status: satisfied / partial / gap.
- Executive Summary — 2–4 page board/audit committee narrative with risk posture, key findings, and recommended actions. Suitable for regulatory submissions and insurer reporting.
- Evidence Archive — raw tool output, grading transcripts, FLIP verdicts, and methodology references. Packaged for auditor inspection or notified body review.
- Remediation Roadmap — severity-prioritised fix list with acceptance criteria, enabling structured re-test after remediation.
- Coordinated Disclosure Agreement — vulnerabilities reported to you first with mutually agreed timelines before any public disclosure. Findings are not added to the public research corpus without explicit client consent.
Engagement Tiers
Three structured tiers mapped to deployment stage and regulatory need. All tiers cover both the traditional and AI layers — the difference is depth and scope.
Quick Scan
Contact us for pricing- Traditional layer: web app, secrets, top-3 cloud findings
- AI layer: top-5 attack families against your model endpoint
- FLIP-graded vulnerability profile
- Executive summary with corpus baseline comparison
- Compressed methodology: 5–7 business days
Best for: Pre-deployment sanity check, model selection, internal risk committees, VAISS spot check
Standard
Contact us for pricing- Full capability matrix — all traditional tools + full AI layer suite
- Cross-layer attack chain analysis
- Compliance Mapping Report (EU AI Act Art 9/15, VAISS, NIST AI RMF, ISO 42001)
- Remediation roadmap
- 4-week delivery
Best for: EU AI Act GPAI compliance (August 2026), VAISS full assessment, regulatory submissions, pre-Series B security diligence
Ongoing
Contact us for pricing- Monthly adversarial probe — traditional and AI layers
- New technique coverage as threats emerge
- GLI regulatory monitoring for your jurisdiction
- Quarterly threat landscape brief
- 48-hour incident response for disclosed AI vulnerabilities
Best for: Deployed systems, fleet operators, continuous compliance obligations, insurers requiring periodic testing evidence
Common Questions
- Do you need production system access?
- No. We test against staging or dedicated test endpoints. Where production access is required by scope, this is agreed explicitly in the Authorisation to Test.
- How do you handle model weights and proprietary data?
- All client artefacts remain within the agreed evidence boundary. We do not retain, transmit, or train on client model weights or data. A mutual NDA is standard.
- Can you work under our existing pentest MSA?
- Yes. We accept standard master security services agreements with reasonable modifications.
- Will findings appear in your public research corpus?
- No, without explicit written consent. Engagement findings are confidential by default under the coordinated disclosure agreement.
- What insurance do you carry?
- Professional indemnity and public liability. Coverage details provided at scoping.
- Do you test third-party-hosted SaaS or model APIs you do not operate?
- Only with written authorisation from the relevant third party. This is addressed during scoping.
Always out of scope: denial-of-service attacks, social engineering of client staff, physical access testing, third-party infrastructure without written authorisation from that party.
Get Started
Discovery calls are free. We scope engagements based on your deployment timeline, risk profile, and regulatory obligations. Typical scoping takes 5 business days.
Alternative: Contact form